Secret Lab - A place for my experiment reports

2025-09-18

LLM / CUPTI / GPU

CUPTI is a profiling and tracing API that exposes the hardware counters of NVIDIA GPUs, collects CUDA runtime information, and enables developers to build custom profilers. If you have used Nsight Compute or Nsight Systems before, it will feel familiar, because both are built on top of CUPTI. That means you can build whatever Nsight Compute or Nsight Systems implements, plus any extra features you would like to have.

337 words | 2 minutes

Building a GPU Profiler From Scratch Using CUPTI

2025-09-03

Computer Architecture

GPU / Note

How to create a fake, customized Nsight Compute

193 words | 1 minute

Nvidia GPU Basics

2025-09-01

Computer Architecture

GPU / Note

Artificial intelligence is now a part of everyday life, powering everything from search engines to chatbots. But behind the scenes, it takes enormous compute power to train and run large language models (LLMs). This is where GPUs step into the spotlight. Their architecture is designed to handle the scale of parallelism that modern AI demands. In this post, I’ll introduce some of the core ideas of CUDA programming and Nvidia GPU architecture, and highlight how Nvidia software and hardware work together.

1361 words | 7 minutes

Markdown Extended Features

2024-05-01

Examples

Demo / Example / Markdown /