Huke
Introduction_to_LLM_Profiling
2025-09-18
CUPTI is a profiling and tracing API that exposes the hardware counters of NVIDIA GPUs, collects CUDA runtime information, and enables developers to build custom profilers. If you have used Nsight Compute or Nsight Systems before, it will feel familiar, because both are built on top of CUPTI. That means you can build whatever Nsight Compute or Nsight Systems implements, plus any extra features you would like to have.
337 words | 2 minutes
Building a GPU Profiler From Scratch Using CUPTI
2025-09-03
How to create a fake, customized Nsight Compute
193 words | 1 minute
Nvidia GPU Basics
2025-09-01
Artificial intelligence is now a part of everyday life, powering everything from search engines to chatbots. But behind the scenes, it takes enormous compute power to train and run large language models (LLMs). This is where GPUs step into the spotlight. Their architecture is designed to handle the scale of parallelism that modern AI demands. In this post, I’ll introduce some of the core ideas of CUDA programming and Nvidia GPU architecture, and highlight how Nvidia software and hardware work together.
1361 words | 7 minutes
Markdown Extended Features
2024-05-01
Read more about Markdown features in Fuwari
176 words | 1 minute
Expressive Code Example
2024-04-10
How code blocks look in Markdown using Expressive Code.
737 words | 4 minutes
Include Video in the Posts
2023-08-01
This post demonstrates how to include embedded video in a blog post.
61 words | 1 minute