Introduction to LLM Profiling
CUPTI is a profiling and tracing API that exposes the hardware counters of NVIDIA GPUs, collects CUDA runtime information, and enables developers to build custom profilers. If you have used Nsight Compute or Nsight Systems before, it will feel familiar, because both are built on top of CUPTI. That means you can build whatever Nsight Compute or Nsight Systems implements, plus any extra features you would like to have.
337 words
|
2 minutes
Building a GPU Profiler From Scratch Using CUPTI
How to create a fake, customized Nsight Compute
193 words
|
1 minute
Nvidia GPU Basics
Artificial intelligence is now a part of everyday life, powering everything from search engines to chatbots. But behind the scenes, it takes enormous compute power to train and run large language models (LLMs). This is where GPUs step into the spotlight: their architecture is designed for the massive parallelism that modern AI demands. In this post, I’ll introduce some core ideas of CUDA programming and Nvidia GPU architecture, and highlight how Nvidia software and hardware work together.
1361 words
|
7 minutes
Markdown Extended Features
Read more about Markdown features in Fuwari
176 words
|
1 minute
Expressive Code Example
How code blocks look in Markdown using Expressive Code.
737 words
|
4 minutes
Simple Guides for Fuwari
How to use this blog template.
160 words
|
1 minute
Markdown Example
A simple example of a Markdown blog post.
444 words
|
2 minutes
Include Video in the Posts
This post demonstrates how to include embedded video in a blog post.
61 words
|
1 minute