The NVIDIA® CUDA Profiling Tools Interface (CUPTI) is a dynamic library that enables the creation of profiling and tracing tools that target CUDA applications. CUPTI provides a set of APIs targeted at ISVs creating profilers and other performance optimization tools:
- the Activity API,
- the Callback API,
- the Event API,
- the Metric API, and
- the Profiling API.
Using these CUPTI APIs, independent software vendors can create profiling tools that impose low, deterministic overhead on the target system while giving insight into the CPU and GPU behavior of CUDA applications.
CUPTI for CUDA Toolkit 11.4 includes these improvements:
- The Profiling API supports profiling of CUDA kernel nodes launched by a CUDA Graph. Two configurations are supported: auto range profiling with kernel replay mode, and user range profiling with user replay or application replay mode. Other combinations of range profiling and replay modes are not supported.
- Added the sample profiling_injection to show how to build an injection library using the Profiling API.
- Added the sample concurrent_profiling to show how to retain kernel concurrency across streams and devices using the Profiling API.
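The supported range and replay mode combinations for CUDA Graph kernel nodes, as listed in the first item above, can be summarized as a small lookup table. The following is only an illustrative sketch: the enum and function names (ProfilerRange, ReplayMode, is_supported_for_graph_kernel_nodes) are hypothetical and are not CUPTI identifiers; a real tool would configure these modes through the CUPTI C API.

```python
from enum import Enum


class ProfilerRange(Enum):
    """Hypothetical stand-in for the range profiling mode."""
    AUTO_RANGE = "auto range"
    USER_RANGE = "user range"


class ReplayMode(Enum):
    """Hypothetical stand-in for the replay mode."""
    KERNEL_REPLAY = "kernel replay"
    USER_REPLAY = "user replay"
    APPLICATION_REPLAY = "application replay"


# Combinations described as supported for kernel nodes launched by a
# CUDA Graph in CUPTI for CUDA Toolkit 11.4. Every other pairing is
# treated as unsupported.
SUPPORTED_FOR_GRAPH_KERNEL_NODES = frozenset({
    (ProfilerRange.AUTO_RANGE, ReplayMode.KERNEL_REPLAY),
    (ProfilerRange.USER_RANGE, ReplayMode.USER_REPLAY),
    (ProfilerRange.USER_RANGE, ReplayMode.APPLICATION_REPLAY),
})


def is_supported_for_graph_kernel_nodes(range_mode, replay_mode):
    """Return True if the pairing is one of the supported combinations."""
    return (range_mode, replay_mode) in SUPPORTED_FOR_GRAPH_KERNEL_NODES


if __name__ == "__main__":
    # Print the full support matrix for quick reference.
    for range_mode in ProfilerRange:
        for replay_mode in ReplayMode:
            ok = is_supported_for_graph_kernel_nodes(range_mode, replay_mode)
            status = "supported" if ok else "not supported"
            print(f"{range_mode.value} + {replay_mode.value}: {status}")
```

A tool could run a check like this before starting a profiling session, so that an unsupported pairing is rejected early with a clear message.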
For more information on CUPTI for CUDA Toolkit 11.4, including features, requirements, documentation, and support, please visit the CUPTI Overview page.
This version is available as part of the CUDA Toolkit 11.4 download.