Hi,
Best Practices For TensorRT Performance :: NVIDIA Deep Learning TensorRT Documentation
Decoding the kernel names back to layers in the original network can be complicated. Because of this, TensorRT uses NVTX to mark a range for each layer, which then allows the CUDA profilers to correlate each layer with the kernels called to implement it. In TensorRT, NVTX helps to correlate the runtime engine layer execution with CUDA kernel calls. Nsight Systems supports collecting and visualizing these events and ranges on the timeline. Nsight Compute also supports collecting and displaying the state of all active NVTX domains and ranges in a given thread when the application is suspended.
When profiling a TensorRT application, it is recommended to enable profiling only after the engine has been built. During the build phase, all possible tactics are tried and timed. Profiling this portion of the execution will not show any meaningful performance measurements and will include all possible kernels, not the ones actually selected for inference. One way to limit the scope of profiling is to:
- First phase: Structure the application to build and then serialize the engines in one phase.
- Second phase: Load the serialized engines and run inference in a second phase.
- Third phase: Profile this second phase only.
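Concretely, the recommended split can be sketched with `trtexec` (the model and engine filenames below are placeholders of my own, not from the quoted docs):

```shell
# Phase 1: build and serialize the engine (do NOT profile this step;
# it times all candidate tactics, not the kernels chosen for inference).
# model.onnx and engine.plan are placeholder names.
trtexec --onnx=model.onnx --saveEngine=engine.plan

# Phases 2 and 3: load the serialized engine, run inference, and
# profile only this part with Nsight Systems.
nsys profile -o trt_inference trtexec --loadEngine=engine.plan
```

This way the report contains only the kernels actually selected for inference.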
As recommended, I would like to filter on NVTX ranges during profiling collection.
I looked at the Nsight Systems CLI options:
Nsight Systems User Guide :: Nsight Systems Documentation (nvidia.com)
I noticed this option:
--nvtx-capture
Parameters
range@domain,range,range@
Description
Specify NVTX capture range. See below for details. This option is applicable only when used along with --capture-range=nvtx.
In order to do that, I would like to know which domain name and range name I should use in my Nsight Systems CLI command line (the ones TensorRT uses when creating its NVTX events).
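For reference, the command shape I am aiming for looks like this. Here `<range>` and `<domain>` are placeholders for exactly the names this question is asking about, and `trtexec --loadEngine=engine.plan` is just a stand-in inference workload:

```shell
# Begin collection only when the given NVTX range in the given domain starts.
# <range> and <domain> are placeholders for the names TensorRT uses.
nsys profile \
  --capture-range=nvtx \
  --nvtx-capture='<range>@<domain>' \
  trtexec --loadEngine=engine.plan
```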
Thx,