Can someone help me understand how to interpret the demangled kernel names obtained while profiling DNN workloads with NCU, and how to map them back to my original PyTorch/ONNX workloads?
Hi, @kunal.sahoo2003
There is an option, --print-kernel-base, that can be used during profiling. Its value can be mangled, function, or demangled.
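For example, a minimal invocation might look like the following (the report name and workload script are placeholders):

```shell
# Show demangled kernel names in the report
# (the value can also be "mangled" or "function")
ncu --print-kernel-base demangled -o my_report python my_workload.py
```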
Kernels launched from these frameworks generally have generic names and are specialized via their template arguments or parameters. To see them in context, you can collect NVTX ranges (using --nvtx) and/or Python call stacks (using --call-stack-type python). Whether NCU can collect NVTX range information depends on the app/framework being instrumented with it. If yours is not, you can add ranges around your workloads yourself. Note that multiple kernels will likely map to a single high-level part of your application.
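For instance, if your framework code is not already instrumented, a minimal sketch of adding NVTX ranges manually in PyTorch might look like this (the range name and the layer are placeholders; emitting the ranges requires a CUDA-enabled PyTorch build):

```python
import torch

def run_layer(layer, x):
    # Wrap a high-level stage of the workload in an NVTX range so that
    # kernels launched inside it appear under this range in the NCU
    # report when profiling with --nvtx.
    torch.cuda.nvtx.range_push("my_lstm_layer")  # placeholder range name
    out = layer(x)
    torch.cuda.nvtx.range_pop()
    return out
```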
sudo ncu --target-processes all --set roofline --call-stack-type python -f -o results/ncu-reps/lstm_infer_bs_1_20epoch_debug bash exp_script.sh
yields the error:
==ERROR== unrecognised option '--call-stack-type'. Use --help for further details.
Does NCU with NVTX help me visualize which kernel is associated with which layer in my PyTorch neural network?
That is my main objective: to understand which kernel belongs to which high-level deep learning layer in a DNN workload.
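One way to approach this in PyTorch, assuming a CUDA-enabled build, is torch.autograd.profiler.emit_nvtx(), which pushes an NVTX range for each operator executed inside its context; when the script is then run under NCU with --nvtx, the kernels appear nested under those op-level ranges. A sketch (the model and input are placeholders):

```python
import torch

def profile_forward(model, x):
    # Every PyTorch operator executed inside this context emits an NVTX
    # range named after the op. Running the script under
    # "ncu --nvtx ... python script.py" lets NCU associate each kernel
    # with the range (and thus the op) that launched it.
    with torch.autograd.profiler.emit_nvtx():
        return model(x)
```

Ranges from emit_nvtx() are per-operator rather than per-layer, so combining it with manually added ranges around each layer gives the clearest layer-to-kernel mapping.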