I'm trying to profile a model in PyTorch. The model is traced into a ScriptModule and run on the GPU from a Python script. I'm using CentOS, and the .py file is compiled into a par. I tried the --target-processes all option, but I still get the warning "No kernels were profiled".
Example of my command:
ncu -o profile test.par --target-processes all
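
One thing that may matter here: ncu's usage is `ncu [options] [program] [program-arguments]`, so anything placed after the program name is passed to the program rather than parsed by ncu. In the command above, that would mean `--target-processes all` goes to test.par. Below is a minimal sketch that builds the command with the ncu options before the target. The helper name `build_ncu_command` is hypothetical, and the `./test.par` path is an assumption about where the par lives.

```python
import shlex


def build_ncu_command(target, output="profile", target_args=()):
    """Build an ncu argument list with profiler options before the target.

    `build_ncu_command` is a hypothetical helper for illustration only.
    """
    # ncu options come first; everything after the target is handed to
    # the target program as its own arguments.
    return [
        "ncu",
        "--target-processes", "all",  # also profile child processes
        "-o", output,                 # name of the report file
        target,
        *target_args,
    ]


# "./test.par" is an assumed path to the compiled par file.
cmd = build_ncu_command("./test.par")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd)` to launch the profiler from Python.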