Nsys CLI cannot trace CUDA

Hi,

I’m using the Nsight Systems CLI, version:

$ nsys --version
NVIDIA Nsight Systems version 2022.2.1.31-5fe97ab

But when I use -t cuda, a FATAL ERROR occurs and the .qdstrm file is broken:

nvidia@tegra-ubuntu:/usr/local/cuda/samples/0_Simple/vectorAdd$ nsys profile --cudabacktrace=all -t cuda,cudnn,nvtx,mpi --output=./ ./vectorAdd
WARNING: ARMv8 PMU is not available, enabling `sampling-trigger=perf` switch, software events will be used for CPU sampling.
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
Generating '/tmp/nsys-report-6178.qdstrm'
FATAL ERROR: /build/agent/work/20a3cfcd1c25021d/QuadD/Common/GpuTraits/Src/GpuTicksConverter.cpp(376): Throw in function QuadDCommon::TimestampType GpuTraits::GpuTicksConverter::ConvertToCpuTime(const QuadDCommon::Uuid&, uint64_t&) const
Dynamic exception type: boost::exception_detail::clone_impl<QuadDCommon::NotFoundException>
std::exception::what: NotFoundException
[QuadDCommon::tag_message*] = No GPU associated to the given UUID

This does not happen when cuda is not in the trace list:

nvidia@tegra-ubuntu:/usr/local/cuda/samples/0_Simple/vectorAdd$ nsys profile --cudabacktrace=all -t cudnn,nvtx,mpi --output=./ ./vectorAdd
WARNING: ARMv8 PMU is not available, enabling `sampling-trigger=perf` switch, software events will be used for CPU sampling.
WARNING: CUDA backtraces will not be collected because CUDA tracing is disabled.
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
Generating '/tmp/nsys-report-1936.qdstrm'
Failed to create '/usr/local/cuda-11.4/samples/0_Simple/vectorAdd/./.nsys-rep': Permission denied.
[1/1] [========================100%] nsys-report-dc7e.nsys-rep
Generated:
    /tmp/nsys-report-dc7e.nsys-rep

The .qdstrm to .nsys-rep transformation failed. Can you try loading the .qdstrm file into the same GUI version as the CLI you were running? Most often the transformation fails when a library is absent on the target device.
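
As a side note, the second log above also shows "Failed to create '.../vectorAdd/./.nsys-rep': Permission denied". That appears to come from passing --output=./ (a directory with no report name) while running inside the samples directory, which the current user apparently cannot write to. A minimal sketch of giving nsys an explicit report name in a writable location is below. The /tmp default, the NSYS_OUT_DIR variable, and the vectorAdd_report name are assumptions for illustration, not something nsys defines, and this is unrelated to the FATAL ERROR itself.

```shell
#!/bin/sh
# Pick a directory the current user can write to.
# NSYS_OUT_DIR is a hypothetical variable for this sketch; /tmp is an assumed default.
out_dir="${NSYS_OUT_DIR:-/tmp}"

# Hypothetical report base name; nsys adds the .nsys-rep extension itself.
report="$out_dir/vectorAdd_report"

if [ ! -w "$out_dir" ]; then
    echo "Output directory $out_dir is not writable" >&2
elif command -v nsys >/dev/null 2>&1 && [ -x ./vectorAdd ]; then
    # -t selects the APIs to trace, -o sets the report base name.
    nsys profile -t cuda,nvtx -o "$report" ./vectorAdd
else
    # nsys or the sample binary is missing: just show the command.
    echo "Would run: nsys profile -t cuda,nvtx -o $report ./vectorAdd"
fi
```

With an explicit name, the report should land at the chosen path instead of nsys falling back to a generated name under /tmp as in the log.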

The .qdstrm file seems to be broken as well.

@Andrey_Trachenko can you have someone look into this?

I just found that running nsys locally is not compatible with this device; profiling over a remote connection works fine. Thanks.
