Nsys CLI Issues in DeepStream container

• Hardware Platform (Jetson / GPU)
RTX 5000, x86-64 RHEL
• DeepStream Version
6.0 (nvidia_deepstream:6.0-devel) & 6.1 (nvidia_deepstream:6.1-devel)
• NVIDIA GPU Driver Version (valid for GPU only)
470.57.02 & 510.47.03
• Issue Type( questions, new requirements, bugs)
Possible bug

Hi, I am attempting to use the Nsight Systems CLI (nsys) inside the stock DeepStream 6.0 or 6.1 development containers released by Nvidia. I started testing nsys with the very latest release and stepped back each release until I found a version that worked inside the DeepStream containers. Here are the results:

nsight-systems-2021.2.4_2021.2.4.12-1_amd64.deb - works great
nsight-systems-2021.3.2_2021.3.2.4-1_amd64.deb - nsys: /opt/vulkan/build/Vulkan-Loader/loader/loader.c:5517: loader_layer_create_device: Assertion `pCreateInfo->queueCreateInfoCount >= 1' failed.
[nsight-systems-2021.3.3_2021.3.3.2-1_amd64.deb](https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/nsight-systems-2021.3.3_2021.3.3.2-1_amd64.deb) - nsys: /opt/vulkan/build/Vulkan-Loader/loader/loader.c:5517: loader_layer_create_device: Assertion `pCreateInfo->queueCreateInfoCount >= 1' failed.
nsight-systems-2021.5.2_2021.5.2.53-1_amd64.deb - will not terminate the application after specified delay
nsight-systems-2022.1.3_2022.1.3.3-1_amd64.deb - will not terminate the application after specified delay
NsightSystems-linux-cli-public-2022.1.1.61-1d07dc0.deb (latest from downloads) - will not terminate the application

To test this, compile the NVIDIA sample deepstream-app in the container and run:
nsys profile --wait all --gpu-metrics-set --trace=cuda,cudnn,nvtx,osrt,opengl --delay=10 --duration=2 ./deepstream-app path to config

Why is it necessary to revert to nsys CLI version 2021.2.4 in order to successfully collect a profile in the DeepStream container? The behavior is the same in both the 6.0 and 6.1 development containers.

@rknight can you start the triage on this?

Thanks for reporting these issues streamer_g99.

Do you remember what value you set the --gpu-metrics-set switch to?

Robert

Hi Robert,

This is the command I used; you can test with the sample DeepStream app in the dev container:
nsys profile --wait all --gpu-metrics-set --trace=cuda,cudnn,nvtx,osrt,opengl --delay=10 --duration=2 ./deepstream-app path to config

Flags are:
--wait all
--gpu-metrics-set
--trace=cuda,cudnn,nvtx,osrt,opengl

The documentation for --gpu-metrics-set says:
If not specified, the default is the first metric set that supports all selected GPUs.

This seems to work for the version specified above but for none of the other releases. Is leaving this value empty a problem?
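
For anyone reproducing this, here is a rough sketch of how the metric set could be given explicitly instead of left empty. The `=help` forms are the ones the nsys documentation mentions for listing valid values. The index `0` and the use of `--gpu-metrics-device=all` are assumptions for illustration only, not something confirmed in this thread, and the available options may differ between nsys releases.

```shell
#!/bin/sh
# Hypothetical value for illustration; pick an index or alias from the help output below.
METRICS_SET=0

if command -v nsys >/dev/null 2>&1; then
    # List the metric sets and the devices that this nsys build reports as supported.
    nsys profile --gpu-metrics-set=help
    nsys profile --gpu-metrics-device=help
else
    echo "nsys not found on PATH" >&2
fi

# Same switches as the command in this post, but with an explicit metric set value.
# --gpu-metrics-device=all is an assumption; it is not part of the original command.
PROFILE_ARGS="--wait all --gpu-metrics-device=all --gpu-metrics-set=$METRICS_SET --trace=cuda,cudnn,nvtx,osrt,opengl --delay=10 --duration=2"
echo "nsys profile $PROFILE_ARGS"
```

The script only prints the resulting switches; append the application and its config argument as in the original command when running it for real.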

Hi streamer_g99,

nsys kills the app launched (by nsys) using the sigterm signal at the end of the collection. If the app catches that signal, it might not be terminated.

Check out the nsys CLI’s help description for the --kill switch. Can you try adding '--kill=sigkill' to see if that resolves the issue?

Robert
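
The SIGTERM behavior described above can be reproduced outside of nsys with a small shell sketch. The child process here is a stand-in that ignores SIGTERM; whether deepstream-app actually handles the signal this way is not established in this thread, so treat it purely as an illustration of the mechanism.

```shell
#!/bin/sh
# Stand-in child that ignores SIGTERM, like an app with its own signal handling.
sh -c 'trap "" TERM; while :; do sleep 1; done' &
child=$!
sleep 1  # give the child time to install its trap

# SIGTERM is ignored by the child, so it should still be running afterwards.
kill -TERM "$child"
sleep 1
if kill -0 "$child" 2>/dev/null; then
    after_term=alive
else
    after_term=dead
fi

# SIGKILL cannot be caught or ignored, so the child is terminated.
kill -KILL "$child"
wait "$child" 2>/dev/null
if kill -0 "$child" 2>/dev/null; then
    after_kill=alive
else
    after_kill=dead
fi

echo "after SIGTERM: $after_term, after SIGKILL: $after_kill"
```

This is the difference the --kill switch is meant to address: a process that survives SIGTERM can still be stopped with SIGKILL.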

Hi Robert,

I downloaded the latest nsys CLI (2022.3.4.34-133b775) and tried adding --kill=sigkill, but that did not seem to make a difference. Are you seeing different behavior?
Are there any plans to include the nsys CLI pre-installed in the DeepStream dev container in future releases? This would be very helpful for analysis. Currently, only 2021.2.4 seems to be compatible with DeepStream 6.0/6.1.