Nsys cannot collect CUDA information on DRIVE OS 5.1

Hello

We cannot use Nsight Systems to do CUDA profiling on Pegasus. We are using DRIVE OS 5.1 and CUDA 10.2. We cannot currently upgrade to DRIVE OS 5.2 because of compilation issues on 5.2.

I am using NVIDIA Nsight Systems version 2021.1.1.66-6c5c5cb.

The nsys report:

Information Analysis 00:00.000
Profiling has started.
Information Daemon 7628 00:00.001
Process was launched by the profiler, see /tmp/nvidia/nsight_systems/quadd_session_137612/streams/pid_7628_stdout.log and stderr.log for program output
Information Daemon 7628 00:00.001
Profiler attached to the process.
Information Injection 7628 00:00.071
Common injection library initialized successfully.
Information Injection 7628 00:00.144
OS runtime libraries injection initialized successfully.
Information Injection 7628 00:00.311
OpenGL injection initialized successfully.
Warning Analysis 00:10.299
Scheduling information is absent. The thread activity is deduced based on OS runtime libraries traces. This is inacurrate and does not take into account asynchronous interrupts and exception faults.
Information Analysis 7628 00:10.299
Number of NVTX events collected: 483,365.
Warning Analysis 7628 00:10.299
No OpenGL events collected. Does the process use OpenGL?
Warning Analysis 7628 00:10.299
CUDA profiling stopped unexpectedly: Cannot initialize CUDA event collection.
Warning Analysis 7628 00:10.299
No CUDA events collected. Does the process use CUDA?
Information Analysis 7628 00:10.299
Number of OS runtime libraries events collected: 1,626,318.
Information Injection 7628 00:19.019
Buffers holding CUDA trace data will be flushed on CudaProfilerStop() call.
Warning Injection 7628 00:19.148
CUDA injection initialization failed.
Information Injection 7628 00:23.860
NVTX injection initialized successfully.
Information Analysis 04:19.619
Profiling has stopped.

We need this tool for profiling, and it is critical for us to debug GPU problems. Thanks.
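
In a long report like the one above, the warnings can be pulled out with a short filter. This is only a sketch: it assumes the report has been saved as plain text in the same two-line layout as the paste (a severity/source/timestamp header line followed by the message line), and the function name nsys_warnings is made up for illustration.

```shell
# Print each "Warning ..." header together with the message line that follows it.
# Assumes every entry is exactly two lines: header, then message.
nsys_warnings() {
    awk '
        /^Warning /     { header = $0; want = 1; next }  # remember the warning header
        /^Information / { want = 0; next }                # skip informational entries
        want            { print header ": " $0; want = 0 } # message belonging to a warning
    '
}

# Demo with a few lines taken from the report in this thread.
nsys_warnings <<'EOF'
Information Injection 7628 00:00.311
OpenGL injection initialized successfully.
Warning Analysis 7628 00:10.299
CUDA profiling stopped unexpectedly: Cannot initialize CUDA event collection.
Warning Injection 7628 00:19.148
CUDA injection initialization failed.
EOF
```

With the report saved to a file (report.txt is a placeholder name), the same filter would be run as nsys_warnings < report.txt.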

Dear @shangping.guo,
Are the CUDA samples working normally? What is the CUDA driver version?
CUDA profiling stopped unexpectedly: Cannot initialize CUDA event collection.

This error is expected when the CUDA driver versions differ.

If your host has multiple installations of DRIVE releases (CUDA/devtools), we recommend removing all the others to avoid compatibility issues.
Could you share the CUDA version/driver details and the DRIVE OS version?

The versions are stated in the post: DRIVE OS 5.1 and CUDA 10.2. The simple CUDA samples do not work with nsys either.

Dear @shangping.guo,
As I understand it, you are trying to profile a remote application on the target using Nsight Systems.

Also, please confirm the following:

  • Could you share the output of cat /etc/nvidia/version-ubuntu-rootfs.txt on the target?
  • Does the host have multiple CUDA versions installed? Please share the output of nvidia-smi.
  • Could you check whether the compiled CUDA sample runs on the target when you manually copy the executable to the target?

I am running nsys on the Pegasus directly. I am not sure what you mean by remote profiling.

nvidia@pegasus2a:~$ cat /etc/nvidia/version-ubuntu-rootfs.txt
5.1.6.1-16902563

nvidia@pegasus2a:~$ nvidia-smi
-bash: nvidia-smi: command not found

For 5.2.0, please use the Nsight Systems installed under /opt/nvidia/nsight-systems/2019.5.2 (host or target). Thanks.

For some specific reason, we have to keep 5.1 for now. Thanks

How can we check whether we have multiple CUDA versions installed? Thanks
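
One rough way to check for multiple CUDA versions is to list the toolkit directories. The sketch below assumes the usual Linux layout, where each toolkit lives in /usr/local/cuda-X.Y and /usr/local/cuda is a symlink to the active one; that layout is an assumption about this system, and the function name list_cuda_toolkits is made up for illustration.

```shell
# List CUDA toolkit directories under a prefix (default: /usr/local).
# Symlinks such as /usr/local/cuda are shown together with their target.
list_cuda_toolkits() {
    local prefix dir
    prefix="${1:-/usr/local}"
    for dir in "$prefix"/cuda*; do
        # When nothing matches, the glob stays unexpanded; skip it.
        [ -e "$dir" ] || continue
        if [ -L "$dir" ]; then
            printf '%s -> %s\n' "$dir" "$(readlink -f "$dir")"
        else
            printf '%s\n' "$dir"
        fi
    done
    return 0
}

list_cuda_toolkits /usr/local
```

More than one cuda-X.Y line in the output suggests that several toolkits are installed side by side. On Debian-based systems, dpkg -l | grep -i cuda gives a package-level view of the same question.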

You should therefore use the Nsight Systems version that was installed along with your release (DRIVE OS 5.1.6.1 means you installed DRIVE Software 10.0).
Please use the one installed in the /opt/nvidia/nsightsystems/nsightsystems-2019.3.4 folder. Thanks.
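
To see which Nsight Systems versions are actually present on a machine, the install tree can be searched for nsys binaries. This is a sketch that assumes the installs sit somewhere under /opt/nvidia, as the paths in this thread suggest; the function name list_nsys_installs is made up for illustration.

```shell
# Find every executable named "nsys" under a root directory (default: /opt/nvidia)
# and print its path together with the first line of its --version output.
list_nsys_installs() {
    local root bin
    root="${1:-/opt/nvidia}"
    if [ ! -d "$root" ]; then
        echo "no such directory: $root" >&2
        return 0
    fi
    find "$root" -type f -name nsys 2>/dev/null | sort | while IFS= read -r bin; do
        [ -x "$bin" ] || continue
        printf '%s: %s\n' "$bin" "$("$bin" --version 2>/dev/null | head -n 1)"
    done
}

list_nsys_installs /opt/nvidia
```

If only a 2021 binary shows up, the version shipped with the release is probably not installed on that machine.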

I did not see anything except the 2021 version. I will download and install 2019.3.6 instead; is that OK?

Are you using the host system that was used to install DRIVE Software 10.0? If not, you can install it again (but skip the flashing part).

We are using CUDA 10.2. I cannot install Nsight 2019.3 (there seems to be no CLI for ARM in the download).
I am new to GPU work, and the relationship between the CUDA version and the DRIVE Software version is not clear to me. Can you explain?

Here is the deviceQuery output:
./deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 2 CUDA Capable device(s)

Device 0: “Graphics Device”
CUDA Driver Version / Runtime Version 10.2 / 10.2
CUDA Capability Major/Minor version number: 7.5
Total amount of global memory: 7680 MBytes (8052998144 bytes)
(44) Multiprocessors, ( 64) CUDA Cores/MP: 2816 CUDA Cores
GPU Max Clock rate: 1500 MHz (1.50 GHz)
Memory Clock rate: 1440 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 4194304 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 1024
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 3 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 1 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Device 1: “Xavier”
CUDA Driver Version / Runtime Version 10.2 / 10.2
CUDA Capability Major/Minor version number: 7.2
Total amount of global memory: 27924 MBytes (29280559104 bytes)
( 8) Multiprocessors, ( 64) CUDA Cores/MP: 512 CUDA Cores
GPU Max Clock rate: 1109 MHz (1.11 GHz)
Memory Clock rate: 1109 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 524288 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: Yes
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 0 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Peer access from Graphics Device (GPU0) → Xavier (GPU1) : No
Peer access from Xavier (GPU1) → Graphics Device (GPU0) : No

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 2
Result = PASS
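
The driver and runtime versions asked about earlier can be read from the deviceQuery summary line. The sketch below only checks that the two values reported by deviceQuery agree; it says nothing about whether a given Nsight Systems release supports that driver. The function name cuda_versions_match is made up for illustration.

```shell
# Read deviceQuery output on stdin, print the driver and runtime versions from
# the summary line, and return success only if both are present and equal.
cuda_versions_match() {
    local summary driver runtime
    summary=$(grep '^deviceQuery, CUDA Driver' | tail -n 1)
    driver=$(printf '%s\n' "$summary" | sed -n 's/.*CUDA Driver Version = \([0-9.]*\).*/\1/p')
    runtime=$(printf '%s\n' "$summary" | sed -n 's/.*CUDA Runtime Version = \([0-9.]*\).*/\1/p')
    printf 'driver=%s runtime=%s\n' "$driver" "$runtime"
    [ -n "$driver" ] && [ "$driver" = "$runtime" ]
}

# Demo with the summary line from the output above.
printf '%s\n' 'deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 2' | cuda_versions_match
```

On the target, the same check would be run as ./deviceQuery | cuda_versions_match.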

Please check “6.3 Software Versions” in the DRIVE Software 10.0 Release Notes.
The software versions listed there are the ones that were verified with the release.

Also, the NVIDIA Nsight Systems User Guide is written for that version.

My host machine is ARM, and it seems SDK Manager cannot be installed on it. Is there any way to install DRIVE Software 10.0?
See the requirements: NVIDIA SDK Manager | NVIDIA Developer

Please see Getting Started with DRIVE Software :: NVIDIA DRIVE AGX System Installation and Setup. It supports only x86_64.

Sorry Vicky, you lost me. What can I do, then?

You need an x86_64 host system with Ubuntu Desktop 18.04 LTS to run SDK Manager and install the packages in the release.

I have such a machine, but it is not connected to the Pegasus. Will that work? (The Pegasus is ARM + Xavier.)

Actually, I just ran the installation on my x86 machine, but it failed:
[image]

Dear @shangping.guo,
It looks like apt-get is not in a stable state. Could you please fix the issues with apt before trying to flash the target?