How to use dGPUs

Hi guys,
As we know, there are two GPUs in the AGX: the internal GPU (the weaker one) and the external PCI GPU (the stronger one).
Is there any documentation or example code showing how to use the two GPUs separately?

Another question: the tegrastats command always shows GPU usage as 0%. Are there any other validated GPU monitoring tools?

$ sudo tegrastats 
08-04-2021 18:27:29 RAM 4526/28305MB (lfb 140x4MB) CPU [0%@1817,0%@1817,0%@1817,0%@1817,0%@1817,0%@1817,0%@1817] EMC_FREQ @2133 GR3D_FREQ 0%@1109 APE 245 GR3D_PCI % 0% AUX@33C CPU@32.5C Tdiode@34.25C AO@33C GPU@32C tj@43.5C
08-04-2021 18:27:30 RAM 4526/28305MB (lfb 140x4MB) CPU [0%@1817,0%@1817,0%@1817,0%@1817,0%@1817,0%@1817,0%@1817] EMC_FREQ @2133 GR3D_FREQ 0%@1109 APE 245 GR3D_PCI % 0% AUX@33C CPU@32.5C Tdiode@34.25C AO@33C GPU@32.5C tj@43.5C
08-04-2021 18:27:31 RAM 4526/28305MB (lfb 140x4MB) CPU [0%@1817,0%@1817,0%@1817,0%@1817,0%@1817,0%@1817,0%@1817] EMC_FREQ @2133 GR3D_FREQ 0%@1109 APE 245 GR3D_PCI % 0% AUX@33C CPU@32.5C Tdiode@34.25C AO@33C GPU@32.5C tj@43.5C

Dear @Peter_Pertrili,
If you use CUDA directly, you can call cudaSetDevice (see CUDA Runtime API :: CUDA Toolkit Documentation), or you can set the environment variable CUDA_VISIBLE_DEVICES to 0 (for dGPU) or 1 (for iGPU).
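As a rough illustration of the environment-variable approach, here is a minimal Python sketch that launches a program with only one CUDA device visible. The helper names `env_for_gpu` and `run_on_gpu` and the binary `./my_cuda_app` are made up for this example, and which index maps to which GPU should be confirmed on the target (for instance with the deviceQuery sample) rather than taken from this sketch.

```python
import os
import subprocess


def env_for_gpu(device_index, base_env=None):
    """Return a copy of the environment that exposes a single CUDA device."""
    env = dict(os.environ if base_env is None else base_env)
    # The CUDA runtime reads CUDA_VISIBLE_DEVICES when it initializes,
    # so the variable must be set before the child process starts.
    env["CUDA_VISIBLE_DEVICES"] = str(device_index)
    return env


def run_on_gpu(command, device_index):
    """Run `command` (a list of arguments) restricted to one GPU."""
    return subprocess.run(command, env=env_for_gpu(device_index), check=True)


if __name__ == "__main__":
    # Hypothetical usage; "./my_cuda_app" is a placeholder, not a real binary:
    # run_on_gpu(["./my_cuda_app"], 0)
    print(env_for_gpu(0, base_env={})["CUDA_VISIBLE_DEVICES"])
```

Inside the launched process the selected GPU then appears as device 0, so the application itself needs no changes.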

Tegrastats shows only iGPU utilization. For the dGPU, you can use other profiling tools such as nsys.
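If you want to log the iGPU figure from tegrastats, a small Python sketch like the one below can pull the GR3D_FREQ field out of each line. It assumes the `GR3D_FREQ <load>%@<value>` layout seen in the output earlier in this thread; reading the second number as a clock frequency is my assumption, and `parse_gr3d` is a made-up helper name.

```python
import re

# Matches e.g. "GR3D_FREQ 0%@1109" as printed by tegrastats in this thread.
GR3D_PATTERN = re.compile(r"GR3D_FREQ (\d+)%@(\d+)")


def parse_gr3d(line):
    """Return (load_percent, value_after_at) from a tegrastats line, or None."""
    match = GR3D_PATTERN.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    # Excerpt of one line from the tegrastats output posted above.
    sample = "EMC_FREQ @2133 GR3D_FREQ 0%@1109 APE 245 GR3D_PCI % 0% AUX@33C"
    print(parse_gr3d(sample))
```

The same function can be applied to each line read from a running tegrastats process to build a simple utilization log.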

What is iGPU?
Is there any documentation for nsys, for example on how to install and use it?

Is there some simple example code for testing the dGPU?

Dear @Peter_Pertrili,
iGPU means integrated GPU, which is part of the Xavier SoC. dGPU means discrete GPU, which is connected via NVLink.

Nsight Systems (nsys) comes with the DRIVE release and should be present at /usr/local/cuda/bin/. Make sure your system has only one CUDA version, the one installed with the DRIVE release flashed on the target. Please check NVIDIA Documentation for more details.

I think the comment above should be updated to the following:
CUDA_VISIBLE_DEVICES to 0 (for iGPU) and 1 (for dGPU).
Is that correct?

I only installed DRIVE OS 5.2, without DriveWorks. Does that mean nsys is included with DriveWorks?

# Output on my own AGX
/usr/local/cuda-10.2/bin$ ls
cuda-gdb  cuda-gdbserver  cuda-memcheck  cuobjdump  nvdisasm

Dear @Peter_Pertrili,
In general, on a multi-GPU system, GPU device numbers are enumerated in order of performance. You can check the order by running the CUDA deviceQuery sample.
On DRIVE AGX Pegasus, 0 means dGPU and 1 means iGPU.

Could you confirm whether you checked for nsys on the host?

For DRIVE OS 5.2.0, it is installed in /opt/nvidia/nsight-systems/2019.5.2. Thanks.
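To check where nsys ended up on a given machine, a short Python sketch like this can search the locations mentioned in this thread. The helper name `find_nsys` is made up, and since the exact subdirectory under the install path is not stated here, the sketch simply walks the directory trees.

```python
import os

# Locations mentioned in this thread; adjust for your own installation.
CANDIDATE_DIRS = ["/usr/local/cuda/bin", "/opt/nvidia/nsight-systems"]


def find_nsys(search_dirs=CANDIDATE_DIRS):
    """Return paths of executable files named 'nsys' under the given directories."""
    found = []
    for root_dir in search_dirs:
        # os.walk yields nothing for directories that do not exist.
        for dirpath, _dirnames, filenames in os.walk(root_dir):
            if "nsys" in filenames:
                path = os.path.join(dirpath, "nsys")
                if os.access(path, os.X_OK):
                    found.append(path)
    return found


if __name__ == "__main__":
    for path in find_nsys():
        print(path)
```

An empty result on both host and target would suggest that Nsight Systems was not selected during installation.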

What type of dGPU does the AGX have? Where can I find documentation for this dGPU and a block diagram of the AGX?

Another question: can we use jtop on the AGX?

Dear @Peter_Pertrili,
The DRIVE AGX Pegasus platform has a Volta- or Turing-based dGPU.

Does the CUDA for Tegra :: CUDA Toolkit Documentation help?

For the dGPU spec, could you cross-compile the CUDA deviceQuery sample (CUDA Samples :: CUDA Toolkit Documentation) and run it on the target?

Please check CUDA Samples :: CUDA Toolkit Documentation for more details on cross-compilation.