iGPU and dGPU role in Drivepx2.

navaznazar · February 20, 2018, 11:16am

Hi,

How can both the discrete and integrated GPUs be used effectively at run time on Tegra?
For example, I have an imaging application in which part of the pre-processing and the actual processing can run in parallel.
Is it possible to run these two parts separately, on the iGPU and the dGPU respectively?

As I understand it, a context is bound to only one device at a time, so to use the second device we need to create another context on that device.
If so, will switching between the contexts degrade performance?

Also, if the application needs to transfer data from the iGPU to the dGPU, is that feasible? I don't see any APIs for it. Or does the transfer have to go iGPU->Host and then Host->dGPU?

On the same note, is it possible to use both Tegra A and Tegra B in parallel at the same time?
If yes, kindly provide a link or the path to any samples available in the DriveWorks SDK.

Kindly clarify the above points and correct me if my understanding is wrong.

SivaRamaKrishnaNV · February 20, 2018, 5:44pm

Dear navaznazar,
You can try creating two threads, one holding a context on the iGPU and the other holding a context on the dGPU. Both GPUs can then be used at the same time.
There are no APIs to transfer data directly from the iGPU to the dGPU. However, you can use EGLStream to transfer frames from the iGPU to the dGPU without an additional memcpy to host memory. Please check whether EGLStream fits your use case.
Tegra A and Tegra B are like separate systems connected via Ethernet. You can run different applications on Tegra A and Tegra B at the same time.
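
For readers who want to try the two-thread suggestion, here is a minimal sketch using the CUDA runtime API. It is an illustration under assumptions, not a DriveWorks sample: the `scale` kernel and the `worker` helper are hypothetical names, and the device indices for the iGPU and dGPU are not assumed but reported at run time through the `integrated` field of `cudaDeviceProp`. Each host thread calls `cudaSetDevice`, so its later runtime calls go to that device. If only one device is visible, both threads fall back to device 0.

```cuda
#include <cstdio>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for real work: scales each element in place.
__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Runs on its own host thread. cudaSetDevice binds the calling thread to
// `device`, so the allocation, copies, and launch below all use that GPU.
void worker(int device, float factor, std::vector<float>* host, bool* ok) {
    *ok = false;
    const int n = static_cast<int>(host->size());
    const size_t bytes = n * sizeof(float);
    float* dev = nullptr;
    if (cudaSetDevice(device) != cudaSuccess) return;
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return;
    cudaMemcpy(dev, host->data(), bytes, cudaMemcpyHostToDevice);
    scale<<<(n + 255) / 256, 256>>>(dev, n, factor);
    // The blocking copy back also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(host->data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    *ok = (err == cudaSuccess);
}

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        std::printf("no CUDA device found\n");
        return 1;
    }
    // Report which index is the integrated GPU instead of assuming it.
    for (int d = 0; d < count; ++d) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, d);
        std::printf("device %d: %s (integrated=%d)\n", d, prop.name, prop.integrated);
    }
    const int devA = 0;
    const int devB = (count > 1) ? 1 : 0;  // fall back to one GPU if needed
    std::vector<float> a(1024, 1.0f), b(1024, 1.0f);
    bool okA = false, okB = false;
    std::thread t1(worker, devA, 2.0f, &a, &okA);
    std::thread t2(worker, devB, 3.0f, &b, &okB);
    t1.join();
    t2.join();
    std::printf("thread A ok=%d, thread B ok=%d\n", okA, okB);
    return (okA && okB) ? 0 : 1;
}
```

Because each thread stays on its own device for its whole lifetime, neither thread switches devices in this pattern. Passing results from one GPU to the other is not shown here; that is where EGLStream or a copy through host memory would come in.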

jinlingge · March 27, 2018, 5:12am

Dear SivaRamaKrishna,

I am wondering whether Tegra A and Tegra B can be used at the same time for one single neural network. That is, we want to utilize all the iGPUs and dGPUs on the device, so that ideally four GPUs run inference for our NN model. Is this possible? If so, would you please point us to the documentation that explains how to achieve it?

We understand that NCCL 2.0 supports multiple machines and multiple GPUs, but the software we've found is only for amd64, not for the ARM architecture. Is there a version of NCCL specifically for the PX2, so that we can make full use of all the on-board GPUs for inference tasks? If not, is this feature on your roadmap?

Thanks in advance and looking forward to your reply.

Cheers,
Lingge

SivaRamaKrishnaNV · March 27, 2018, 8:16am

Dear jinlingge,
NCCL is not supported on the DrivePX2 platform. If you want to use 4 GPUs for a single network, the network needs to be distributed across the 4 GPUs, which results in more data transfers across PCIe. Instead, you can assign one network per GPU and run multiple networks in parallel. We also provide the TensorRT library to optimize network models on DrivePX2, which you may want to check.
Please let us know if you have a use case that requires NCCL for inference.
