I'm using a Jetson Xavier AGX with JetPack 4.3 and TensorRT 6.0.
Using the TensorRT C++ API, I built one network for DLA 0 and one for DLA 1, both without the kGPU_FALLBACK flag, plus one network for the GPU. The build doesn't report any errors.
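For context, the DLA engines are built roughly like this (a minimal sketch of the TensorRT 6 builder-config path; `buildDlaEngine` and the surrounding setup are illustrative, not my exact code — the point is that kGPU_FALLBACK is deliberately not set):

```cpp
#include "NvInfer.h"
using namespace nvinfer1;

// Illustrative helper: build an engine pinned to a single DLA core.
// Assumes `builder` and `network` were already created and populated
// (e.g. via the ONNX parser) with an ILogger implementation.
ICudaEngine* buildDlaEngine(IBuilder* builder, INetworkDefinition* network, int dlaCore)
{
    IBuilderConfig* config = builder->createBuilderConfig();
    config->setMaxWorkspaceSize(1 << 28);
    config->setFlag(BuilderFlag::kFP16);             // DLA requires FP16 or INT8
    config->setDefaultDeviceType(DeviceType::kDLA);  // place all layers on the DLA
    config->setDLACore(dlaCore);                     // 0 or 1 on Xavier AGX
    // Deliberately NOT set:
    // config->setFlag(BuilderFlag::kGPU_FALLBACK);
    ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
    config->destroy();
    return engine;
}
```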
I then run inference with the three engines concurrently.
I expected that the DLA inference wouldn't show up in the GPU profile and that execution would be completely independent between DLA 0, DLA 1, and the GPU.
However, the attached profiler screenshots show otherwise. The first thread is DLA 0, the second the GPU, and the third DLA 1. In the second picture the execution of DLA 1 is visible. Furthermore, if I look at the streams, more kernel executions from the DLA engines are visible (streams 63 and 88 in picture 3).
My questions are:
- Why are kernel executions from the DLA engines visible on the GPU even though the kGPU_FALLBACK flag is not set?
- Is there a way to avoid this fallback so that execution is completely independent?
- How can I get an error at build time if GPU fallback is not enabled and the DLA can't execute a layer?