I have a proprietary network, which I cannot share. The final output is a 3x2 matrix of values.
When I run it on the GPU, it works fine. Running on the DLA gives four correct values, and the last two are NaN.
Any idea how to debug this? It must be a bug in the DLA/driver?
Please note that not all TensorRT layers are supported by the DLA.
You can find some details in the document below:
There are also some limitations on layer configurations, such as padding type, kernel size, … .
The most common issue is that some layer configuration is outside DLA support,
and in some cases the DLA neither reports an error nor falls back to the GPU.
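One way to check for this is to build the engine with GPU fallback enabled and inspect the per-layer device placement. A minimal sketch using trtexec (the model path is a placeholder; adjust the DLA core index for your board):

```shell
# Build and run the model on DLA core 0, allowing any layers the DLA
# cannot handle to fall back to the GPU.
# --verbose prints per-layer device placement, so you can see which
# layers actually ran on the DLA vs. the GPU.
trtexec --onnx=model.onnx \
        --useDLACore=0 \
        --allowGPUFallback \
        --verbose
```

If the output becomes correct with fallback enabled, the layers that fell back to the GPU in the verbose log are good candidates for the ones producing NaN on the DLA.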
Hi @AastaLLL ,
Thanks for the answer. I'll try to reproduce it with a public network and/or pinpoint which layer is causing this.
Based on your answer, I guess this is a bug, since it should have refused to run on the DLA, fallen back to the GPU, or at least reported an error?
We do see a similar issue when running batch inference on the DLA.
It has been fixed internally and will be available in our next major JetPack release.