I have been using the Xavier AGX for a while. Recently we have been deploying ShuffleNet on it, and I noticed that the DLA is giving me completely wrong results:
The model is basically a customer-trained ShuffleNet v2, from
We converted it to ONNX and loaded it into TensorRT (C++, nvinfer) to run. It runs correctly on the GPU, but on the DLA it gives completely wrong results (images that score 0.98 on the GPU score 0.001 on the DLA).
I am sure the rest of the system is working correctly, because I get the same (but wrong) results from the DLA when I run the same input multiple times, and different results for different inputs. Therefore, I am pretty sure the input and output are set up correctly.
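To make the mismatch concrete, here is a minimal sketch of how the two result buffers could be compared once they are copied back to host memory. It assumes both engines write the same number of float scores; `OutputDiff` and `compareOutputs` are hypothetical names for illustration, not part of the TensorRT API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper (not a TensorRT type): summary of how far two
// host-side output buffers are from each other.
struct OutputDiff {
    float maxAbsDiff;     // largest element-wise absolute difference
    std::size_t argmaxA;  // index of the top score in the first buffer
    std::size_t argmaxB;  // index of the top score in the second buffer
};

// Compare two score buffers, e.g. one produced by the GPU engine and one
// by the DLA engine for the same input image.
OutputDiff compareOutputs(const std::vector<float>& a,
                          const std::vector<float>& b) {
    // Precondition: both buffers hold the same, non-zero number of scores.
    assert(!a.empty() && a.size() == b.size());

    OutputDiff d{0.0f, 0, 0};
    for (std::size_t i = 0; i < a.size(); ++i) {
        d.maxAbsDiff = std::max(d.maxAbsDiff, std::fabs(a[i] - b[i]));
    }
    d.argmaxA = static_cast<std::size_t>(
        std::distance(a.begin(), std::max_element(a.begin(), a.end())));
    d.argmaxB = static_cast<std::size_t>(
        std::distance(b.begin(), std::max_element(b.begin(), b.end())));
    return d;
}
```

With scores like the ones I see (0.98 on the GPU against 0.001 on the DLA), `maxAbsDiff` is close to 1 and the two argmax indices can disagree. That is far more than I would expect from reduced precision alone.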
Can we fully trust the engine-building part of TensorRT to schedule on the DLA only the layers that the DLA can actually run correctly?