I have a model that I’m trying to convert to an INT8, DLA TensorRT engine. I was able to successfully convert the model on JetPack 5.0.1, but on JetPack 5.1.2 I’m facing a few issues.
I’m able to generate FP16, FP16-with-DLA-support, and INT8 engines when NOT providing the calibration cache file (the --calib option in trtexec).
When trying to build an INT8 engine with DLA support and NOT giving the calibration cache file, the process got stuck (it remained stuck even after 24 hrs). I tried a couple of different workspace sizes ranging from 150 MB to 3500 MB, but the issue remained.
When trying to build an INT8 engine with DLA support and giving the calibration cache file, I get: Error[2]: [weightConvertors.cpp::quantizeBiasCommon::337] Error Code 2: Internal Error (Assertion getter(i) != 0 failed.)
Could you please help me convert this model? I’m not sure how the calibration cache file works. Would replacing the 0s with something like 837: 00000001 or 01010101 work, or would that have too much impact on accuracy? (I’ve tried to decode the candidate values in the sketch below.)
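To make the question concrete: my understanding (which may be wrong) is that each cache line after the header has the form tensor_name: hex, where the hex is a big-endian IEEE-754 float32 scale. Decoding my candidate replacement values under that assumption:

```python
import struct

def decode_scale(hexval: str) -> float:
    """Decode a calibration-cache entry as a big-endian IEEE-754 float32 scale.

    Assumption: cache lines look like "837: 00000000" (tensor name, then a
    hex-encoded float32). This is my reading of the file, not an official spec.
    """
    return struct.unpack(">f", bytes.fromhex(hexval))[0]

print(decode_scale("00000000"))  # 0.0       -> the value the build rejects
print(decode_scale("00000001"))  # ~1.4e-45  -> smallest subnormal, effectively zero
print(decode_scale("01010101"))  # ~2.4e-38  -> tiny but nonzero
```

If that decoding is right, 00000001 would effectively still be zero, which is why I’m unsure hand-editing the cache is viable at all.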
Thanks in advance!
NOTE: The code used to generate the calibration file is the same code we have been using since JetPack 4.6.0 (a rough sketch of the general pattern is below).
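For reference, the generation code follows the standard TensorRT Python calibrator pattern, roughly like this sketch. It is a simplified reconstruction, not the exact script: EntropyCalibrator is a placeholder name, and batches stands in for our model-specific data loading and preprocessing.

```python
import os
import numpy as np
import pycuda.autoinit  # creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Sketch of the usual INT8 entropy-calibrator pattern (TensorRT 8.x)."""

    def __init__(self, batches, batch_size, cache_file="calibration.cache"):
        super().__init__()
        self.batches = iter(batches)   # yields preprocessed float32 NCHW arrays
        self.batch_size = batch_size
        self.cache_file = cache_file
        self.device_mem = None

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names):
        try:
            batch = np.ascontiguousarray(next(self.batches), dtype=np.float32)
        except StopIteration:
            return None                # no more data: calibration finishes
        if self.device_mem is None:
            self.device_mem = cuda.mem_alloc(batch.nbytes)
        cuda.memcpy_htod(self.device_mem, batch)
        return [int(self.device_mem)]

    def read_calibration_cache(self):
        # Returning None forces TensorRT to calibrate from scratch.
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()
        return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)
```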
Environment
TensorRT Version: 8.5.2
Device: NVIDIA Jetson Xavier NX
JetPack: 5.1.2
CUDA Version: 11.4
Operating System + Version: Ubuntu 20.04
Thanks for the reply!
I’ve been through the links you mentioned. My main issue is that the same model and code work on JetPack 5.0.1 but not on JetPack 5.1.2, so I was hoping for some assistance.
EDIT: The post has been moved to the correct forum.
Regarding 2: I’ve attached the log file to the post; it is the output generated with the --verbose flag.
Regarding 3: I generated the calibration cache file on the NX running JetPack 5.1.2 itself. As mentioned, I followed the same steps I followed on JetPack 5.0.1, but it didn’t work (the cache is also attached in the post).
Thanks.
Hi,
Apologies for any confusion. I encountered an error in the third scenario (INT8 + DLA with calibration cache file). In the second scenario (INT8 + DLA without providing a cache file), the program became unresponsive, and after waiting for over 25 hours, I had to terminate it. The log file captures events up to the termination point.
Do you wish to see the log file for the third scenario as well?
We have confirmed that the syncpoint timeout issue (TensorRT gets stuck) won’t happen on Orin’s DLA2.
Since our internal team no longer supports DLA1 issues, please use the GPU for inference instead.
We ran a further test with the calibration file in GPU mode, and the same error occurs.
Since GPU mode can run successfully without the calibration file, it looks more like the issue comes from the calibration file itself.
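One quick way to confirm is to decode every scale in the cache and look for zeros, since a zero scale is exactly what the quantizeBiasCommon assertion (getter(i) != 0) rejects. Below is a minimal checker sketch, assuming the usual text layout (a header line such as TRT-8502-EntropyCalibration2 followed by tensor_name: hex-float32 lines):

```python
import struct
import sys

def find_zero_scales(cache_path):
    """List tensors whose calibration scale decodes to 0.0.

    Assumes the common cache layout: a header line, then lines of the form
    "tensor_name: <8 hex digits>" holding a big-endian IEEE-754 float32 scale.
    """
    zeros = []
    with open(cache_path) as f:
        for line in f:
            name, sep, hexval = line.strip().rpartition(":")
            hexval = hexval.strip()
            if not sep or len(hexval) != 8:
                continue  # header line or anything that is not a 4-byte scale
            try:
                scale = struct.unpack(">f", bytes.fromhex(hexval))[0]
            except ValueError:
                continue  # not a hex-encoded value
            if scale == 0.0:
                zeros.append(name.strip())
    return zeros

if __name__ == "__main__":
    for tensor in find_zero_scales(sys.argv[1]):
        print("zero scale:", tensor)
```

If any tensors are reported, regenerating the calibration cache from scratch (rather than hand-editing the zeros) is the safer fix.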