I am using a TX2 NX to build and run a TRT engine. I have made my ONNX model, and it is being converted into a .trt engine file. I am using a basic MobileNetV2 model here.
The commands to build the FP16 model and the FP32 model:
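Roughly, the builds look something like the following trtexec invocations (assuming trtexec is the build tool; model.onnx and the engine file names are placeholders, not the exact ones from my setup):

    # FP32 build (trtexec defaults to FP32 when no precision flag is given)
    /usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=mobilenetv2_fp32.trt

    # FP16 build (--fp16 allows, but does not force, FP16 kernels)
    /usr/src/tensorrt/bin/trtexec --onnx=model.onnx --fp16 --saveEngine=mobilenetv2_fp16.trt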
The inference time for both models is exactly the same, which basically means both engines are effectively running in FP32. Am I right?
If so, how do I improve my model's performance further by moving it to FP16 precision? I have a little accuracy to spare, but not compute.
Hi,
Please refer to the links below related to custom plugin implementation and samples:
While the IPluginV2 and IPluginV2Ext interfaces are still supported for backward compatibility with TensorRT 5.1 and 6.0.x respectively, we recommend that you write new plugins or refactor existing ones to target the IPluginV2DynamicExt or IPluginV2IOExt interfaces instead.
Hey @NVES, I do not think I understand why I would want to alter the ONNX graph, since I have no custom layers; it is just the standard MobileNetV2.
I need to know whether the .trt engine can be made faster by moving to FP16 precision instead of FP32. If so, what would the process be?
Which version of TensorRT are you using?
It’s possible if many layers end up falling back to FP32: TensorRT automatically chooses the best kernel from the available precisions, so enabling FP16 does not guarantee that every layer actually runs in FP16.
Please check the verbose build logs, and share the ONNX model and the verbose logs with us.
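For example, one way to capture a verbose build log with trtexec (file names here are placeholders) is:

    # Build with FP16 enabled and keep the full verbose log;
    # the log shows which precision/tactic TensorRT selects for each layer.
    /usr/src/tensorrt/bin/trtexec --onnx=model.onnx --fp16 --verbose \
        --saveEngine=mobilenetv2_fp16.trt 2>&1 | tee fp16_build_verbose.log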
We have not verified this on the TX2 NX. Please try the following commands on the TX2 NX.
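(The exact commands from this reply are not shown; a typical set for this kind of check on a Jetson would be roughly the following, with the workspace size only an example value.)

    # Lock the board to maximum performance before benchmarking
    sudo nvpmodel -m 0
    sudo jetson_clocks

    # Rebuild with FP16 enabled and a larger builder workspace (size in MB)
    /usr/src/tensorrt/bin/trtexec --onnx=model.onnx --fp16 --workspace=2048 \
        --saveEngine=mobilenetv2_fp16.trt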
If you still face this issue, we would like to move this post to the TX2 NX forum to get better help.
Hey @spolisetty, increasing the workspace has not produced any gain; this might be because we were already using more than the maximum needed. The TX2 NX GPU does not support INT8, I believe, which is why there was no timing improvement when I ran the build in INT8.
Hey @AastaLLL, yes, the TX2 is running at max power and jetson_clocks was running. Unfortunately, the JetPack 4.6.2 BSP is not available from the carrier board manufacturer. Is there an alternative to this?