Half float (FP16) does not accelerate inference in TensorRT

Hi,

I have just switched from the Caffe framework to TensorRT. I wrote a plugin for PReLU, but when I benchmark FP32 against FP16, the speed is not much different.
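For reference, the FP16 engine is built along these lines (a minimal sketch assuming the TensorRT 3.x Caffe-parser API shipped for TX2; gLogger, the model file names, and the "prob" output name are placeholders):

#include "NvInfer.h"
#include "NvCaffeParser.h"

using namespace nvinfer1;
using namespace nvcaffeparser1;

IBuilder* builder = createInferBuilder(gLogger);   // gLogger: your ILogger
INetworkDefinition* network = builder->createNetwork();
ICaffeParser* parser = createCaffeParser();

// Parse the Caffe weights as FP16 so the builder can select half-precision kernels.
const IBlobNameToTensor* blobs =
    parser->parse("deploy.prototxt", "model.caffemodel", *network, DataType::kHALF);
network->markOutput(*blobs->find("prob"));         // "prob": placeholder output name

builder->setMaxBatchSize(1);
builder->setMaxWorkspaceSize(16 << 20);
builder->setHalf2Mode(true);                       // request paired-FP16 kernels

ICudaEngine* engine = builder->buildCudaEngine(*network);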

Thanks.

Hi,

Could you share more information about your use case?

Here are some initial suggestions:

1. Please remember to maximize TX2 performance first:

sudo ./jetson_clocks.sh

2. It’s recommended to use the TensorRT profiler to identify the bottleneck layer (see the profiler sketch after this list).
Please check our native sample for reference:
/usr/src/tensorrt/samples/sampleGoogleNet/sampleGoogleNet.cpp

3. FP16 cuts memory use in half but does not always double performance.
The time to process a specific layer (e.g., an inner-product layer) may even be longer in FP16 mode.
It is encouraged to compare per-layer performance between FP16 and FP32 (see the timing sketch after this list).
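For suggestion 2, here is a minimal profiler sketch (it assumes the IProfiler callback interface of this TensorRT generation; LayerProfiler, context, buffers, and batchSize are placeholder names):

#include <cstdio>
#include "NvInfer.h"

// Reports the time spent in each layer. Custom plugin layers such as PReLU
// show up under their layer name, so you can see whether they dominate.
struct LayerProfiler : public nvinfer1::IProfiler
{
    void reportLayerTime(const char* layerName, float ms) override
    {
        std::printf("%-50s %8.3f ms\n", layerName, ms);
    }
};

LayerProfiler profiler;
context->setProfiler(&profiler);       // context: your IExecutionContext
context->execute(batchSize, buffers);  // per-layer times print after each run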
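For suggestion 3, FP16 and FP32 can be compared end to end by timing the same input through two engines, one built with half2 mode and one without (a sketch using CUDA events; timeEngine is a hypothetical helper):

#include <cuda_runtime.h>
#include "NvInfer.h"

// Average milliseconds per batch over a number of iterations.
float timeEngine(nvinfer1::IExecutionContext* ctx, void** buffers, int batchSize, int iters)
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    for (int i = 0; i < iters; ++i)
        ctx->execute(batchSize, buffers);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms / iters;
}

Running enough iterations to amortize launch overhead gives a stable average; comparing that average with the per-layer profile shows whether a single layer (for example, a custom plugin) is what erases the FP16 gain.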

Thanks.