Image Segmentation using UNet + TensorRT

Hi, I am trying to do image segmentation using TensorRT and UNet by following the instructions from this link:
However, when I run "./unet -d image.jpg", it returns the result for a 64x64 input image after roughly 2 s. Is there any way to speed this up on a Jetson TX2? I am hoping for something under 0.9 s. Thank you.


Did you use fp32 or fp16 when you measured the 2 s?
If fp32 is used, please give fp16 a try.

Also, you can boost the device performance with the following commands:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks
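
To check whether these settings actually help, it may be useful to time several runs before and after applying them. Below is a minimal sketch, not part of the UNet sample: avg_ms is a hypothetical helper name, and it assumes bash and GNU date (for nanosecond timestamps). Note that it measures the whole process end to end, so model loading is included in the number along with the inference itself.

```shell
#!/usr/bin/env bash
# avg_ms N CMD [ARGS...]
# Hypothetical helper (not part of the UNet sample): runs CMD N times and
# prints the mean wall-clock time per run in milliseconds.
# Assumes GNU date, which supports %N (nanoseconds).
avg_ms() {
    local runs=$1; shift
    local total=0 start end i
    for ((i = 0; i < runs; i++)); do
        start=$(date +%s%N)
        "$@" > /dev/null 2>&1        # discard the command's own output
        end=$(date +%s%N)
        total=$((total + (end - start) / 1000000))
    done
    echo $((total / runs))
}

# Example usage on the TX2, from the sample's directory:
#   sudo nvpmodel -q               # print the currently active power mode
#   avg_ms 5 ./unet -d image.jpg   # mean end-to-end time over 5 runs, in ms
```

Running it once before and once after the two commands above should show how much of the 2 s is due to clock settings. If the number barely changes, the time is probably dominated by something other than the GPU clocks.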