Optimize caffemodel to run faster on Jetson TX2

I am trying to speed up OpenPose inference on the Jetson TX2.

  • OpenPose - https://github.com/CMU-Perceptual-Computing-Lab/openpose
  • Built OpenPose and Caffe using the following script - https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/doc/installation_jetson_tx2_jetpack3.3.md

Unfortunately, inference currently runs at only ~1.5 FPS.

I am trying to use TensorRT, for example to convert the model to FP16 and take advantage of the TX2 GPU's native FP16 support. However, the TensorRT Python API is reportedly not supported on the Jetson platform because of its PyCUDA dependency, so it is not possible to write a Python script that takes a ‘.caffemodel’ file and optimizes it.

Some more specs:

  • TensorRT version:
  • L4T version: 8.2
  • Ubuntu 16.04

Are there any tools or scripts that can optimize a Caffe model, given its ‘.caffemodel’ and ‘.prototxt’ files, to run faster on the Jetson TX2?


Because of the Jetson’s constraints, you can optimize your model on another device instead. Transfer the generated plan file to the Jetson and create a C++ inference engine from it by following this guide: https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#c_topics
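As a rough sketch of the loading step on the Jetson side, something like the following could work. It assumes the TensorRT 4-era C++ API that ships with JetPack 3.3 (`createInferRuntime`, `deserializeCudaEngine` with a plugin-factory argument, `destroy()`); newer TensorRT releases changed some of these signatures. It also assumes the plan was built for the TX2's GPU with the same TensorRT version, since serialized engines are not portable across GPUs or versions. `readPlanFile`, `runFromPlan`, and the `USE_TENSORRT` macro are names made up for this illustration, not part of TensorRT or OpenPose.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: read a serialized TensorRT engine ("plan" file) into memory.
std::vector<char> readPlanFile(const std::string& path) {
    // Open at the end so tellg() reports the file size.
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file) {
        throw std::runtime_error("cannot open plan file: " + path);
    }
    const std::streamoff size = file.tellg();
    if (size < 0) {
        throw std::runtime_error("cannot determine size of: " + path);
    }
    file.seekg(0, std::ios::beg);
    std::vector<char> buffer(static_cast<std::size_t>(size));
    file.read(buffer.data(), size);
    if (file.gcount() != size) {
        throw std::runtime_error("cannot read plan file: " + path);
    }
    return buffer;
}

#ifdef USE_TENSORRT  // hypothetical macro: define it when building against TensorRT
#include <NvInfer.h>

// Minimal logger; the TensorRT runtime requires one.
class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) override {
        if (severity != Severity::kINFO) {
            std::cerr << msg << std::endl;
        }
    }
};

// Hypothetical helper: deserialize the plan and create an execution context.
void runFromPlan(const std::string& path) {
    Logger logger;
    const std::vector<char> plan = readPlanFile(path);
    assert(!plan.empty());

    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
    // The third argument is an optional plugin factory; none is assumed here.
    nvinfer1::ICudaEngine* engine =
        runtime->deserializeCudaEngine(plan.data(), plan.size(), nullptr);
    if (engine == nullptr) {
        runtime->destroy();
        throw std::runtime_error("failed to deserialize engine: " + path);
    }
    nvinfer1::IExecutionContext* context = engine->createExecutionContext();

    // ... allocate device buffers and enqueue inference here ...

    // Release in reverse order of creation.
    context->destroy();
    engine->destroy();
    runtime->destroy();
}
#endif
```

Without `USE_TENSORRT` defined, only the file-reading helper is compiled, so that part can be tried on any machine; on the Jetson you would define the macro and link against `nvinfer`.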

I hope this helps.