First, I’d like to say how great the trt_pose GitHub project is.
Working on a TX2 (8 GB), I am seeing an inference time of around 10 ms.
Strangely, when I save the model to ONNX and load it into a C++ TensorRT project (based on the mnist ONNX sample), inference time goes up to 40+ ms, roughly 4 times worse.
Is there any way to deploy the model in C++ without this performance sacrifice?
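
In case it's relevant, here is a minimal sketch of the engine-building step in my C++ project, adapted from the mnist ONNX sample (TensorRT 5.x API). One thing I'm unsure about: the trt_pose notebooks appear to convert the model with torch2trt using fp16_mode=True, while the mnist sample builds the engine in FP32, so I've added an FP16 flag here as an experiment. The model path and workspace size are just placeholders.

```cpp
#include <NvInfer.h>
#include <NvOnnxParser.h>
#include <iostream>

// Minimal logger required by the TensorRT API.
class Logger : public nvinfer1::ILogger
{
    void log(Severity severity, const char* msg) override
    {
        if (severity <= Severity::kWARNING)
            std::cout << msg << std::endl;
    }
} gLogger;

int main()
{
    // Placeholder path; replace with the ONNX file exported from trt_pose.
    const char* onnxPath = "trt_pose_model.onnx";

    auto builder = nvinfer1::createInferBuilder(gLogger);
    auto network = builder->createNetwork();
    auto parser  = nvonnxparser::createParser(*network, gLogger);

    if (!parser->parseFromFile(onnxPath,
            static_cast<int>(nvinfer1::ILogger::Severity::kWARNING)))
    {
        std::cerr << "Failed to parse ONNX model" << std::endl;
        return 1;
    }

    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(1 << 28);   // 256 MB scratch space (placeholder)

    // Assumption: the Python path builds the engine with fp16_mode=True,
    // so enable FP16 here as well when the platform supports it.
    if (builder->platformHasFastFp16())
        builder->setFp16Mode(true);

    auto engine = builder->buildCudaEngine(*network);
    if (!engine)
    {
        std::cerr << "Engine build failed" << std::endl;
        return 1;
    }

    // ... serialize the engine / run inference as in the mnist ONNX sample ...

    parser->destroy();
    network->destroy();
    builder->destroy();
    engine->destroy();
    return 0;
}
```

Would building the engine in FP16 like this be the expected way to get C++ inference times closer to the Python numbers, or is something else likely going on?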
Ubuntu 18.04
Jetpack 4.2.2
TensorRT 5.1.1
CUDA 10.0.326
cuDNN 7.5.0