trt_pose not utilizing the GPU on Jetson TX2

Hello All,

We are developing an application that uses trt_pose, but the model is too slow: the most we could get was 15 fps on a Jetson TX2, even after applying nvpmodel and jetson_clocks. Looking at tegrastats, I also observed that the code is not utilizing all the cores, and I believe the GPU shows zero usage at runtime, even though it is enabled in the code. Could you please help me improve the performance? Thank you!
Attaching the tegrastats output for reference.
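
In case it helps with reading the log, here is a minimal sketch of how the GPU load can be pulled out of a tegrastats line. It assumes the GPU load is reported in the `GR3D_FREQ <n>%` field; the helper name `gpu_load_percent` and the sample line are made up for illustration and are not taken from the attached log.

```python
import re


def gpu_load_percent(line):
    """Return the GR3D_FREQ (GPU load) percentage from one tegrastats line.

    Returns None if the line has no GR3D_FREQ field.
    """
    match = re.search(r"GR3D_FREQ (\d+)%", line)
    return int(match.group(1)) if match else None


# Illustrative line only (hypothetical values), not copied from our device.
sample = (
    "RAM 2448/7846MB CPU [27%@2035,0%@2035,off,off,20%@2035,25%@2035] "
    "EMC_FREQ 0% GR3D_FREQ 0%@114"
)
print(gpu_load_percent(sample))
```

A value that stays at 0 while inference is running would match what I am seeing, i.e. the GPU sitting idle.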