Parallel processing in Python

I am using computer vision with trt_pose (GitHub: NVIDIA-AI-IOT/trt_pose, real-time pose estimation accelerated with NVIDIA TensorRT), running standalone at the edge on a Jetson Xavier NX mounted on a robot. The robot has some hobby servos and Dynamixel servos that the Jetson controls to move the robot based on the output of the trt_pose model.

The main loop can basically be broken into 3 parts:

  1. Grab the frame from GStreamer and preprocess it for the model.
  2. The model processes the frame (inference).
  3. Post-process the model’s output to move the robot in the real world.

I have timed each part and get:
part 1 is 0.006 sec, part 2 is 0.017 sec, and part 3 is 0.022 sec.
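
For reference, per-stage timings like these can be collected with a small helper around `time.perf_counter`. This is only a sketch: `time_stage` and `fake_stage` are hypothetical names, and `fake_stage` merely stands in for one of the three parts of the loop.

```python
import time

def time_stage(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

# Hypothetical stand-in for one stage of the loop; it just sleeps.
def fake_stage(delay):
    time.sleep(delay)
    return delay

result, elapsed = time_stage(fake_stage, 0.01)
print(f"stage took {elapsed:.4f} sec")
```

Averaging over many frames gives steadier numbers than timing a single iteration.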

Part 3 is the slowest part of the loop, yet computationally it is the lightest. Timing each step of part 3, I find that sending the data (a dozen bytes) to the Dynamixel servos via the USB port at a baud rate of 1 Mbps is what is holding up the process. I assume Python or the OS is blocking here, and everything else sits idle waiting on this step.

Part 1 and part 3 each run on a single CPU core, while part 2 runs on all 6 CPU cores and all GPU cores.

What I would like to do in Python is run part 3 in parallel with parts 1 and 2, i.e. once part 2 finishes, the loop goes straight back to part 1 while part 3 runs in parallel. What’s the best way to achieve this?


A simple implementation is to keep parts 1 and 2 on the main thread.
Once part 2 finishes, create a new thread to run part 3 on its output.

Since only the CPU is used, this can be implemented with a standard Python module, e.g. threading.
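
A minimal sketch of that idea follows. The three stage functions are hypothetical stand-ins (not trt_pose or Dynamixel SDK calls), and their sleeps simply mimic the timings reported in the question. Joining the previous thread before starting the next one is an added assumption, meant to keep servo commands in frame order.

```python
import threading
import time

# Hypothetical stand-ins for the three parts of the loop.
def grab_and_preprocess(i):
    time.sleep(0.006)          # part 1: capture + preprocessing
    return i

def infer(frame):
    time.sleep(0.017)          # part 2: model inference
    return frame

sent = []                      # records the order in which outputs were "sent"

def move_robot(output):
    time.sleep(0.022)          # part 3: mostly waiting on the serial write
    sent.append(output)

def run_loop(num_frames):
    worker = None
    for i in range(num_frames):
        frame = grab_and_preprocess(i)
        output = infer(frame)
        if worker is not None:
            worker.join()      # wait for the previous part 3 to finish
        worker = threading.Thread(target=move_robot, args=(output,))
        worker.start()         # part 3 now overlaps the next parts 1 and 2
    if worker is not None:
        worker.join()          # let the final part 3 complete

start = time.perf_counter()
run_loop(10)
elapsed = time.perf_counter() - start
print(f"10 frames in {elapsed:.3f} sec, sent order: {sent}")
```

This should help if part 3 really is spending its time blocked on I/O, because blocking I/O calls release Python's global interpreter lock and let the main thread carry on. If part 3 were CPU-bound Python code instead, threads would give little benefit.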

Thanks, this worked well.

It basically doubled my frame rates.
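
A rough doubling is what the timings from the question would predict. The estimate below is idealized: it assumes part 3 overlaps perfectly with parts 1 and 2 and that running it in a thread does not slow the other stages down.

```python
# Stage timings from the question, in seconds.
t1, t2, t3 = 0.006, 0.017, 0.022

serial = t1 + t2 + t3            # every stage waits for the previous one
overlapped = max(t1 + t2, t3)    # part 3 runs alongside parts 1 and 2

print(f"serial:     {1 / serial:.1f} fps")
print(f"overlapped: {1 / overlapped:.1f} fps")
print(f"speedup:    {serial / overlapped:.2f}x")
```

With these numbers the loop time drops from about 0.045 sec to about 0.023 sec per frame.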
