Pose Estimation Training and Inference

Here is the list of issues that we faced while training a PoseCNN in Isaac. Some were already solved by the NVIDIA team, and some remain open with suggested solutions.

Problem: The bounding box was always spawned in the center.


  • We have to use the non-target randomization.
  • When we don’t train with the Camera – Target randomization, the pose and segmentation are acceptable for objects at the center of the image but not for objects toward the sides.

**Problem: When changing the target randomization, the rotation estimation is not always accurate and is position dependent.**

We have trained with camera randomization and target randomization, and the pose is not being output properly. The camera ranges used in training are quite large, and since the rotation and translation regressors take the bounding box features as inputs, the space the regressors have to cover is also quite large.

Training parameters and configuration used:

Learning rate: 9e-4
Number of iterations: 7k (the losses saturated)
Camera randomization parameters: see attached image

Suggested Solution:

  • By changing the randomization midway through training, it can be shown that the loss jumps, since the camera and object positions are different. Therefore, it is suggested to restrict the camera randomization to a known operating range and use multi-sim training to cover the restricted range with as much data as possible.
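To make the idea of a restricted operating range concrete, here is a minimal sketch in plain Python. It is not Isaac code: the parameter names (`distance_m`, `azimuth_deg`, `elevation_deg`) and the numeric limits are hypothetical placeholders, and the real randomization would be configured in the simulator itself.

```python
import random

# Hypothetical operating range; names and limits are illustrative only
# and do not correspond to actual Isaac configuration keys.
CAMERA_RANGE = {
    "distance_m": (0.5, 1.0),
    "azimuth_deg": (-30.0, 30.0),
    "elevation_deg": (20.0, 60.0),
}


def sample_camera_pose(ranges, rng):
    """Draw one camera pose uniformly from the restricted ranges."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}


def sample_batch(ranges, count, seed=0):
    """Draw `count` poses with a fixed seed so runs are repeatable."""
    rng = random.Random(seed)
    return [sample_camera_pose(ranges, rng) for _ in range(count)]


if __name__ == "__main__":
    for pose in sample_batch(CAMERA_RANGE, 3):
        print(pose)
```

Keeping the limits in one place makes it easy to check that the training range matches the range the camera will actually see at inference time.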

Problem: Training time

Another big issue is that reaching 7k iterations takes around 2 to 3 days. The suspected bottleneck is the framerate.

Suggested Solution:

Multi-sim training.
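As a rough check on the training-time figures above (7k iterations in roughly two to three days), the following sketch converts them to seconds per iteration. The `ideal_days_with_sims` helper is a hypothetical best-case estimate that assumes the framerate is the only bottleneck and that throughput scales linearly with the number of simulators, which may not hold in practice.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_per_iteration(days, iterations):
    """Average wall-clock seconds spent per training iteration."""
    return days * SECONDS_PER_DAY / iterations


def ideal_days_with_sims(days, num_sims):
    """Best-case training time, assuming linear scaling with simulators."""
    return days / num_sims


if __name__ == "__main__":
    for days in (2, 3):
        rate = seconds_per_iteration(days, 7000)
        print(f"{days} days / 7000 iterations = {rate:.1f} s per iteration")
```

This works out to roughly 25 to 37 seconds per iteration, which is consistent with the training GPU waiting on data rather than computing.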

Question: Is there a way to remove the bounding box features from the rotation regressor? I think they might be a problem when the object is not in the center of the image. This could be addressed either by training with a lot of data, so that the regressor is trained extensively over the whole space, or by removing the bounding box features from the rotation regressor.
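To illustrate the two options in the question, here is a toy sketch. It assumes, purely for illustration, that the regressor input is a simple concatenation of image features and bounding box parameters; the function name, the feature values, and the `use_bbox` flag are all hypothetical and do not reflect the actual Isaac network definition.

```python
def build_regressor_input(image_features, bbox, use_bbox=True):
    """Build the input vector for a hypothetical rotation regressor.

    image_features: features computed from the cropped object.
    bbox: normalized (center_x, center_y, width, height).
    use_bbox: if False, the bounding box parameters are left out.
    """
    features = list(image_features)
    if use_bbox:
        features += list(bbox)
    return features


if __name__ == "__main__":
    crop = [0.1, 0.2, 0.3]            # same object appearance in both cases
    centered = (0.5, 0.5, 0.2, 0.3)   # object in the image center
    off_side = (0.9, 0.5, 0.2, 0.3)   # same object near the right edge

    # With bbox features the two inputs differ, so the regressor must be
    # trained over every image position it will see.
    print(build_regressor_input(crop, centered))
    print(build_regressor_input(crop, off_side))

    # Without bbox features the two inputs are identical.
    print(build_regressor_input(crop, off_side, use_bbox=False))
```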

Thank you.

Hi Mohamed,

What is your workstation GPU for training this model?


I have two Quadro RTX 6000 GPUs, but the training is using only one of them. Is it possible to use both?

I have a lot of downtime on the training GPU. I’m now using multi-sim training with an rx rate of about 80 images.