"task=FrankaCabinet": Unable to Complete the Task After 1500 Epochs of Training with the Official Default Configuration

Problem Description

Hello,
I am following the tutorial at the link below to reproduce the results of the FrankaCabinet task (task=FrankaCabinet):
https://github.com/isaac-sim/OmniIsaacGymEnvs?tab=readme-ov-file

When I attempted distributed training on 8 GPUs with the command below, the process terminated with an unexpected error.

PYTHON_PATH -m torch.distributed.run --nnodes=1 --nproc_per_node=8 scripts/rlgames_train.py headless=True task=FrankaCabinet multi_gpu=True

Therefore, I completed the training on a single RTX 3090 GPU with the following command, without modifying any other parameters:

PYTHON_PATH scripts/rlgames_train.py task=FrankaCabinet


I monitored the losses and rewards during training with TensorBoard:

PYTHON_PATH -m tensorboard.main --logdir runs/FrankaCabinet/summaries


Finally, I loaded the checkpoint from epoch 1500 to test whether the robotic arm could open the cabinet:

PYTHON_PATH scripts/rlgames_train.py task=FrankaCabinet checkpoint=runs/FrankaCabinet/nn/last_FrankaCabinet_ep_1500_rew__2214.7732_.pth test=True num_envs=16
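
In case it helps to compare the saved checkpoints rather than only the last one, here is a minimal sketch that lists the files in runs/FrankaCabinet/nn with the epoch and reward encoded in their names. The filename pattern is an assumption inferred from the single checkpoint name above, and parse_checkpoint_name and list_checkpoints are hypothetical helper names, not part of OmniIsaacGymEnvs or rl_games.

```python
import re
from pathlib import Path

# Assumed pattern, inferred from one filename in this post:
#   last_<task>_ep_<epoch>_rew__<reward>_.pth
# The optional minus sign and optional underscores are a guess.
CKPT_RE = re.compile(r"_ep_(\d+)_rew__?(-?\d+(?:\.\d+)?)_?\.pth$")


def parse_checkpoint_name(name):
    """Return (epoch, reward) parsed from a checkpoint filename, or None."""
    match = CKPT_RE.search(name)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


def list_checkpoints(nn_dir):
    """Return (epoch, reward, filename) tuples sorted by epoch.

    Files whose names do not match the assumed pattern are skipped.
    """
    rows = []
    for path in Path(nn_dir).glob("*.pth"):
        parsed = parse_checkpoint_name(path.name)
        if parsed is not None:
            rows.append((parsed[0], parsed[1], path.name))
    return sorted(rows)


if __name__ == "__main__":
    # Directory taken from the checkpoint path used in the test command.
    for epoch, reward, name in list_checkpoints("runs/FrankaCabinet/nn"):
        print(f"epoch={epoch:5d}  reward={reward:10.4f}  {name}")
```

Any filename printed by this script could then be passed to the checkpoint= argument of the test command above to see whether an earlier checkpoint behaves differently.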


image5

Unfortunately, the Franka arm performs far worse than in the official demo: the gripper closes prematurely, and the arm then moves to the position shown in the image above.

I would like to know what I should change in the setup so that the Franka arm can complete this task.

Thank you in advance for your help. Please let me know if additional information is needed.