During repeated tests, the first, second, and third "reward" values are different from the others

Hi

The first through third "reward" values are different from the others.
Is this a bug?

RunningMeanStd: (18,)
=> loading checkpoint ‘/home/sat001/wsp/isaac_3/IsaacGymEnvs-main/isaacgymenvs/runs/hor_1024_bs_65536_gpu_0/nn/hor_1024_bs_65536_gpu_0.pth’
reward: 252166624.0 steps: 510.0 ←➀
reward: 313811808.0 steps: 511.0 ←➁
reward: 313775776.0 steps: 511.0 ←➂
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
reward: 313775456.0 steps: 511.0
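
One thing worth noting about these numbers (an observation, not necessarily the cause): the rewards are large enough that float32 precision limits become visible. Around 3.1e8, adjacent representable float32 values are 32 apart, so genuinely different episode returns can print as identical totals. A quick NumPy check, independent of Isaac Gym:

```python
import numpy as np

r = np.float32(313775456.0)                  # the repeated reward value from the log
print(np.spacing(r))                         # 32.0: gap between adjacent float32 values here
print(np.float32(313775456.0 + 10.0) == r)   # True: differences under ~16 are rounded away
```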

Hi, is this also related to your other post regarding reward exceeding the max value?

Yes, that is exactly what I am asking about.

The length of the first episode is also different from the others.
I thought each test run was completely independent, but this behavior suggests otherwise.

“FrankaCabinet” shows the same behavior.

The seed is fixed to 42.
torch_deterministic: True
Command:
python train.py task=FrankaCabinet num_envs=1 test=True checkpoint=runs/FrankaCabinet/nn/FrankaCabinet.pth seed=42

PyTorch: 1.8.2
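
For reference, here is roughly what these two settings control in IsaacGymEnvs. This is a sketch of the usual set_seed helper, not the verbatim source, and details may differ between versions:

```python
import os
import random

import numpy as np
import torch

def set_seed(seed, torch_deterministic=False):
    # Seed every RNG the training stack touches (sketch, not verbatim source).
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    if torch_deterministic:
        # Restrict PyTorch to deterministic kernels where available.
        os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':4096:8'
        torch.backends.cudnn.benchmark = False
        torch.backends.cudnn.deterministic = True
        torch.use_deterministic_algorithms(True)
    return seed
```

Note that this pins the Python/NumPy/PyTorch RNGs and kernel selection; whether the GPU physics simulation itself replays bit-identically is a separate question.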

Please check.

RunningMeanStd: (23,)
=> loading checkpoint ‘/home/sa/wsp/isaac_3/IsaacGymEnvs-main/isaacgymenvs/runs/FrankaCabinet/nn/FrankaCabinet.pth’
reward: 1389.047607421875 steps: 498.0 ←499.0?
reward: 1384.9691162109375 steps: 499.0
reward: 1390.197998046875 steps: 499.0
reward: 1388.5855712890625 steps: 499.0
reward: 1384.7918701171875 steps: 499.0
reward: 1384.7052001953125 steps: 499.0
reward: 1380.9183349609375 steps: 499.0
reward: 1385.69287109375 steps: 499.0
reward: 1396.2381591796875 steps: 499.0
reward: 1382.2991943359375 steps: 499.0
reward: 1383.98681640625 steps: 499.0
reward: 1393.5643310546875 steps: 499.0
reward: 1387.8167724609375 steps: 499.0
reward: 1390.0850830078125 steps: 499.0
reward: 1395.5955810546875 steps: 499.0
reward: 1379.6478271484375 steps: 499.0
reward: 1393.7684326171875 steps: 499.0
reward: 1393.6485595703125 steps: 499.0
reward: 1390.3907470703125 steps: 499.0
reward: 1391.0640869140625 steps: 499.0
reward: 1381.163818359375 steps: 499.0
reward: 1384.6175537109375 steps: 499.0
reward: 1389.2071533203125 steps: 499.0
reward: 1383.3948974609375 steps: 499.0
reward: 1395.4481201171875 steps: 499.0
reward: 1381.5181884765625 steps: 499.0
reward: 1392.1209716796875 steps: 499.0
reward: 1376.385009765625 steps: 499.0
reward: 1376.314453125 steps: 499.0
reward: 1377.595703125 steps: 499.0
reward: 1379.409912109375 steps: 499.0
reward: 1390.7479248046875 steps: 499.0
reward: 1379.56982421875 steps: 499.0
reward: 1381.3255615234375 steps: 499.0
reward: 1381.1376953125 steps: 499.0
reward: 1376.1226806640625 steps: 499.0

Where are you tracking the steps count?

By “steps count”, do you mean the “steps” value printed above?
Is that wrong?
I don’t quite understand the intent of your question, so I’m not sure whether this “steps count” is what you are referring to.

Yes, sorry. I didn’t see the steps being printed when I ran training, but I realized they appear during inference. I believe this is coming from the rl_games side, though. We don’t specifically control the number of steps per iteration other than defining it in the configs.
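
For context, the loop printing those lines looks roughly like this. This is a simplified sketch of an rl_games-style evaluation loop; env, model, and get_action are illustrative stand-ins, not the exact rl_games API:

```python
import torch

def play_episodes(env, model, n_games):
    # Simplified sketch of an rl_games-style inference loop (names illustrative).
    for _ in range(n_games):
        obs = env.reset()
        total_reward, steps, done = 0.0, 0, False
        while not done:
            with torch.no_grad():
                action = model.get_action(obs)  # policy action from the loaded checkpoint
            obs, reward, done, info = env.step(action)
            total_reward += float(reward)
            steps += 1
        # One line per finished episode, matching the logs above.
        print(f'reward: {total_reward} steps: {steps}')
```

In other words, “steps” is just the number of env.step() calls counted per episode at inference time; the constant 511/499 values are consistent with each task resetting after its configured maximum episode length.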

I see.