I have 2 questions:
1- I tried to change the number of epochs in the jetbot_train.py file, but the model still runs for more epochs than the value I specified for n_epochs.
2- When I run the model in regular mode, the “value_loss” goes down, but when I run it in headless mode, the “value_loss” increases. Why?
"--total_steps",
help="the total number of steps before exiting and saving a final checkpoint",
It’s total_steps that controls how long the training runs.
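As a rough sketch of why changing n_epochs does not shorten the run: assuming jetbot_train.py trains with stable-baselines3's PPO (an assumption; the thread only shows its CheckpointCallback), n_epochs sets how many optimization passes are made over each collected rollout, while the total step budget sets how many rollouts are collected. The default of 500000 and the n_steps value of 2048 below are hypothetical placeholders, not values taken from the script.

```python
import argparse
import math


def build_parser():
    """Minimal stand-in for the script's argument parser (illustrative only)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--total_steps",
        type=int,
        default=500000,  # hypothetical default, not taken from jetbot_train.py
        help="the total number of steps before exiting and saving a final checkpoint",
    )
    return parser


def ppo_iterations(total_steps, n_steps, n_envs=1):
    """Number of rollout/update iterations needed to reach total_steps.

    Each iteration collects n_steps * n_envs environment steps, so the
    count depends only on the step budget, not on n_epochs.
    """
    return math.ceil(total_steps / (n_steps * n_envs))


def gradient_passes(total_steps, n_steps, n_epochs, n_envs=1):
    """n_epochs only multiplies the optimization work done per iteration."""
    return ppo_iterations(total_steps, n_steps, n_envs) * n_epochs


if __name__ == "__main__":
    args = build_parser().parse_args(["--total_steps", "50000"])
    print(ppo_iterations(args.total_steps, n_steps=2048))
    print(gradient_passes(args.total_steps, n_steps=2048, n_epochs=10))
```

Under these assumptions, lowering n_epochs reduces the work per update but leaves the number of iterations unchanged; only a smaller --total_steps ends the run earlier.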
How long was your training? You can plot the results with TensorBoard, as described in this section of the docs, and watch the reward: https://docs.omniverse.nvidia.com/app_isaacsim/app_isaacsim/sample_training_rl.html#jetracer-lane-following-sample
The reward/loss goes up and down as training progresses, as you can see in the TensorBoard plots. It usually takes more than 50k steps to start getting good results.
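Because the raw curves are noisy, it can help to look at a smoothed trend rather than individual points. The following is a minimal sketch of a plain trailing moving average over logged values; it is not the smoothing that TensorBoard itself applies, and the function name and sample numbers are made up for illustration.

```python
def moving_average(values, window):
    """Trailing moving average; early points average over what is available."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just fell out of the window.
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed


if __name__ == "__main__":
    # Made-up loss values that bounce around while trending down.
    value_loss = [5.0, 7.0, 4.0, 6.0, 3.0, 5.0, 2.0, 4.0]
    print(moving_average(value_loss, window=4))
```

Comparing the smoothed curves from the regular and headless runs over the same step range gives a fairer picture than comparing a few individual readings.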
Thank you for your reply!
I am running the sample on this device: Alienware Aurora R9
Using the default value for “--total_steps”, it took 2 days to reach 30 iterations, and after that it crashed.
I ran the code again after changing “--total_steps” to 50000. It took almost 1 hour to run, and after reaching 50000 steps it terminated with the terminal output below.
I have no idea whether the training finished normally or not.
If it is normal, how can I run the sample now?
I mean, if it’s trained now, how can I see the output? How can I use this trained model?
It saves out models frequently in the params folder.
checkpoint_callback = CheckpointCallback(save_freq=args.save_freq, save_path="./params/", name_prefix="rl_model")
It’s mentioned here.
There is a lot of useful information on that page, including how to evaluate a trained model, how to continue training, etc. You can let it train overnight once it sta
Can you please check the provided link? It doesn’t work for me!
Sorry, I edited the link: https://docs.omniverse.nvidia.com/app_isaacsim/app_isaacsim/sample_training_rl.html#evaluate-trained-models
I usually let it train overnight once I’ve verified that everything works to my liking.
Thank you for your reply.
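Since checkpoints are written to ./params/ with the rl_model prefix, picking the newest one for evaluation can be sketched as below. This assumes stable-baselines3's usual CheckpointCallback naming of `<name_prefix>_<num_timesteps>_steps.zip`; the helper name latest_checkpoint is hypothetical and not part of the sample.

```python
import re
from pathlib import Path

# Assumed file naming: rl_model_<num_timesteps>_steps.zip
CHECKPOINT_PATTERN = re.compile(r"^rl_model_(\d+)_steps\.zip$")


def latest_checkpoint(params_dir="./params"):
    """Return the checkpoint path with the highest step count, or None."""
    best_steps, best_path = -1, None
    for path in Path(params_dir).glob("rl_model_*_steps.zip"):
        match = CHECKPOINT_PATTERN.match(path.name)
        if match is None:
            continue
        steps = int(match.group(1))
        # Compare numerically so 50000 ranks above 9000.
        if steps > best_steps:
            best_steps, best_path = steps, path
    return best_path


if __name__ == "__main__":
    checkpoint = latest_checkpoint()
    if checkpoint is None:
        print("No checkpoints found in ./params")
    else:
        print(f"Newest checkpoint: {checkpoint}")
```

The returned path is what you would hand to the evaluation step described on the linked docs page; if the run crashed partway, the most recent checkpoint is still usable as a starting point.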
I have 2 questions:
1- How can I save synthetic data?
2- Is it possible to use another deep learning model for training instead of a CNN, for example Faster R-CNN?
- https://docs.omniverse.nvidia.com/app_isaacsim/app_isaacsim/sample_syntheticdata.html#synthetic-data
- https://docs.omniverse.nvidia.com/app_isaacsim/app_isaacsim/sample_dofbot.html#training-the-cube-detection-using-isaac-sim-s-synthetic-data-pipeline
You can search for rcnn in the code base. The online_generation sample uses torchvision.models.detection.maskrcnn_resnet50_fpn.
The dofbot sample uses torchvision.models.detection.fasterrcnn_mobilenet_v3_large_fpn.
Are the steps in the doc not clear? Did you try running those?
https://isaac.gitlab-master-pages.nvidia.com/omni_isaac_sim/app_isaacsim/app_isaacsim/sample_dofbot.html#training-the-cube-detection-using-isaac-sim-s-synthetic-data-pipeline
Thank you.
I have a question about robot navigation. It is not related to training a model.
Where can I find a guide to robot navigation from scratch in Isaac Sim? I found some tutorials on Isaac stack navigation, but I don’t know how to apply them in Isaac Sim.
If you search “navigation” in the docs, you’ll see all these ROS/ROS2 navigation guides as well.
https://docs.omniverse.nvidia.com/app_isaacsim/app_isaacsim/sample_ros_nav.html
https://docs.omniverse.nvidia.com/app_isaacsim/app_isaacsim/sample_ros2_nav.html?highlight=navigation
I got the same error. Have you found a solution?