Yolo Training Pipeline Simulation Error

Hi, I am trying to run the YOLO training example to train the YOLO network to recognise bananas in addition to the default actors, but I have encountered an error in the simulation/randomisation of the Carter robot:

The simulation starts correctly and begins domain randomisation immediately. Carter spawns at pose [1, 0, 0, 0, -13.00, 58.00, 0.92], so the robot spawns in between the racking, outside the object-heavy zone.

My problem begins when Carter teleports to its first random pose: some kind of physics glitch throws Carter around the map, leaving it in a position unsuitable for training (usually facing the floor or ceiling, and sometimes outside the map).

My camera_teleportation configuration is as follows:

    "camera_teleportation": {
      "isaac.ml.Teleportation": {
        "interval": 2.0,
        "name": "carter_1",
        "min": [-25.8, 60.2, 0.2],
        "max": [4.8, 65.2, 0.5],
        "min_yaw": -3.14,
        "max_yaw": 3.14,
        "enable_translation_x": true,
        "enable_translation_y": true,
        "enable_translation_z": true,
        "enable_yaw": true,
        "tick_period": "10.0"
      }
    },
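
As a quick sanity check on the numbers above, the spawn pose can be compared against the teleportation box. This is only a sketch: it assumes the last three values of the pose are the x, y, z translation (with the first four being a quaternion), and the helper name inside_box is made up for illustration. It does not diagnose the physics glitch itself.

```python
# Values copied from the post; the pose layout [qw, qx, qy, qz, x, y, z] is an assumption.
SPAWN_POSE = [1, 0, 0, 0, -13.00, 58.00, 0.92]
BOX_MIN = [-25.8, 60.2, 0.2]
BOX_MAX = [4.8, 65.2, 0.5]


def inside_box(point, box_min, box_max):
    """Return one boolean per axis: True if that coordinate lies within [min, max]."""
    return [lo <= p <= hi for p, lo, hi in zip(point, box_min, box_max)]


spawn_xyz = SPAWN_POSE[4:]
per_axis = inside_box(spawn_xyz, BOX_MIN, BOX_MAX)
for axis, ok in zip("xyz", per_axis):
    print(axis, "inside" if ok else "outside")
```

Under that assumption, the spawn point lies outside the box on y and z, so the first teleport always moves the robot sideways and to a lower height.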

I have modified the interval to experiment with different timings.

Another issue I am experiencing with this simulation is that the camera position on the carter robot appears to be behind the mesh of the robot itself, blocking the camera.

I have some screenshots of this camera position but I couldn’t figure out how to attach the screenshot to my topic post.

I have managed to solve both of these issues. I start the Unreal simulation, which causes the Carter robot to spawn and domain randomisation to start, then select the Carter robot and check the box “actor hidden in game” under “rendering”. With that box checked, both issues are resolved.

Hi,

Thanks for the update, and glad you figured it out.

Liila

I was able to train the network successfully, but after the initial training I am unable to run the training script again.

After testing the training script I added a few more items to the list of classes and attempted to retrain the network, but I was met with the following error:

TypeError: load_weights() got an unexpected keyword argument 'skip_mismatch'

Full trace:

Create YOLOv3 model with 9 anchors and 7 classes.
Traceback (most recent call last):
  File "//.cache/bazel/_bazel_blacktooth/52d7e7281bb5403fe0ecfd37a01f1b8b/execroot/isaac/bazel-out/k8-opt/bin/apps/samples/yolo/yolo_training.runfiles/isaac/apps/samples/yolo/keras-yolo3/yolo_training.py", line 357, in <module>
    _main()
  File "//.cache/bazel/_bazel_blacktooth/52d7e7281bb5403fe0ecfd37a01f1b8b/execroot/isaac/bazel-out/k8-opt/bin/apps/samples/yolo/yolo_training.runfiles/isaac/apps/samples/yolo/keras-yolo3/yolo_training.py", line 93, in _main
    weights_path='…/yolo_pretrained_models/yolo.h5')
  File "//.cache/bazel/_bazel_blacktooth/52d7e7281bb5403fe0ecfd37a01f1b8b/execroot/isaac/bazel-out/k8-opt/bin/apps/samples/yolo/yolo_training.runfiles/isaac/apps/samples/yolo/keras-yolo3/yolo_training.py", line 243, in create_model
    model_body.load_weights(weights_path, by_name=True, skip_mismatch=True)
TypeError: load_weights() got an unexpected keyword argument 'skip_mismatch'

A little bit of googling suggests that the YOLOv3 model was written against Keras 2.1.5 and that any other version is incompatible.
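
A check along the following lines can show whether the load_weights method that the script actually ends up calling accepts skip_mismatch. This is a minimal sketch, not a fix: accepts_keyword and load_weights_compat are hypothetical helper names, and OldModel and NewModel are stand-ins so the example runs without Keras installed. With a real model, the same helpers would be called on model_body.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        return True
    # A **kwargs parameter would accept any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def load_weights_compat(model, weights_path):
    """Call model.load_weights, passing skip_mismatch only when it is supported."""
    kwargs = {"by_name": True}
    if accepts_keyword(model.load_weights, "skip_mismatch"):
        kwargs["skip_mismatch"] = True
    return model.load_weights(weights_path, **kwargs)


# Stand-in classes (hypothetical) that only echo the arguments they receive.
class OldModel:
    def load_weights(self, filepath, by_name=False):
        return {"by_name": by_name}


class NewModel:
    def load_weights(self, filepath, by_name=False, skip_mismatch=False):
        return {"by_name": by_name, "skip_mismatch": skip_mismatch}


print(load_weights_compat(OldModel(), "weights.h5"))
print(load_weights_compat(NewModel(), "weights.h5"))
```

Dropping the argument on an older Keras may simply trade this TypeError for a shape-mismatch error once the number of classes changes, so the check is mainly useful as a diagnostic.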

I have tried running the script within a few venvs with different versions of Keras and TensorFlow, but without success, and I am struggling to find the Bazel build configuration that defines the pip package requirements.
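
To see which Keras and TensorFlow the training script really picks up (the venv copy or some other installation), a few lines like these could be placed temporarily near the top of yolo_training.py. This is a sketch; describe_module is a hypothetical helper name, and it relies only on the standard library.

```python
import importlib


def describe_module(name):
    """Return (version, file path) for an importable module, or None if the import fails."""
    try:
        module = importlib.import_module(name)
    except Exception:
        # Treat any import-time failure as "not usable" for this diagnostic.
        return None
    return (getattr(module, "__version__", "unknown"),
            getattr(module, "__file__", "built-in"))


for name in ("keras", "tensorflow"):
    print(name, describe_module(name))
```

If the printed path points outside the active venv, the Bazel-launched script is not using the packages installed there.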

Where can I find the pip package requirements for the yolo package and/or where can I define a version number for keras/tensorflow within bazel’s build requirements?