Convert Dataset to TFRecords for TAO


So, the tfrecord generation has already finished. You can find the tfrecord files under /workspace/tao-experiments/local/training/tao/detectnet_v2/resnet18_palletjack/5k_model_synthetic when you log in to the docker container via

$ docker run -it --rm --gpus all -v $LOCAL_PROJECT_DIR:/workspace/tao-experiments $DOCKER_CONTAINER run ls /workspace/tao-experiments/local/training/tao/detectnet_v2/resnet18_palletjack/5k_model_synthetic

Or you can find them in your local path:

$ ls $LOCAL_PROJECT_DIR
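If you'd rather check programmatically that the shards were produced, a small sketch along these lines works; `count_tfrecord_shards` is a hypothetical helper, and the path is the same output directory mentioned above:

```python
import glob
import os

def count_tfrecord_shards(results_dir):
    """Return the sorted list of files in the tfrecord output directory.
    (Hypothetical helper for a quick sanity check.)"""
    return sorted(glob.glob(os.path.join(results_dir, "*")))

# Assumes LOCAL_PROJECT_DIR is set the same way as in the commands above
results_dir = os.path.join(
    os.environ.get("LOCAL_PROJECT_DIR", "."),
    "local/training/tao/detectnet_v2/resnet18_palletjack/5k_model_synthetic",
)
shards = count_tfrecord_shards(results_dir)
print(f"Found {len(shards)} file(s) in {results_dir}")
```

An empty result here means the dataset_convert step did not write where you expected.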

There were some issues with the code in step 7, "Visualize Model Performance"; that's why it was throwing errors.

I've corrected the code and added some comments explaining what was wrong:

from IPython.display import Image, display
import glob
import os

# results_dir = os.path.join(os.environ["LOCAL_PROJECT_DIR"], "local/training/tao/detectnet_v2/resnet18_palletjack/test_loco/images_annotated")
# The original code tried to fetch the inference result files from
# "local/training/tao/detectnet_v2/resnet18_palletjack/test_loco/images_annotated",
# but the inference step actually saves them to its "-o" output path:
# "/workspace/tao-experiments/local/training/tao/detectnet_v2/resnet18_palletjack/5k_model_synthetic"

# 1. Point results_dir to the actual inference output
results_dir = os.path.join(
    os.environ["LOCAL_PROJECT_DIR"],
    "local/training/tao/detectnet_v2/resnet18_palletjack/5k_model_synthetic/images_annotated"
)

# pil_img = Image(filename=os.path.join(os.getenv("LOCAL_PROJECT_DIR"), 'detectnet_v2/july_resnet18_trials/new_pellet_distractors_10k/test_loco/images_annotated/1564562568.298206.jpg'))

# 2. pil_img is not used anywhere in the code, so it can be removed;
# if you do want it, build it from results_dir instead:
# pil_img = Image(filename=os.path.join(results_dir, "1564562568.298206.jpg"))

# image_names = ["1564562568.298206.jpg", "1564562628.517229.jpg", "1564562843.0618184.jpg", "593768,3659.jpg", "516447400,977.jpg"]
# These hard-coded names do not match the files that were actually created.

# 3. Discover all image files (jpg, jpeg, png)
image_names = sorted(
    [os.path.basename(p) for p in glob.glob(os.path.join(results_dir, "*.jpg"))]
    + [os.path.basename(p) for p in glob.glob(os.path.join(results_dir, "*.jpeg"))]
    + [os.path.basename(p) for p in glob.glob(os.path.join(results_dir, "*.png"))]
)

images = [Image(filename=os.path.join(results_dir, image_name)) for image_name in image_names]

print(f"Found {len(image_names)} images")

display(*images)
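If the annotated folder contains many images, displaying all of them can make the notebook sluggish. One option (a sketch, using a hypothetical `select_images` helper) is to cap how many are shown:

```python
def select_images(image_names, limit=10):
    """Return up to `limit` annotated image names, sorted.
    (Hypothetical helper; adjust `limit` to taste.)"""
    exts = (".jpg", ".jpeg", ".png")
    return sorted(n for n in image_names if n.lower().endswith(exts))[:limit]

# Example: keep only the first 2 of the discovered files
print(select_images(["b.png", "a.jpg", "c.txt", "d.jpeg"], limit=2))
# → ['a.jpg', 'b.png']
```

In the cell above, `image_names` could then be passed through `select_images(image_names)` before building the `images` list.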

Great. Please share the info with the GitHub owner, since it is not an official release from TAO.

I created this repo correcting all the errors from the original documentation and providing a more user-friendly step-by-step guide on how to run this model for those who are interested:


Great. Really appreciate your work.

Here is an updated version for the entire pipeline: End-to-End-Pipeline-for-Robotics-ML-Training-with-Synthetic-Data-on-Nvidia-Isaac-SIM/README.md at main · marcelpatrick/End-to-End-Pipeline-for-Robotics-ML-Training-with-Synthetic-Data-on-Nvidia-Isaac-SIM · GitHub

Thanks for the info. Appreciate it.