Train custom Keras model using BYOM

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc)
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) Classification TF2
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here) v5.0.0
• Training spec file(If have, please share here) classification_tf2/tao_byom/specs/spec.yaml
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)

How do I train my custom Keras model using NVIDIA TAO 5.0?

The BYOM model converter only converts models from torchvision and timm, whereas I have defined my own custom model. I couldn’t see a way to convert a .hdf5 file directly to .tltb using the converter.

I would prefer not to go via ONNX because it seems unnecessary, but if there is a way to convert the .hdf5 file so that TAO accepts it, please let me know.

This is the pipeline I was hoping to follow:

  1. Write my own model.py using Keras and then compile and save it as model.hdf5 [None of this is in TAO] I have attached the model here.
    my_model.hdf5 (15.2 MB)

  2. Using this model.hdf5 and encode.eff.py, I will convert the model into model.tltb or model.tlt [I’m not sure what the difference is].

  3. I will then follow the BYOM notebook and hopefully my model is correctly loaded and trained.

But that’s not happening; there’s an error when running the training step. Here’s the error

In TAO 5.0.0, BYOM with TF1 (Classification and UNet) has been deprecated because the source code of TAO Toolkit is now fully open-sourced. To use BYOM with TF1, you will need to continue using TAO 4.0.

Classification TF2 still supports BYOM with the same workflow as TAO 4.0. If you wish to bring your own model weights in TAO 5.0.0, you can directly modify the source code to load the weights.

BYOM is a Python-based package that converts any open-source ONNX model to a TAO-compatible model.
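Given that the converter takes ONNX as input, one possible route for a custom Keras model (my assumption, not something confirmed in this thread) would be to export the .hdf5 to ONNX first, e.g. with tf2onnx, and then run it through the BYOM converter. The exact flags below are from my recollection of the two tools and should be checked against `python -m tf2onnx.convert --help` and `tao_byom --help` before use.

```shell
# Assumption: tf2onnx is installed in the TF2 environment.
# Step 1: export the Keras .hdf5 model to ONNX.
python -m tf2onnx.convert --keras my_model.hdf5 --output my_model.onnx

# Step 2: convert the ONNX model to a TAO-compatible .tltb.
# These tao_byom flags are hypothetical; verify with `tao_byom --help`.
tao_byom -m my_model.onnx -r results/ -n my_model -k nvidia_tlt
```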

Converting a .hdf5 file to .tltb is not supported.

We suggest directly modifying the source code to load your own model weights.

But it’s not just model weights, it’s also model structure. Can I modify the source code to load a .hdf5 even if the model structure isn’t supported?

You can modify code to support your own model architecture.

Just so I understand correctly, that would mean a lot of changes, right? Add a backbone, change the list of accepted model architectures, and then add a function to load the model.

Is there a plan to add a low-code version of this Keras-to-.tltb conversion, or do I have to rely on changing the source code throughout?

Yes, users can make modifications to meet custom requirements based on the open-sourced code.

How do I run training and evaluation if I pull the git repository locally and make changes to the source code? Do I instantiate the docker and then run the Jupyter notebook inside it?

OR

Can I just launch the notebooks from my conda env without the docker? My issue is: how do I make sure tao train uses the local version of the files in tao_tensorflow2_backend?

You can log in to the docker via
$ docker run --runtime=nvidia -it --rm docker_name /bin/bash
Then find the original file. For example, if you are going to modify train.py:
$ find /usr | grep train.py

Back it up, and then copy the modified version of train.py over it.
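The backup-and-replace step above can be sketched end to end. The script below demonstrates it on stand-in files in a temp directory, since the real path of train.py inside the container (the one `find` reports) will vary.

```shell
# Sketch of the backup-and-replace workflow on stand-in files.
# In the container, ORIG would be the path reported by `find /usr | grep train.py`
# and MODIFIED would be your edited copy of train.py.
set -e
WORKDIR=$(mktemp -d)
ORIG="$WORKDIR/train.py"          # stand-in for the installed train.py
MODIFIED="$WORKDIR/train_new.py"  # stand-in for your modified version
echo "original" > "$ORIG"
echo "modified" > "$MODIFIED"

cp "$ORIG" "$ORIG.bak"    # back up the original first
cp "$MODIFIED" "$ORIG"    # replace it with the modified version
cat "$ORIG"               # prints "modified"
```

Keeping the `.bak` copy means you can always restore the stock behavior by copying it back.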

Thanks for the quick reply

So the confusion I have is that I want to modify a file in https://github.com/NVIDIA/tao_tensorflow2_backend/blob/main/nvidia_tao_tf2/cv/classification/model/model_builder.py

and then run the classification tf2 notebook in a conda env. My real question is how I access tao_tensorflow2_backend through the notebook.

Or, vice versa, if I modify https://github.com/NVIDIA/tao_tensorflow2_backend/blob/main/nvidia_tao_tf2/cv/classification/model/model_builder.py , how does that connect to running the classification tf2 notebook?
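One way a local checkout could connect to the notebook (my assumption, not an official TAO workflow) is to bind-mount the repository's package directory over the installed copy inside the container, so the notebook imports your edited files instead of the shipped ones. The in-container path below is hypothetical; confirm the real one first with `find`.

```shell
# Sketch under assumptions: the installed package path inside the container is
# hypothetical -- locate the real one first with, e.g.:
#   find /usr -name model_builder.py
docker run --runtime=nvidia -it --rm -p 8888:8888 \
  -v "$PWD/tao_tensorflow2_backend/nvidia_tao_tf2:/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf2" \
  nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf2.11.0 /bin/bash
```

With a bind mount, edits made on the host take effect in the container without copying files back and forth.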

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks

You can launch the docker as follows.
$ docker run --runtime=nvidia -it -p 8888:8888 --rm nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf2.11.0 /bin/bash

After making your modifications, launch the notebook.
root@851cd3d21645:/opt/nvidia# jupyter notebook --ip 0.0.0.0 --allow-root

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.