Please provide the following information when requesting support.
• Hardware (T4/V100/Xavier/Nano/etc)
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) Classification TF2
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here) v5.0.0
• Training spec file (if you have one, please share it here) classification_tf2/tao_byom/specs/spec.yaml
• How to reproduce the issue? (This is for errors. Please share the command line and the detailed log here.)
How do I train my custom Keras model using NVIDIA TAO 5.0?
The BYOM model converter only converts models from torchvision and timm, whereas I have defined my own custom model. I couldn’t see a way to convert a .hdf5 file directly to .tltb using the converter.
I would prefer not to go via ONNX because it’s unnecessary, but if there is a way to convert a .hdf5 file such that it can be accepted by TAO, please let me know.
This is the pipeline I was hoping to follow:
Write my own model.py using Keras, then compile and save it as model.hdf5 [none of this is in TAO]. I have attached the model here: my_model.hdf5 (15.2 MB)
Using this model.hdf5 and encode.eff.py, I will convert the model into model.tltb or model.tlt [I’m not sure what the difference is].
I will then follow the BYOM notebook and hopefully my model is correctly loaded and trained.
But that’s not happening; the training step fails with an error. Here’s the error:
In TAO 5.0.0, BYOM with TF1 (Classification and UNet) has been deprecated because the source code of TAO Toolkit is now fully open-sourced. To use BYOM with TF1, you will need to continue using TAO 4.0.
Classification TF2 still supports BYOM with the same workflow as TAO 4.0. If you wish to bring your own model weights in TAO 5.0.0, you can directly modify the source code to load the weights.
BYOM is a Python-based package that converts any open-source ONNX model to a TAO-compatible model.
Converting a .hdf5 file to .tltb is not supported.
We suggest you directly modify the source code to load your own model weights.
Just so I understand correctly, that would mean a lot of changes, right? Add a backbone, change the list of accepted model architectures, and then add a function to load the model.
Is there a plan to add a low-code way to do this Keras-to-.tltb conversion, or do I have to rely on changing the source code throughout?
How do I run training and evaluation if I pull the git repository locally and make changes to the source code? Do I instantiate the docker container and then run the Jupyter notebook inside it?
OR
Can I just launch the notebooks from my conda env without the docker? My issue is: how do I make sure tao train uses the local version of the files available in tao_tensorflow2_backend?
You can log in to the docker container via
$ docker run --runtime=nvidia -it --rm docker_name /bin/bash
Then find the original file. For example, if you are going to modify a train.py:
$ find /usr | grep train.py
Back it up, then copy the modified version of train.py over it.
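The backup-and-replace step above can be sketched as follows. Note this is only an illustration run in a temporary directory: the real train.py lives somewhere under /usr inside the TAO container, and its exact path varies by release, so locate it with the find command first and substitute that path.

```shell
# Illustrative backup-and-replace workflow (run inside the container).
# $workdir stands in for the directory that `find /usr | grep train.py` reports.
workdir=$(mktemp -d)
echo "original train loop" > "$workdir/train.py"    # stand-in for the shipped train.py

cp "$workdir/train.py" "$workdir/train.py.bak"      # keep a backup of the original
echo "modified train loop" > "$workdir/train.py"    # overwrite with your edited version

cat "$workdir/train.py"                             # verify the replacement took effect
```

Keeping the `.bak` copy lets you restore the stock behavior without re-pulling the container, since changes made this way are lost when the container is removed (`--rm`) unless you commit the image.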
There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.