TAO Toolkit encapsulated DNN subtasks

Hi all, I would like to inspect the code for the TAO Toolkit encapsulated DNN subtasks such as train, prune, evaluate, export, etc. I tried accessing the TAO container nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.08-py3 directly, but I can't find the scripts for those subtasks. I would like to understand the pipeline functions behind the encapsulated commands.

Sorry, the TAO toolkit is not open source.

Hi @Morganh, understood. What other method can I use to understand the training subtasks in terms of data loading, memory copies, model loading, network training, model saving, active kernels, etc.? I tried running the training step under NVIDIA Nsight Systems as !nsys profile tao retinanet train..., but the profiling report only shows this:

I also tried profiling with NVIDIA Nsight Compute as /usr/local/NVIDIA-Nsight-Compute/ncu --export nv_compute_test --target-processes all tao retinanet train, but got the message below after the training completed:

2021-10-20 23:04:57,433 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
==PROF== Target process 27608 terminated before first instrumented API call.
==PROF== Target process 27610 terminated before first instrumented API call.
==WARNING== No kernels were profiled.

The tools mentioned above are not compatible with TLT/TAO. TAO Toolkit is a Python package hosted on the NVIDIA Python Package Index. It is used with NVIDIA pre-trained models to create custom Computer Vision (CV) and Conversational AI models with the user's own data. Training AI models using TAO Toolkit does not require expertise in AI or deep learning. A simplified Command Line Interface (CLI) abstracts away AI framework complexity, enabling users to build production-quality AI models using a simple spec file and one of the NVIDIA pre-trained models.

Hi @Morganh, thanks for your detailed explanation about how TAO Toolkit works. I still need to profile the training part of it. Is there another profiling tool compatible with TLT/TAO?

Currently, TLT/TAO does not implement any profiling or MLOps tools.
