We have trained models using the TAO classification and detection pipelines. For our use case we want to deploy the models to Azure and AWS functions and run inference on the CPU.
Before working with TAO, we would simply have used Keras or TF for training, loaded the saved model from the Azure/AWS function runtime, and done inference via a lightweight Python script.
How can we accomplish the same thing (CPU inference, a minimal-dependency Python script that we can host in the AWS/Azure function runtime) when we start from a .tlt model?
First, export the .tlt model to an .etlt model.
Then run tao-converter to convert the .etlt model into a TensorRT engine (.engine or .trt file).
Finally, run inference with DeepStream or your own standalone script.
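As a rough sketch of the two conversion steps above (the model paths, the `$KEY` value, the input dimensions, and the output node name are all assumptions here; they must match your own training spec and export log, and the exact commands may differ between TAO versions):

```shell
# Step 1: export the trained .tlt model to .etlt (run inside the TAO container).
# $KEY is the encryption key used during training; paths are placeholders.
tao classification export \
    -m /workspace/output/weights/final_model.tlt \
    -k $KEY \
    -o /workspace/export/final_model.etlt

# Step 2: build a TensorRT engine from the .etlt on the deployment machine.
# -d gives the input dims (C,H,W) and -o the output node name; both values
# below are assumptions -- check your model's spec file and the export log.
tao-converter /workspace/export/final_model.etlt \
    -k $KEY \
    -d 3,224,224 \
    -o predictions/Softmax \
    -e /workspace/export/final_model.engine
```

Note that the TensorRT engine is built for the specific GPU and TensorRT version of the machine it is generated on, so step 2 should be run on the deployment target.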
For DeepStream, refer to the Image Classification — Transfer Learning Toolkit 3.0 documentation.
For other kinds of inference, you can also refer to tao-toolkit-triton-apps/tao_triton/python at main · NVIDIA-AI-IOT/tao-toolkit-triton-apps · GitHub and tao-toolkit-triton-apps/classification_postprocessor.py at main · NVIDIA-AI-IOT/tao-toolkit-triton-apps · GitHub.
To run inference against a classification TensorRT engine, you can also search this TAO forum for relevant scripts.
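Whichever runtime produces the raw output tensor, the classification postprocessing itself is framework-free and easy to write yourself. A minimal sketch in pure Python (the label list and the assumption that the engine emits one logit per class are hypothetical; adapt them to your model):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probs, labels, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Example with a hypothetical 4-class model output:
labels = ["cat", "dog", "bird", "fish"]
probs = softmax([2.0, 1.0, 0.1, -1.0])
print(top_k(probs, labels, k=2))
```

If your exported model already ends in a softmax node (as the classification export typically does), skip the `softmax` call and rank the raw outputs directly.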
This is very helpful Morganh. Thanks!
@Morganh - a question about those links. It looks like both require a GPU backend to execute inference. Do you have a pointer to a CPU-only, single Python script (no Triton or DeepStream) using vanilla TF that can load the graph and run inference?
No, there is no such script for running inference on the CPU.
@Morganh - does that mean it is not possible? Or just that NVIDIA hasn't created that kind of script yet?
Yes, officially it will run inference on the GPU only.
There has been no update from you for a while, so we are assuming this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.