We have trained models using the TAO classification and detection pipelines. For our use case we want to deploy the models to Azure and AWS functions and run inference on the CPU.
Before working with TAO, we would simply have used Keras or TF for training, loaded the saved model from the Azure/AWS function runtime, and done inference via a lightweight Python script.
How can we accomplish the same thing (CPU inference, a minimal-dependency Python script that we can host in the AWS/Azure function runtime) when we start from a .tlt model?
First, export the .tlt model to an .etlt model.
Then run tao-converter to convert the .etlt model into a TensorRT engine (.engine or .trt file).
Finally, run inference with DeepStream or your own standalone script.
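As a rough sketch of the two conversion steps above (the model paths, the `$KEY` value, the input dimensions, and the output node name are all assumptions here; they must match your own training spec and export log, and the exact commands may differ between TAO versions):

```shell
# Step 1: export the trained .tlt model to .etlt (run inside the TAO container).
# $KEY is the encryption key used during training; paths are placeholders.
tao classification export \
    -m /workspace/output/weights/final_model.tlt \
    -k $KEY \
    -o /workspace/export/final_model.etlt

# Step 2: build a TensorRT engine from the .etlt on the deployment machine.
# -d gives the input dims (C,H,W) and -o the output node name; both values
# below are assumptions -- check your model's spec file and the export log.
tao-converter /workspace/export/final_model.etlt \
    -k $KEY \
    -d 3,224,224 \
    -o predictions/Softmax \
    -e /workspace/export/final_model.engine
```

Note that the TensorRT engine is built for the specific GPU and TensorRT version of the machine it is generated on, so step 2 should be run on the deployment target.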
For DeepStream, refer to the Image Classification — Transfer Learning Toolkit 3.0 documentation.
For other kinds of inference, you can also refer to tao-toolkit-triton-apps/tao_triton/python at main · NVIDIA-AI-IOT/tao-toolkit-triton-apps · GitHub and tao-toolkit-triton-apps/classification_postprocessor.py at main · NVIDIA-AI-IOT/tao-toolkit-triton-apps · GitHub.
To run inference against a classification TensorRT engine, you can also search this TAO forum for relevant scripts.
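Whichever runtime produces the raw output tensor, the classification postprocessing itself is framework-free and easy to write yourself. A minimal sketch in pure Python (the label list and the assumption that the engine emits one logit per class are hypothetical; adapt them to your model):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probs, labels, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Example with a hypothetical 4-class model output:
labels = ["cat", "dog", "bird", "fish"]
probs = softmax([2.0, 1.0, 0.1, -1.0])
print(top_k(probs, labels, k=2))
```

If your exported model already ends in a softmax node (as the classification export typically does), skip the `softmax` call and rank the raw outputs directly.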
This is very helpful Morganh. Thanks!
@Morganh - a question about those links. It looks like both require a GPU backend to execute inference. Do you have a pointer to a CPU-only, single Python script (no Triton or DeepStream) using vanilla TF that can load the graph and run inference?
No, there is no such script for running inference on the CPU.
@Morganh - does that mean it is not possible? Or just that NVIDIA hasn't created that kind of script yet?
Yes, officially it will run inference on the GPU only.
There has been no update from you for a while, so we are assuming this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.