What is the "export specification file" for TAO deploy?

Hello there,

Recently, we trained a custom YOLOv4 model with the NVIDIA TAO API and exported the trained model.onnx file. Now the goal is to use TAO deploy to convert it to a TensorRT engine on Jetson hardware.

My team has successfully set up tao-deploy on the Jetson by pulling the appropriate TensorRT container and installing tao-deploy with pip.

The instructions say "Same spec file can be used as the tao model yolo_v4 export command":

  • -e, --experiment_spec: The experiment spec file to set up the TensorRT engine generation. This should be the same as the export specification file.

Where do we obtain this "export specification file"? Is it produced when the export API action is run?

We used the API to download all of the file artifacts generated by the export action:

labels.txt   logs_from_toolkit.txt   model.onnx   nvinfer_config.txt  status.json

None of these seems to be the correct file. Any suggestions?

Furthermore, if you could point us to a sample experiment_spec file for gen_trt_engine, that would be very helpful.

To generate a TensorRT engine, you can use trtexec. Please see TRTEXEC with YOLO_v4 - NVIDIA Docs.
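For example, a typical trtexec invocation looks something like the sketch below. The input tensor name ("Input") and the 3x384x1248 shape are assumptions based on common TAO YOLOv4 exports; check your own model.onnx (for example with Netron, or the infer-dims entry in your nvinfer_config.txt) and adjust accordingly.

  # Sketch: build an FP16 engine from the exported ONNX model.
  # Input tensor name and shapes are placeholders -- verify against your model.
  trtexec --onnx=model.onnx \
          --saveEngine=model.plan \
          --fp16 \
          --minShapes=Input:1x3x384x1248 \
          --optShapes=Input:8x3x384x1248 \
          --maxShapes=Input:16x3x384x1248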

For the spec file, you can refer to the spec files under the notebooks folder: tao_tutorials/notebooks/tao_launcher_starter_kit/yolo_v4/specs/yolo_v4_train_resnet18_kitti.txt at main · NVIDIA/tao_tutorials · GitHub.
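With that spec, a tao-deploy invocation on the Jetson would look roughly like the sketch below. The paths, the key, and the precision/batch settings are placeholders, and the exact entry point can differ depending on how tao-deploy was installed, so check yolo_v4 gen_trt_engine --help for the options your version supports.

  # Sketch: generate a TensorRT engine with tao-deploy.
  # All paths and the $KEY value are placeholders for your setup.
  yolo_v4 gen_trt_engine \
      -m /workspace/model.onnx \
      -e /workspace/specs/yolo_v4_train_resnet18_kitti.txt \
      -k $KEY \
      --data_type fp16 \
      --batch_size 1 \
      --engine_file /workspace/model.plan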

Thank you. Is there a way to run model inference with Triton Server on the Jetson, then?

We have this link, tao-toolkit-triton-apps/docs/configuring_the_client.md at main · NVIDIA-AI-IOT/tao-toolkit-triton-apps · GitHub, which describes how to run inference on a TAO-trained YOLOv3 model by serving it via Triton. Can we do something similar for YOLOv4?

My plan is basically to use the TAO API to run export and gen_trt_engine actions.

However, after this, instead of the inference action in the TAO API, I guess we'd need to manually create a config.pbtxt and a .plan file for model serving in Triton on the Jetson AGX, which as far as I can tell is outside the scope of the TAO API; hence the TAO-Triton approach in the link.

Yes, it is possible. You can leverage the preprocessing and postprocessing code in tao-tf1 and tao-deploy.

Yes, a config.pbtxt needs to be generated, and the .plan file is actually the TensorRT engine itself. The tao-toolkit-triton-apps repo has some commands and examples showing how to generate it.
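For reference, a YOLOv4 config.pbtxt should be close to the YOLOv3 one in that repo. The sketch below assumes an engine exported with the BatchedNMS plugin and keep_top_k=200; the tensor names, dims, and max_batch_size must match your own engine, so treat these values as placeholders.

  # Sketch of a Triton config.pbtxt for a TAO YOLOv4 TensorRT engine.
  # Tensor names/dims assume a BatchedNMS export with keep_top_k=200.
  name: "yolov4_tao"
  platform: "tensorrt_plan"
  max_batch_size: 8
  input [
    {
      name: "Input"
      data_type: TYPE_FP32
      format: FORMAT_NCHW
      dims: [ 3, 384, 1248 ]
    }
  ]
  output [
    {
      name: "BatchedNMS"       # number of valid detections per image
      data_type: TYPE_INT32
      dims: [ 1 ]
    },
    {
      name: "BatchedNMS_1"     # detection boxes
      data_type: TYPE_FP32
      dims: [ 200, 4 ]
    },
    {
      name: "BatchedNMS_2"     # detection scores
      data_type: TYPE_FP32
      dims: [ 200 ]
    },
    {
      name: "BatchedNMS_3"     # detection class indices
      data_type: TYPE_FP32
      dims: [ 200 ]
    }
  ]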