Hello there,
We recently trained a custom YOLOv4 model with the NVIDIA TAO API and exported the trained model.onnx file. Our goal now is to use TAO Deploy to convert it to a TensorRT engine on Jetson hardware.
My team has successfully set up tao-deploy on the Jetson by pulling the appropriate TensorRT container and installing tao-deploy with pip.
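For reference, this is roughly how we expect to invoke the tao-deploy entry point on the Jetson once the spec question is sorted out. Only the -e flag comes from the documentation quoted below; the other flags and paths are our assumptions and still need to be checked against yolo_v4 gen_trt_engine --help:

    # Sketch only: flags other than -e are assumptions; verify against
    # `yolo_v4 gen_trt_engine --help` on your install.
    yolo_v4 gen_trt_engine \
        -m /workspace/model.onnx \
        -e /workspace/experiment_spec.txt \
        --data_type fp16 \
        --engine_file /workspace/model.engine \
        -r /workspace/results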
The instructions say “Same spec file can be used as the tao model yolo_v4 export command”:

    -e, --experiment_spec: The experiment spec file to set up the TensorRT engine generation. This should be the same as the export specification file.
Where do we obtain this “export specification file”? Is it produced when the export API action is run?
We used the API to download all the file artifacts generated by the export API action:

    labels.txt  logs_from_toolkit.txt  model.onnx  nvinfer_config.txt  status.json
None of these seems to be the correct file. Any suggestions?
Furthermore, if you could point us to a sample experiment_spec file for gen_trt_engine, that would be very helpful.
Thank you. As a follow-up, is there a way to run inference on this model with Triton server on the Jetson?
We have this link, tao-toolkit-triton-apps/docs/configuring_the_client.md in the NVIDIA-AI-IOT/tao-toolkit-triton-apps GitHub repository, which describes how to run inference on a TAO-trained YOLOv3 model by serving it via Triton. Can we do something similar for YOLOv4?
My plan is basically to use the TAO API to run the export and gen_trt_engine actions.
However, after that, instead of running the inference action in the TAO API, I guess we would need to manually create a config.pbtxt and a .plan file for model serving in Triton on the Jetson AGX. As far as I can tell, that is outside the scope of the TAO API, hence the TAO-Triton approach in the link.
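For concreteness, this is the kind of Triton model repository layout I assume we would have to assemble by hand (the model name and paths below are placeholders):

    model_repository/
        yolov4_tao/
            config.pbtxt
            1/
                model.plan    # the TensorRT engine produced by gen_trt_engine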
Yes, it is possible. You can leverage the preprocessing and postprocessing code in tao-tf1 and tao-deploy.
Yes, you need to generate a config.pbtxt, and the .plan file is actually the TensorRT engine. The tao-toolkit-triton-apps repository has some commands and examples showing how to generate it.
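For reference, a minimal config.pbtxt sketch for a TAO YOLOv4 TensorRT engine could look like the following. The tensor names and dimensions are assumptions modeled on the YOLOv3 example in tao-toolkit-triton-apps (Input plus the four BatchedNMS outputs of the NMS plugin); verify them against your own exported model, and adjust the input resolution and top-k count to match your spec:

    name: "yolov4_tao"            # placeholder model name
    platform: "tensorrt_plan"
    max_batch_size: 1
    input [
      {
        name: "Input"             # assumed input tensor name
        data_type: TYPE_FP32
        format: FORMAT_NCHW
        dims: [ 3, 384, 1248 ]    # channels, height, width from your training spec
      }
    ]
    output [
      {
        name: "BatchedNMS"        # number of detections
        data_type: TYPE_INT32
        dims: [ 1 ]
      },
      {
        name: "BatchedNMS_1"      # nmsed boxes
        data_type: TYPE_FP32
        dims: [ 200, 4 ]
      },
      {
        name: "BatchedNMS_2"      # nmsed scores
        data_type: TYPE_FP32
        dims: [ 200 ]
      },
      {
        name: "BatchedNMS_3"      # nmsed class indices
        data_type: TYPE_FP32
        dims: [ 200 ]
      }
    ]

The model.plan in the repository layout is simply the engine file written by gen_trt_engine, copied into the version directory; Triton's tensorrt_plan backend expects that default filename unless you override it in config.pbtxt.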