Installing Triton Server on Lenovo SE70 with Xavier NX

Hi,

Thanks a lot for your patience.

It turns out that the nvcr.io/nvidia/tritonserver container does work well on JetPack 5.
Please see the test steps below.

Server: tritonserver:24.02-py3-igpu

$ git clone -b r24.02 https://github.com/triton-inference-server/server.git
$ cd server/docs/examples/
$ ./fetch_models.sh 
$ sudo docker run -it --rm --runtime nvidia --network host -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:24.02-py3-igpu tritonserver --model-repository=/models

You should see backend and model status logs like the ones below:

...
I0327 04:32:46.516401 1 server.cc:634] 
+-------------+-----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+
| Backend     | Path                                                            | Config                                                                                                                          |
+-------------+-----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+
| tensorflow  | /opt/tritonserver/backends/tensorflow/libtriton_tensorflow.so   | {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"5.300000", |
|             |                                                                 | "default-max-batch-size":"4"}}                                                                                                  |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"5.300000", |
|             |                                                                 | "default-max-batch-size":"4"}}                                                                                                  |
+-------------+-----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+

I0327 04:32:46.516911 1 server.cc:677] 
+----------------------+---------+--------+
| Model                | Version | Status |
+----------------------+---------+--------+
| densenet_onnx        | 1       | READY  |
| inception_graphdef   | 1       | READY  |
| simple               | 1       | READY  |
| simple_dyna_sequence | 1       | READY  |
| simple_identity      | 1       | READY  |
| simple_int8          | 1       | READY  |
| simple_sequence      | 1       | READY  |
| simple_string        | 1       | READY  |
+----------------------+---------+--------+
...
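
Optionally, you can also confirm readiness over plain HTTP from the server host (8000 is Triton's default HTTP port). A 200 OK response means the server is accepting inference requests:

$ curl -v localhost:8000/v2/health/ready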

Client: tritonserver:24.02-py3-igpu-sdk

You should be able to see the classification output by sending a query like the one below.
We tested this from another Xavier NX, but it should also work when the client runs on the same device.

$ sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/tritonserver:24.02-py3-igpu-sdk
# /workspace/install/bin/image_client -u [IP]:8000 -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg
Request 0, batch size 1
Image '/workspace/images/mug.jpg':
    15.349564 (504) = COFFEE MUG
    13.227465 (968) = CUP
    10.424894 (505) = COFFEEPOT
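
If you want to script checks without the SDK image, the same HTTP API can be queried directly with curl (shown here against localhost; use [IP] when querying remotely). For example, the model metadata endpoint reports the input/output tensor names and shapes, and the repository index lists the loaded models:

$ curl localhost:8000/v2/models/densenet_onnx
$ curl -X POST localhost:8000/v2/repository/index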

Thanks.