Installing Triton Server on Lenovo SE70 with Xavier NX

I am trying to install Triton Server on the Lenovo SE70 with built-in Xavier NX, and I have been running into issues. The SE70 supports JP 4.5.1 on Ubuntu 18.04, JP 5.0.2 on Ubuntu 20.04, and JP 5.1.1 on Ubuntu 20.04.

Triton Server seems to be running fine on JP 4.5.1, but I want to use it on a later version of JP if possible. I’ve tried it on JP 5.0.2, but when starting the Triton Server Docker container I ran into errors saying that the GPU isn’t being detected.

I tried installing the release that supports JP 5.1.1 and unfortunately, the runtime dependency “libre2-9” is only available on Ubuntu 22.04, which I can’t install onto the SE70.

I am concerned that I may not be installing it correctly on JP 5.0.2, so could anyone provide some specific step-by-step instructions on how to install Triton Server on JP 5.0.2? Thank you.

Hi,

Which container did you try to use?
The official Triton container needs a JetPack 6 environment.

For Xavier NX, please follow the below topic to install the packages:

Thanks.

For JP 5.0.2 with Ubuntu 20.04, I tried container 22.10. For JP 5.1.1 with Ubuntu 20.04, I tried container 23.05.

I will try out those installation steps and post the result of trying them. Thanks in advance.

Here is the output that I received when running the command below:

root@test-desktop:/home/test# /opt/tritonserver/bin/tritonserver --model-repository=./model_repository --backend-directory=/opt/tritonserver/backends --backend-config=tensorflow,version=2
I0220 16:33:06.853942 3321010 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x202f0a000' with size 268435456
I0220 16:33:06.854632 3321010 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0220 16:33:06.892931 3321010 model_lifecycle.cc:459] loading: simple:1
I0220 16:33:07.633477 3321010 tensorflow.cc:2577] TRITONBACKEND_Initialize: tensorflow
I0220 16:33:07.633592 3321010 tensorflow.cc:2587] Triton TRITONBACKEND API version: 1.12
I0220 16:33:07.633645 3321010 tensorflow.cc:2593] 'tensorflow' TRITONBACKEND API version: 1.12
I0220 16:33:07.633684 3321010 tensorflow.cc:2617] backend configuration:
{"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"5.300000","version":"2","default-max-batch-size":"4"}}
I0220 16:33:07.634156 3321010 tensorflow.cc:2683] TRITONBACKEND_ModelInitialize: simple (version 1)
I0220 16:33:07.636693 3321010 tensorflow.cc:2732] TRITONBACKEND_ModelInstanceInitialize: simple (GPU device 0)
2024-02-20 11:33:09.827418: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:999] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-20 11:33:10.097072: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:999] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-20 11:33:10.097503: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:999] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-20 11:33:10.098559: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:999] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-20 11:33:10.099073: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:999] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-20 11:33:10.099300: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1725] Could not identify NUMA node of platform GPU id 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-02-20 11:33:10.099547: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:999] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-20 11:33:10.099806: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1638] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 863 MB memory: -> device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2
2024-02-20 11:33:10.209036: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:353] MLIR V1 optimization pass is not enabled
I0220 16:33:10.211611 3321010 model_lifecycle.cc:694] successfully loaded 'simple' version 1
I0220 16:33:10.211947 3321010 server.cc:583]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0220 16:33:10.212373 3321010 server.cc:610]
+------------+---------------------------------------------------------------+---------------------------------------------------------------+
| Backend    | Path                                                          | Config                                                        |
+------------+---------------------------------------------------------------+---------------------------------------------------------------+
| tensorflow | /opt/tritonserver/backends/tensorflow/libtriton_tensorflow.so | {"cmdline":{"auto-complete-config":"true","backend-directory" |
|            |                                                               | :"/opt/tritonserver/backends","min-compute-capability":"5.300 |
|            |                                                               | 000","version":"2","default-max-batch-size":"4"}}             |
+------------+---------------------------------------------------------------+---------------------------------------------------------------+

I0220 16:33:10.213246 3321010 server.cc:653]
+--------+---------+--------+
| Model  | Version | Status |
+--------+---------+--------+
| simple | 1       | READY  |
+--------+---------+--------+

W0220 16:33:10.214026 3321010 metrics.cc:352] No polling metrics (CPU, GPU) are enabled. Will not poll for them.
I0220 16:33:10.214766 3321010 tritonserver.cc:2387]
+----------------------------------+----------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                    |
+----------------------------------+----------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                   |
| server_version                   | 2.33.0                                                                                                   |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_confi |
|                                  | guration system_shared_memory cuda_shared_memory binary_tensor_data parameters statistics trace logging  |
| model_repository_path[0]         | ./model_repository                                                                                       |
| model_control_mode               | MODE_NONE                                                                                                |
| strict_model_config              | 0                                                                                                        |
| rate_limit                       | OFF                                                                                                      |
| pinned_memory_pool_byte_size     | 268435456                                                                                                |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                 |
| min_supported_compute_capability | 5.3                                                                                                      |
| strict_readiness                 | 1                                                                                                        |
| exit_timeout                     | 30                                                                                                       |
| cache_enabled                    | 0                                                                                                        |
+----------------------------------+----------------------------------------------------------------------------------------------------------+

I0220 16:33:10.231070 3321010 grpc_server.cc:2450] Started GRPCInferenceService at 0.0.0.0:8001
I0220 16:33:10.232026 3321010 http_server.cc:3555] Started HTTPService at 0.0.0.0:8000
I0220 16:33:10.278758 3321010 http_server.cc:185] Started Metrics Service at 0.0.0.0:8002
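For reference, the two pool sizes in the log are reported in bytes; converting them shows the server defaults:

```shell
# Pool sizes from the log above, converted from bytes to MiB:
# 256 MiB pinned memory and 64 MiB CUDA memory on device 0.
echo "$(( 268435456 / 1024 / 1024 )) MiB pinned, $(( 67108864 / 1024 / 1024 )) MiB CUDA"
# → 256 MiB pinned, 64 MiB CUDA
```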

In a separate terminal, I ran the command below and got this output:

root@test-desktop:/home/test# /opt/tritonserver/clients/bin/perf_analyzer -m simple
*** Measurement Settings ***
  Batch size: 1
  Service Kind: Triton
  Using "time_windows" mode for stabilization
  Measurement window: 5000 msec
  Using synchronous calls for inference
  Stabilizing using average latency

Request concurrency: 1
  Client:
    Request count: 14448
    Throughput: 802.223 infer/sec
    Avg latency: 1244 usec (standard deviation 434 usec)
    p50 latency: 1150 usec
    p90 latency: 1478 usec
    p95 latency: 1699 usec
    p99 latency: 3031 usec
    Avg HTTP time: 1231 usec (send/recv 190 usec + response wait 1041 usec)
  Server:
    Inference count: 14449
    Execution count: 14449
    Successful request count: 14449
    Avg request latency: 553 usec (overhead 65 usec + queue 67 usec + compute input 34 usec + compute infer 366 usec + compute output 19 usec)

Inferences/Second vs. Client Average Batch Latency
Concurrency: 1, throughput: 802.223 infer/sec, latency 1244 usec
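As a quick sanity check on these numbers: with synchronous requests at concurrency 1, throughput should be roughly the inverse of the average latency, which matches the report here:

```shell
# With one synchronous in-flight request, throughput (infer/sec) is
# approximately 1,000,000 / avg latency in usec. Using the values above:
awk 'BEGIN { printf "%.1f infer/sec\n", 1000000 / 1244 }'
# → 803.9 infer/sec, close to the reported 802.223
```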

Does this indicate that Triton Server is running?

Hi,

Yes, it looks like the server is running.
You can further verify it with the below example:

https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/getting_started/quickstart.html

Thanks.

Thank you for the help!

Is my assumption correct that the container from the installation steps in the other post you referred me to supports the use of Docker?

Hi,

Yes, but since the Triton Server container requires a JetPack 6 environment,
you will need to build a custom one on top of a container that supports JetPack 5.

Maybe you can try nvcr.io/nvidia/l4t-ml:r35.2.1-py3.
It has all the common ML frameworks pre-installed that can be used as a Triton backend.
However, the container size is relatively large due to these libraries.
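Building on top of that base could look something like the sketch below. This is a hypothetical starting point, not a tested recipe: the COPY line assumes you already have a JetPack-compatible Triton build (for example, an unpacked Jetson release tarball) in your build context.

```dockerfile
# Hypothetical sketch: custom image on top of the suggested L4T ML base.
FROM nvcr.io/nvidia/l4t-ml:r35.2.1-py3

# Place a locally obtained, JetPack-compatible Triton build into the image
# (path and contents are assumptions; adjust to your actual build).
COPY tritonserver/ /opt/tritonserver/

ENV PATH=/opt/tritonserver/bin:${PATH}
```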

Thanks.

Once I have built the container that supports JetPack 5, how do I verify that it is working in coordination with Triton Server?

Hi,

You can test it with the sample shared in the below link:

https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/getting_started/quickstart.html

Thanks.

Why do I receive a warning saying that the NVIDIA Driver was not detected? I thought the NVIDIA driver was included with the image.

test@test-desktop:~/Downloads$ sudo docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:23.04-py3-sdk
[sudo] password for test:

=================================
== Triton Inference Server SDK ==
=================================

NVIDIA Release 23.04 (build 58408270)

Copyright (c) 2018-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:

WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.
Use the NVIDIA Container Toolkit to start this container with GPU support; see
NVIDIA Cloud Native Technologies - NVIDIA Docs.
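(Note: this warning typically appears when Docker is started without the NVIDIA container runtime, so the container cannot see the GPU even though the driver is on the host. Assuming nvidia-container-toolkit is installed and registered with Docker, as on a standard JetPack setup, the same image can be started with the runtime enabled:)

```shell
# Start the same SDK container with the NVIDIA runtime so the GPU is visible
# inside it. --runtime nvidia assumes nvidia-container-toolkit is configured.
sudo docker run -it --rm --net=host --runtime nvidia nvcr.io/nvidia/tritonserver:23.04-py3-sdk
```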

Hi,

Do you need the client app to run on Jetson?
As previously mentioned, the tritonserver container cannot work on JetPack 5.

Thanks.

Yes, I need the client app to run on Jetson. I installed the L4T ML framework you mentioned to build a custom container supporting JetPack 5 on top of the Triton Server container, and I was trying to send a sample inference request using the link you sent.

Sorry, as I am fairly new to all of this.

Hi,

Would you mind sharing the Dockerfile with us for checking?

The Tritonserver container doesn’t support JetPack 5.
Does the container work well with the test shared in the below topic:
/opt/tritonserver/clients/bin/perf_analyzer -m simple

Thanks.

Here are the Dockerfile contents:

# Copyright 2019-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#  * Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
#  * Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
#  * Neither the name of NVIDIA CORPORATION nor the names of its
#    contributors may be used to endorse or promote products derived
#    from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

# Multistage build.

# Base image on the minimum Triton container
ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver:24.01-py3-min

ARG TRITON_CLIENT_REPO_SUBDIR=clientrepo
ARG TRITON_COMMON_REPO_TAG=main
ARG TRITON_CORE_REPO_TAG=main
ARG TRITON_THIRD_PARTY_REPO_TAG=main
ARG TRITON_MODEL_ANALYZER_REPO_TAG=main
ARG TRITON_ENABLE_GPU=ON
ARG JAVA_BINDINGS_MAVEN_VERSION=3.8.4
ARG JAVA_BINDINGS_JAVACPP_PRESETS_TAG=1.5.8

# DCGM version to install for Model Analyzer
ARG DCGM_VERSION=3.2.6

ARG NVIDIA_TRITON_SERVER_SDK_VERSION=unknown
ARG NVIDIA_BUILD_ID=unknown

############################################################################
# Build image
############################################################################
FROM ${BASE_IMAGE} AS sdk_build

# Ensure apt-get won't prompt for selecting options
ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        ca-certificates \
        software-properties-common \
        autoconf \
        automake \
        build-essential \
        curl \
        git \
        gperf \
        libb64-dev \
        libgoogle-perftools-dev \
        libopencv-dev \
        libopencv-core-dev \
        libssl-dev \
        libtool \
        pkg-config \
        python3 \
        python3-pip \
        python3-dev \
        rapidjson-dev \
        vim \
        wget \
        python3-pdfkit \
        openjdk-11-jdk \
        maven && \
    pip3 install --upgrade wheel setuptools && \
    pip3 install --upgrade grpcio-tools && \
    pip3 install --upgrade pip

# Client build requires recent version of CMake (FetchContent required)
# Using CMAKE installation instruction from: https://apt.kitware.com/
RUN apt update -q=2 \
    && apt install -y gpg wget \
    && wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null \
    && . /etc/os-release \
    && echo "deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ $UBUNTU_CODENAME main" | tee /etc/apt/sources.list.d/kitware.list >/dev/null \
    && apt-get update -q=2 \
    && apt-get install -y --no-install-recommends cmake=3.27.7* cmake-data=3.27.7* \
    && cmake --version

# Build expects "python" executable (not python3).
RUN rm -f /usr/bin/python && \
    ln -s /usr/bin/python3 /usr/bin/python

# Build the client library and examples
ARG TRITON_CLIENT_REPO_SUBDIR
ARG TRITON_COMMON_REPO_TAG
ARG TRITON_CORE_REPO_TAG
ARG TRITON_THIRD_PARTY_REPO_TAG
ARG TRITON_ENABLE_GPU
ARG JAVA_BINDINGS_MAVEN_VERSION
ARG JAVA_BINDINGS_JAVACPP_PRESETS_TAG
ARG TARGETPLATFORM

WORKDIR /workspace
COPY TRITON_VERSION .
COPY ${TRITON_CLIENT_REPO_SUBDIR} client

WORKDIR /workspace/build
RUN cmake -DCMAKE_INSTALL_PREFIX=/workspace/install \
      -DTRITON_VERSION=`cat /workspace/TRITON_VERSION` \
      -DTRITON_COMMON_REPO_TAG=${TRITON_COMMON_REPO_TAG} \
      -DTRITON_CORE_REPO_TAG=${TRITON_CORE_REPO_TAG} \
      -DTRITON_THIRD_PARTY_REPO_TAG=${TRITON_THIRD_PARTY_REPO_TAG} \
      -DTRITON_ENABLE_CC_HTTP=ON -DTRITON_ENABLE_CC_GRPC=ON \
      -DTRITON_ENABLE_PYTHON_HTTP=ON -DTRITON_ENABLE_PYTHON_GRPC=ON \
      -DTRITON_ENABLE_JAVA_HTTP=ON \
      -DTRITON_ENABLE_PERF_ANALYZER=ON \
      -DTRITON_ENABLE_PERF_ANALYZER_C_API=ON \
      -DTRITON_ENABLE_PERF_ANALYZER_TFS=ON \
      -DTRITON_ENABLE_PERF_ANALYZER_TS=ON \
      -DTRITON_ENABLE_EXAMPLES=ON -DTRITON_ENABLE_TESTS=ON \
      -DTRITON_ENABLE_GPU=${TRITON_ENABLE_GPU} /workspace/client
RUN make -j16 cc-clients python-clients java-clients && \
    rm -fr ~/.m2

# Install Java API Bindings
RUN if [ "$TARGETPLATFORM" = "linux/amd64" ]; then \
      source /workspace/client/src/java-api-bindings/scripts/install_dependencies_and_build.sh \
        --maven-version ${JAVA_BINDINGS_MAVEN_VERSION} \
        --core-tag ${TRITON_CORE_REPO_TAG} \
        --javacpp-tag ${JAVA_BINDINGS_JAVACPP_PRESETS_TAG} \
        --jar-install-path /workspace/install/java-api-bindings; \
    fi

############################################################################
# Create sdk container
############################################################################
FROM ${BASE_IMAGE}

# Ensure apt-get won't prompt for selecting options
ENV DEBIAN_FRONTEND=noninteractive

ARG DCGM_VERSION
ARG TRITON_CORE_REPO_TAG
ARG TARGETPLATFORM
ARG TRITON_ENABLE_GPU

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        software-properties-common \
        curl \
        git \
        gperf \
        libb64-dev \
        libgoogle-perftools-dev \
        libopencv-dev \
        libopencv-core-dev \
        libssl-dev \
        libtool \
        python3 \
        python3-pip \
        python3-dev \
        vim \
        wget \
        python3-pdfkit \
        maven \
        default-jdk && \
    pip3 install --upgrade wheel setuptools && \
    pip3 install --upgrade grpcio-tools && \
    pip3 install --upgrade pip

WORKDIR /workspace
COPY TRITON_VERSION .
COPY NVIDIA_Deep_Learning_Container_License.pdf .
COPY --from=sdk_build /workspace/client/ client/
COPY --from=sdk_build /workspace/install/ install/
RUN cd install && \
    export VERSION=`cat /workspace/TRITON_VERSION` && \
    tar zcf /workspace/v$VERSION.clients.tar.gz *

# For CI testing need to copy over L0_sdk test and L0_client_build_variants test.
RUN mkdir qa
COPY qa/L0_sdk qa/L0_sdk
COPY qa/L0_client_build_variants qa/L0_client_build_variants

# Create a directory for all the python client tests to enable unit testing
RUN mkdir -p qa/python_client_unit_tests/
COPY --from=sdk_build /workspace/client/src/python/library/tests/* qa/python_client_unit_tests/

# Install an image needed by the quickstart and other documentation.
COPY qa/images/mug.jpg images/mug.jpg

# Install the dependencies needed to run the client examples. These
# are not needed for building but including them allows this image to
# be used to run the client examples.
RUN pip3 install --upgrade numpy pillow attrdict && \
    find install/python/ -maxdepth 1 -type f -name \
    "tritonclient-*linux*.whl" | xargs printf -- '%s[all]' | \
    xargs pip3 install --upgrade

# Install DCGM
RUN if [ "$TRITON_ENABLE_GPU" = "ON" ]; then \

Here is the output I get when running "/opt/tritonserver/clients/bin/perf_analyzer -m simple" after running "sudo /opt/tritonserver/bin/tritonserver --model-repository=./model_repository --backend-directory=/opt/tritonserver/backends --backend-config=tensorflow,version=2":

test@test-desktop:~/Downloads$ /opt/tritonserver/clients/bin/perf_analyzer -m simple
*** Measurement Settings ***
Batch size: 1
Service Kind: Triton
Using "time_windows" mode for stabilization
Measurement window: 5000 msec
Using synchronous calls for inference
Stabilizing using average latency

Request concurrency: 1
Client:
Request count: 14871
Throughput: 825.774 infer/sec
Avg latency: 1208 usec (standard deviation 469 usec)
p50 latency: 1106 usec
p90 latency: 1439 usec
p95 latency: 1677 usec
p99 latency: 3128 usec
Avg HTTP time: 1194 usec (send/recv 181 usec + response wait 1013 usec)
Server:
Inference count: 14871
Execution count: 14871
Successful request count: 14871
Avg request latency: 542 usec (overhead 63 usec + queue 65 usec + compute input 32 usec + compute infer 362 usec + compute output 19 usec)

Inferences/Second vs. Client Average Batch Latency
Concurrency: 1, throughput: 825.774 infer/sec, latency 1208 usec

Hi,

Thanks for sharing.

We are going to share an example for testing Tritonserver on Jetson with server and client app.
Hope this can help you and others to install and test Tritonserver easily.

Since tritonserver:24.01-py3-min doesn’t support JetPack 5, we may try this with an L4T base image. Does this work for you?

The reason is that Jetson has some specific hardware, so the L4T (Linux4Tegra) OS is required.
24.01 does support the Jetson environment, but it uses JetPack 6, which is Ubuntu 22.04, while JetPack 5 is still Ubuntu 20.04.
Usually, a container used across branches will have issues when using CUDA or other onboard hardware.
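To check which L4T release (and hence which JetPack line) a device or base image is on, you can inspect /etc/nv_tegra_release. The sketch below parses a hypothetical sample of that file's first line; on a real Jetson you would read the file itself:

```shell
# Hypothetical first line of /etc/nv_tegra_release on a JetPack 5.x device;
# on a real Jetson use: head -n 1 /etc/nv_tegra_release
sample='# R35 (release), REVISION: 2.1, GCID: 12345, BOARD: t186ref'

# Extract the L4T version (r35.x corresponds to JetPack 5.x).
echo "$sample" | sed -n 's/^# R\([0-9]*\) (release), REVISION: \([0-9.]*\).*/L4T r\1.\2/p'
# → L4T r35.2.1
```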

Thanks.

Sure, that works for me.

Thanks.

Is there an outlook on when that example you’re sharing will be available?

Thank you

Hi,

We did test some available containers (e.g. dustynv/tritonserver:r35.4.1), but they failed due to missing libboost_filesystem library support.
(related topic: libboost_filesystem.so.1.80.0 on jetpack 5.1.2 · Issue #6844 · triton-inference-server/server · GitHub)

So we are now finding an alternative and checking internally.
Sorry to keep you waiting. Hope to share info with you soon.

Thanks

Hi,

Thanks a lot for your patience.

It turns out that the nvcr.io/nvidia/tritonserver container does work well on JetPack 5.
Please see below for the testing.

Server: tritonserver:24.02-py3-igpu

$ git clone -b r24.02 https://github.com/triton-inference-server/server.git
$ cd server/docs/examples/
$ ./fetch_models.sh 
$ sudo docker run -it --rm --runtime nvidia --network host -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:24.02-py3-igpu tritonserver --model-repository=/models

You should see the backend and model logs like below:

...
I0327 04:32:46.516401 1 server.cc:634] 
+-------------+-----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+
| Backend     | Path                                                            | Config                                                                                                                          |
+-------------+-----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+
| tensorflow  | /opt/tritonserver/backends/tensorflow/libtriton_tensorflow.so   | {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"5.300000", |
|             |                                                                 | "default-max-batch-size":"4"}}                                                                                                  |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"5.300000", |
|             |                                                                 | "default-max-batch-size":"4"}}                                                                                                  |
+-------------+-----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+

I0327 04:32:46.516911 1 server.cc:677] 
+----------------------+---------+--------+
| Model                | Version | Status |
+----------------------+---------+--------+
| densenet_onnx        | 1       | READY  |
| inception_graphdef   | 1       | READY  |
| simple               | 1       | READY  |
| simple_dyna_sequence | 1       | READY  |
| simple_identity      | 1       | READY  |
| simple_int8          | 1       | READY  |
| simple_sequence      | 1       | READY  |
| simple_string        | 1       | READY  |
+----------------------+---------+--------+
...

Client: tritonserver:24.02-py3-igpu-sdk

You should be able to see the detection output by sending a query like the one below.
We tested this on another Xavier NX, but it should be okay to run on the same device.

$ sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/tritonserver:24.02-py3-igpu-sdk
# /workspace/install/bin/image_client -u [IP]:8000 -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg
Request 0, batch size 1
Image '/workspace/images/mug.jpg':
    15.349564 (504) = COFFEE MUG
    13.227465 (968) = CUP
    10.424894 (505) = COFFEEPOT
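The three numbers in that output are raw class scores, not probabilities. As a side note (this computation is not part of image_client itself), applying a softmax shows how dominant the top class is:

```shell
# Softmax over the three raw scores from the image_client output above,
# computed with awk: exp(score - max) normalized by the sum.
awk 'BEGIN {
  n = split("15.349564 13.227465 10.424894", s, " ")
  for (i = 1; i <= n; i++) { e[i] = exp(s[i] - s[1]); total += e[i] }
  for (i = 1; i <= n; i++) printf "%s%.3f", (i > 1 ? " " : ""), e[i] / total
  print ""
}'
# → 0.887 0.106 0.006  (COFFEE MUG dominates)
```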

Thanks.

Thank you for the response! I gave this a try, and it worked on the SE70 with the same output that you shared. I will try a few more things, and I will let you know if I need any more assistance; otherwise you can go ahead and close the thread.

Thanks again!