Installing Triton Server on Lenovo SE70 with Xavier NX

I am trying to install Triton Server on the Lenovo SE70 with built-in Xavier NX, and I have been running into issues. The SE70 supports JP 4.5.1 on Ubuntu 18.04, JP 5.0.2 on Ubuntu 20.04, and JP 5.1.1 on Ubuntu 20.04.

Triton Server seems to be running fine on JP 4.5.1, but I want to use it on a later version of JP if possible. I’ve tried it on JP 5.0.2, but when starting the Triton Server Docker container I ran into errors saying that the GPU isn’t being detected.

I tried installing the release that supports JP 5.1.1 and unfortunately, the runtime dependency “libre2-9” is only available on Ubuntu 22.04, which I can’t install onto the SE70.

I am concerned that I may not be installing it correctly on JP 5.0.2, so could anyone provide some specific step-by-step instructions on how to install Triton Server on JP 5.0.2? Thank you.

Hi,

Which container did you try to use?
The official Triton container needs a JetPack 6 environment.

For Xavier NX, please follow the below topic to install the packages:

Thanks.

For JP 5.0.2 with Ubuntu 20.04, I tried container 22.10. For JP 5.1.1 with Ubuntu 20.04, I tried container 23.05.

I will try out those installation steps and post the result of trying them. Thanks in advance.

Here is the output that I received when running the command below:

root@test-desktop:/home/test# /opt/tritonserver/bin/tritonserver --model-repository=./model_repository --backend-directory=/opt/tritonserver/backends --backend-config=tensorflow,version=2
I0220 16:33:06.853942 3321010 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x202f0a000' with size 268435456
I0220 16:33:06.854632 3321010 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0220 16:33:06.892931 3321010 model_lifecycle.cc:459] loading: simple:1
I0220 16:33:07.633477 3321010 tensorflow.cc:2577] TRITONBACKEND_Initialize: tensorflow
I0220 16:33:07.633592 3321010 tensorflow.cc:2587] Triton TRITONBACKEND API version: 1.12
I0220 16:33:07.633645 3321010 tensorflow.cc:2593] 'tensorflow' TRITONBACKEND API version: 1.12
I0220 16:33:07.633684 3321010 tensorflow.cc:2617] backend configuration:
{"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"5.300000","version":"2","default-max-batch-size":"4"}}
I0220 16:33:07.634156 3321010 tensorflow.cc:2683] TRITONBACKEND_ModelInitialize: simple (version 1)
I0220 16:33:07.636693 3321010 tensorflow.cc:2732] TRITONBACKEND_ModelInstanceInitialize: simple (GPU device 0)
2024-02-20 11:33:09.827418: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:999] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-20 11:33:10.097072: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:999] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-20 11:33:10.097503: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:999] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-20 11:33:10.098559: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:999] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-20 11:33:10.099073: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:999] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-20 11:33:10.099300: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1725] Could not identify NUMA node of platform GPU id 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-02-20 11:33:10.099547: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:999] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-20 11:33:10.099806: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1638] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 863 MB memory: -> device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2
2024-02-20 11:33:10.209036: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:353] MLIR V1 optimization pass is not enabled
I0220 16:33:10.211611 3321010 model_lifecycle.cc:694] successfully loaded 'simple' version 1
I0220 16:33:10.211947 3321010 server.cc:583]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0220 16:33:10.212373 3321010 server.cc:610]
+------------+---------------------------------------------------------------+---------------------------------------------------------------+
| Backend    | Path                                                          | Config                                                        |
+------------+---------------------------------------------------------------+---------------------------------------------------------------+
| tensorflow | /opt/tritonserver/backends/tensorflow/libtriton_tensorflow.so | {"cmdline":{"auto-complete-config":"true","backend-directory" |
|            |                                                               | :"/opt/tritonserver/backends","min-compute-capability":"5.300 |
|            |                                                               | 000","version":"2","default-max-batch-size":"4"}}             |
+------------+---------------------------------------------------------------+---------------------------------------------------------------+

I0220 16:33:10.213246 3321010 server.cc:653]
+--------+---------+--------+
| Model  | Version | Status |
+--------+---------+--------+
| simple | 1       | READY  |
+--------+---------+--------+

W0220 16:33:10.214026 3321010 metrics.cc:352] No polling metrics (CPU, GPU) are enabled. Will not poll for them.
I0220 16:33:10.214766 3321010 tritonserver.cc:2387]
+----------------------------------+----------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                    |
+----------------------------------+----------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                   |
| server_version                   | 2.33.0                                                                                                   |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_confi |
|                                  | guration system_shared_memory cuda_shared_memory binary_tensor_data parameters statistics trace logging  |
| model_repository_path[0]         | ./model_repository                                                                                       |
| model_control_mode               | MODE_NONE                                                                                                |
| strict_model_config              | 0                                                                                                        |
| rate_limit                       | OFF                                                                                                      |
| pinned_memory_pool_byte_size     | 268435456                                                                                                |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                 |
| min_supported_compute_capability | 5.3                                                                                                      |
| strict_readiness                 | 1                                                                                                        |
| exit_timeout                     | 30                                                                                                       |
| cache_enabled                    | 0                                                                                                        |
+----------------------------------+----------------------------------------------------------------------------------------------------------+

I0220 16:33:10.231070 3321010 grpc_server.cc:2450] Started GRPCInferenceService at 0.0.0.0:8001
I0220 16:33:10.232026 3321010 http_server.cc:3555] Started HTTPService at 0.0.0.0:8000
I0220 16:33:10.278758 3321010 http_server.cc:185] Started Metrics Service at 0.0.0.0:8002
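For reference, the two pool sizes in the log are reported in bytes; converting them shows the server defaults:

```shell
# Pool sizes from the log above, converted from bytes to MiB:
# 256 MiB pinned memory and 64 MiB CUDA memory on device 0.
echo "$(( 268435456 / 1024 / 1024 )) MiB pinned, $(( 67108864 / 1024 / 1024 )) MiB CUDA"
# → 256 MiB pinned, 64 MiB CUDA
```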

In a separate terminal, I ran the command below and got this output:

root@test-desktop:/home/test# /opt/tritonserver/clients/bin/perf_analyzer -m simple
*** Measurement Settings ***
  Batch size: 1
  Service Kind: Triton
  Using "time_windows" mode for stabilization
  Measurement window: 5000 msec
  Using synchronous calls for inference
  Stabilizing using average latency

Request concurrency: 1
  Client:
    Request count: 14448
    Throughput: 802.223 infer/sec
    Avg latency: 1244 usec (standard deviation 434 usec)
    p50 latency: 1150 usec
    p90 latency: 1478 usec
    p95 latency: 1699 usec
    p99 latency: 3031 usec
    Avg HTTP time: 1231 usec (send/recv 190 usec + response wait 1041 usec)
  Server:
    Inference count: 14449
    Execution count: 14449
    Successful request count: 14449
    Avg request latency: 553 usec (overhead 65 usec + queue 67 usec + compute input 34 usec + compute infer 366 usec + compute output 19 usec)

Inferences/Second vs. Client Average Batch Latency
Concurrency: 1, throughput: 802.223 infer/sec, latency 1244 usec
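As a quick sanity check on these numbers: with synchronous requests at concurrency 1, throughput should be roughly the inverse of the average latency, which matches the report here:

```shell
# With one synchronous in-flight request, throughput (infer/sec) is
# approximately 1,000,000 / avg latency in usec. Using the values above:
awk 'BEGIN { printf "%.1f infer/sec\n", 1000000 / 1244 }'
# → 803.9 infer/sec, close to the reported 802.223
```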

Does this indicate that Triton Server is running?

Hi,

Yes, it looks like the server is running.
You can further verify it with the below example:

https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/getting_started/quickstart.html

Thanks.

Thank you for the help!

Is my assumption correct that the container from the installation steps in the other post you referred me to supports the use of Docker?

Hi,

Yes, but since the Triton Server container requires a JetPack 6 environment,
you will need to build a custom one on top of a container that supports JetPack 5.

Maybe you can try nvcr.io/nvidia/l4t-ml:r35.2.1-py3.
It has all the common ML frameworks pre-installed that can be used as a Triton backend.
However, the container size is relatively large due to these libraries.
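Building on top of that base could look something like the sketch below. This is a hypothetical starting point, not a tested recipe: the COPY line assumes you already have a JetPack-compatible Triton build (for example, an unpacked Jetson release tarball) in your build context.

```dockerfile
# Hypothetical sketch: custom image on top of the suggested L4T ML base.
FROM nvcr.io/nvidia/l4t-ml:r35.2.1-py3

# Place a locally obtained, JetPack-compatible Triton build into the image
# (path and contents are assumptions; adjust to your actual build).
COPY tritonserver/ /opt/tritonserver/

ENV PATH=/opt/tritonserver/bin:${PATH}
```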

Thanks.

Once I have built the container that supports JetPack 5, how do I verify that it is working in coordination with Triton Server?

Hi,

You can test it with the sample shared in the below link:

https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/getting_started/quickstart.html

Thanks.

Why do I receive a warning saying that the NVIDIA Driver was not detected? I thought the NVIDIA driver was included with the image.

test@test-desktop:~/Downloads$ sudo docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:23.04-py3-sdk
[sudo] password for test:

=================================
== Triton Inference Server SDK ==
=================================

NVIDIA Release 23.04 (build 58408270)

Copyright (c) 2018-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:

WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.
Use the NVIDIA Container Toolkit to start this container with GPU support; see
NVIDIA Cloud Native Technologies - NVIDIA Docs.
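(Note: this warning typically appears when Docker is started without the NVIDIA container runtime, so the container cannot see the GPU even though the driver is on the host. Assuming nvidia-container-toolkit is installed and registered with Docker, as on a standard JetPack setup, the same image can be started with the runtime enabled:)

```shell
# Start the same SDK container with the NVIDIA runtime so the GPU is visible
# inside it. --runtime nvidia assumes nvidia-container-toolkit is configured.
sudo docker run -it --rm --net=host --runtime nvidia nvcr.io/nvidia/tritonserver:23.04-py3-sdk
```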

Hi,

Do you need the client app to run on Jetson?
As previously mentioned, the tritonserver container cannot work on JetPack 5.

Thanks.

Yes, I need the client app to run on Jetson. I installed the L4T ML framework you mentioned to build a custom container supporting JetPack 5 on top of the Triton Server container, and I was trying to send a sample inference request using the link you sent.

Sorry, as I am fairly new to all of this.

Hi,

Would you mind sharing the Dockerfile with us for checking?

The Tritonserver container doesn’t support JetPack 5.
Does the container work well with the test shared in the below topic:
/opt/tritonserver/clients/bin/perf_analyzer -m simple

Thanks.

Here are the Dockerfile contents:

# Copyright 2019-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#  * Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
#  * Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
#  * Neither the name of NVIDIA CORPORATION nor the names of its
#    contributors may be used to endorse or promote products derived
#    from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

# Multistage build.

# Base image on the minimum Triton container
ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver:24.01-py3-min

ARG TRITON_CLIENT_REPO_SUBDIR=clientrepo
ARG TRITON_COMMON_REPO_TAG=main
ARG TRITON_CORE_REPO_TAG=main
ARG TRITON_THIRD_PARTY_REPO_TAG=main
ARG TRITON_MODEL_ANALYZER_REPO_TAG=main
ARG TRITON_ENABLE_GPU=ON
ARG JAVA_BINDINGS_MAVEN_VERSION=3.8.4
ARG JAVA_BINDINGS_JAVACPP_PRESETS_TAG=1.5.8

# DCGM version to install for Model Analyzer
ARG DCGM_VERSION=3.2.6

ARG NVIDIA_TRITON_SERVER_SDK_VERSION=unknown
ARG NVIDIA_BUILD_ID=unknown

############################################################################
# Build image
############################################################################
FROM ${BASE_IMAGE} AS sdk_build

# Ensure apt-get won't prompt for selecting options
ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        ca-certificates \
        software-properties-common \
        autoconf \
        automake \
        build-essential \
        curl \
        git \
        gperf \
        libb64-dev \
        libgoogle-perftools-dev \
        libopencv-dev \
        libopencv-core-dev \
        libssl-dev \
        libtool \
        pkg-config \
        python3 \
        python3-pip \
        python3-dev \
        rapidjson-dev \
        vim \
        wget \
        python3-pdfkit \
        openjdk-11-jdk \
        maven && \
    pip3 install --upgrade wheel setuptools && \
    pip3 install --upgrade grpcio-tools && \
    pip3 install --upgrade pip

# Client build requires recent version of CMake (FetchContent required)
# Using CMAKE installation instruction from: https://apt.kitware.com/
RUN apt update -q=2 \
    && apt install -y gpg wget \
    && wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null \
    && . /etc/os-release \
    && echo "deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ $UBUNTU_CODENAME main" | tee /etc/apt/sources.list.d/kitware.list >/dev/null \
    && apt-get update -q=2 \
    && apt-get install -y --no-install-recommends cmake=3.27.7* cmake-data=3.27.7* \
    && cmake --version

# Build expects "python" executable (not python3).
RUN rm -f /usr/bin/python && \
    ln -s /usr/bin/python3 /usr/bin/python

# Build the client library and examples
ARG TRITON_CLIENT_REPO_SUBDIR
ARG TRITON_COMMON_REPO_TAG
ARG TRITON_CORE_REPO_TAG
ARG TRITON_THIRD_PARTY_REPO_TAG
ARG TRITON_ENABLE_GPU
ARG JAVA_BINDINGS_MAVEN_VERSION
ARG JAVA_BINDINGS_JAVACPP_PRESETS_TAG
ARG TARGETPLATFORM

WORKDIR /workspace
COPY TRITON_VERSION .
COPY ${TRITON_CLIENT_REPO_SUBDIR} client

WORKDIR /workspace/build
RUN cmake -DCMAKE_INSTALL_PREFIX=/workspace/install \
      -DTRITON_VERSION=`cat /workspace/TRITON_VERSION` \
      -DTRITON_COMMON_REPO_TAG=${TRITON_COMMON_REPO_TAG} \
      -DTRITON_CORE_REPO_TAG=${TRITON_CORE_REPO_TAG} \
      -DTRITON_THIRD_PARTY_REPO_TAG=${TRITON_THIRD_PARTY_REPO_TAG} \
      -DTRITON_ENABLE_CC_HTTP=ON -DTRITON_ENABLE_CC_GRPC=ON \
      -DTRITON_ENABLE_PYTHON_HTTP=ON -DTRITON_ENABLE_PYTHON_GRPC=ON \
      -DTRITON_ENABLE_JAVA_HTTP=ON \
      -DTRITON_ENABLE_PERF_ANALYZER=ON \
      -DTRITON_ENABLE_PERF_ANALYZER_C_API=ON \
      -DTRITON_ENABLE_PERF_ANALYZER_TFS=ON \
      -DTRITON_ENABLE_PERF_ANALYZER_TS=ON \
      -DTRITON_ENABLE_EXAMPLES=ON -DTRITON_ENABLE_TESTS=ON \
      -DTRITON_ENABLE_GPU=${TRITON_ENABLE_GPU} /workspace/client
RUN make -j16 cc-clients python-clients java-clients && \
    rm -fr ~/.m2

# Install Java API Bindings
RUN if [ "$TARGETPLATFORM" = "linux/amd64" ]; then \
      source /workspace/client/src/java-api-bindings/scripts/install_dependencies_and_build.sh \
        --maven-version ${JAVA_BINDINGS_MAVEN_VERSION} \
        --core-tag ${TRITON_CORE_REPO_TAG} \
        --javacpp-tag ${JAVA_BINDINGS_JAVACPP_PRESETS_TAG} \
        --jar-install-path /workspace/install/java-api-bindings; \
    fi

############################################################################
# Create sdk container
############################################################################
FROM ${BASE_IMAGE}

# Ensure apt-get won't prompt for selecting options
ENV DEBIAN_FRONTEND=noninteractive

ARG DCGM_VERSION
ARG TRITON_CORE_REPO_TAG
ARG TARGETPLATFORM
ARG TRITON_ENABLE_GPU

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        software-properties-common \
        curl \
        git \
        gperf \
        libb64-dev \
        libgoogle-perftools-dev \
        libopencv-dev \
        libopencv-core-dev \
        libssl-dev \
        libtool \
        python3 \
        python3-pip \
        python3-dev \
        vim \
        wget \
        python3-pdfkit \
        maven \
        default-jdk && \
    pip3 install --upgrade wheel setuptools && \
    pip3 install --upgrade grpcio-tools && \
    pip3 install --upgrade pip

WORKDIR /workspace
COPY TRITON_VERSION .
COPY NVIDIA_Deep_Learning_Container_License.pdf .
COPY --from=sdk_build /workspace/client/ client/
COPY --from=sdk_build /workspace/install/ install/
RUN cd install && \
    export VERSION=`cat /workspace/TRITON_VERSION` && \
    tar zcf /workspace/v$VERSION.clients.tar.gz *

# For CI testing need to copy over L0_sdk test and L0_client_build_variants test.
RUN mkdir qa
COPY qa/L0_sdk qa/L0_sdk
COPY qa/L0_client_build_variants qa/L0_client_build_variants

# Create a directory for all the python client tests to enable unit testing
RUN mkdir -p qa/python_client_unit_tests/
COPY --from=sdk_build /workspace/client/src/python/library/tests/* qa/python_client_unit_tests/

# Install an image needed by the quickstart and other documentation.
COPY qa/images/mug.jpg images/mug.jpg

# Install the dependencies needed to run the client examples. These
# are not needed for building but including them allows this image to
# be used to run the client examples.
RUN pip3 install --upgrade numpy pillow attrdict && \
    find install/python/ -maxdepth 1 -type f -name \
    "tritonclient-*linux*.whl" | xargs printf -- '%s[all]' | \
    xargs pip3 install --upgrade

# Install DCGM
RUN if [ "$TRITON_ENABLE_GPU" = "ON" ]; then \

Here is the output I get when running "/opt/tritonserver/clients/bin/perf_analyzer -m simple" after running "sudo /opt/tritonserver/bin/tritonserver --model-repository=./model_repository --backend-directory=/opt/tritonserver/backends --backend-config=tensorflow,version=2":

test@test-desktop:~/Downloads$ /opt/tritonserver/clients/bin/perf_analyzer -m simple
*** Measurement Settings ***
Batch size: 1
Service Kind: Triton
Using "time_windows" mode for stabilization
Measurement window: 5000 msec
Using synchronous calls for inference
Stabilizing using average latency

Request concurrency: 1
Client:
Request count: 14871
Throughput: 825.774 infer/sec
Avg latency: 1208 usec (standard deviation 469 usec)
p50 latency: 1106 usec
p90 latency: 1439 usec
p95 latency: 1677 usec
p99 latency: 3128 usec
Avg HTTP time: 1194 usec (send/recv 181 usec + response wait 1013 usec)
Server:
Inference count: 14871
Execution count: 14871
Successful request count: 14871
Avg request latency: 542 usec (overhead 63 usec + queue 65 usec + compute input 32 usec + compute infer 362 usec + compute output 19 usec)

Inferences/Second vs. Client Average Batch Latency
Concurrency: 1, throughput: 825.774 infer/sec, latency 1208 usec

Hi,

Thanks for sharing.

We are going to share an example for testing Tritonserver on Jetson with server and client app.
Hope this can help you and others to install and test Tritonserver easily.

Since tritonserver:24.01-py3-min doesn’t support JetPack 5, we may try this with an L4T base image. Does this work for you?

The reason is that Jetson has some specific hardware, so the L4T (Linux4Tegra) OS is required.
24.01 does support the Jetson environment, but it uses JetPack 6, which is Ubuntu 22.04, while JetPack 5 is still Ubuntu 20.04.
Usually, a container used across branches will have issues when using CUDA or other onboard hardware.
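To check which L4T release (and hence which JetPack line) a device or base image is on, you can inspect /etc/nv_tegra_release. The sketch below parses a hypothetical sample of that file's first line; on a real Jetson you would read the file itself:

```shell
# Hypothetical first line of /etc/nv_tegra_release on a JetPack 5.x device;
# on a real Jetson use: head -n 1 /etc/nv_tegra_release
sample='# R35 (release), REVISION: 2.1, GCID: 12345, BOARD: t186ref'

# Extract the L4T version (r35.x corresponds to JetPack 5.x).
echo "$sample" | sed -n 's/^# R\([0-9]*\) (release), REVISION: \([0-9.]*\).*/L4T r\1.\2/p'
# → L4T r35.2.1
```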

Thanks.

Sure, that works for me.

Thanks.

Is there an outlook on when that example you’re sharing will be available?

Thank you

Hi,

We did test some available containers (e.g. dustynv/tritonserver:r35.4.1), but they failed due to missing libboost_filesystem library support.
(related topic: libboost_filesystem.so.1.80.0 on jetpack 5.1.2 · Issue #6844 · triton-inference-server/server · GitHub)

So we are now finding an alternative and checking internally.
Sorry to keep you waiting. Hope to share info with you soon.

Thanks

Hi,

Thanks a lot for your patience.

It turns out that the nvcr.io/nvidia/tritonserver container does work well on JetPack 5.
Please see below for the testing.

Server: tritonserver:24.02-py3-igpu

$ git clone -b r24.02 https://github.com/triton-inference-server/server.git
$ cd server/docs/examples/
$ ./fetch_models.sh 
$ sudo docker run -it --rm --runtime nvidia --network host -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:24.02-py3-igpu tritonserver --model-repository=/models

You should see the backend and model logs like below:

...
I0327 04:32:46.516401 1 server.cc:634] 
+-------------+-----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+
| Backend     | Path                                                            | Config                                                                                                                          |
+-------------+-----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+
| tensorflow  | /opt/tritonserver/backends/tensorflow/libtriton_tensorflow.so   | {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"5.300000", |
|             |                                                                 | "default-max-batch-size":"4"}}                                                                                                  |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"5.300000", |
|             |                                                                 | "default-max-batch-size":"4"}}                                                                                                  |
+-------------+-----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+

I0327 04:32:46.516911 1 server.cc:677] 
+----------------------+---------+--------+
| Model                | Version | Status |
+----------------------+---------+--------+
| densenet_onnx        | 1       | READY  |
| inception_graphdef   | 1       | READY  |
| simple               | 1       | READY  |
| simple_dyna_sequence | 1       | READY  |
| simple_identity      | 1       | READY  |
| simple_int8          | 1       | READY  |
| simple_sequence      | 1       | READY  |
| simple_string        | 1       | READY  |
+----------------------+---------+--------+
...

Client: tritonserver:24.02-py3-igpu-sdk

You should be able to see the detection output by sending a query like the one below.
We tested this on another Xavier NX, but it should be okay to run on the same device.

$ sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/tritonserver:24.02-py3-igpu-sdk
# /workspace/install/bin/image_client -u [IP]:8000 -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg
Request 0, batch size 1
Image '/workspace/images/mug.jpg':
    15.349564 (504) = COFFEE MUG
    13.227465 (968) = CUP
    10.424894 (505) = COFFEEPOT
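The three numbers in that output are raw class scores, not probabilities. As a side note (this computation is not part of image_client itself), applying a softmax shows how dominant the top class is:

```shell
# Softmax over the three raw scores from the image_client output above,
# computed with awk: exp(score - max) normalized by the sum.
awk 'BEGIN {
  n = split("15.349564 13.227465 10.424894", s, " ")
  for (i = 1; i <= n; i++) { e[i] = exp(s[i] - s[1]); total += e[i] }
  for (i = 1; i <= n; i++) printf "%s%.3f", (i > 1 ? " " : ""), e[i] / total
  print ""
}'
# → 0.887 0.106 0.006  (COFFEE MUG dominates)
```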

Thanks.

Thank you for the response! I gave this a try, and it worked on the SE70 with the same output that you shared. I will try a few more things, and I will let you know if I need any more assistance; otherwise you can go ahead and close the thread.

Thanks again!