ONNX model fails to load on Jetson Orin

Description

Running the Triton quickstart:
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/getting_started/quickstart.html

on a 16GB Jetson Orin with the following docker command:
docker run --gpus all --runtime=nvidia --rm -p8000:8000 -p8001:8001 -p8002:8002 -v/home/nvidia/server/docs/examples/model_repository:/models nvcr.io/nvidia/tritonserver:23.09-py3 tritonserver --model-repository=/models

Receive:
| densenet_onnx | 1 | UNAVAILABLE: Internal: onnx runtime error 6: Exception during initialization: /workspace/onnxruntime/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /workspace/onnxruntime/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=a43d498cbac4 ; file=/workspace/onnxruntime/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); |
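For context, CUBLAS failure 3 (CUBLAS_STATUS_ALLOC_FAILED) means cuBLAS could not allocate device memory while creating its handle, i.e. the model ran out of GPU memory at load time. Since the Orin's GPU shares the 16GB of system RAM, one way to check whether memory is actually exhausted is to watch tegrastats while the container starts (a minimal sketch; tegrastats ships with JetPack, and the interval is in milliseconds):

sudo tegrastats --interval 1000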

Environment

TensorRT Version: 8.6.2.3
GPU Type: Jetson Orin 16GB (integrated GPU)
Nvidia Driver Version:
CUDA Version: 12.2.140
CUDNN Version: 8.9.4.25
Operating System + Version: JetPack 6 (L4T 36.3.0)
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): Container (nvcr.io/nvidia/tritonserver:23.09-py3)

Relevant Files

The failing model is densenet_onnx from the quickstart's example model repository (populated by docs/examples/fetch_models.sh in the triton-inference-server/server repo).

Steps To Reproduce

  • Create the example model repository by following the quickstart (a command sketch follows this list)
  • Start Triton on the 16GB Jetson Orin with the docker command from the description above
  • densenet_onnx fails to load with the CUBLAS_STATUS_ALLOC_FAILED error shown above
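
A minimal repro sketch, assuming the server repo is cloned to /home/nvidia/server as implied by the volume mount in the description (the r23.09 branch is an assumption to match the container tag; fetch_models.sh is the script the quickstart uses to populate the example repository):

git clone -b r23.09 https://github.com/triton-inference-server/server.git /home/nvidia/server
cd /home/nvidia/server/docs/examples
./fetch_models.sh
docker run --gpus all --runtime=nvidia --rm -p8000:8000 -p8001:8001 -p8002:8002 -v/home/nvidia/server/docs/examples/model_repository:/models nvcr.io/nvidia/tritonserver:23.09-py3 tritonserver --model-repository=/models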

Hi @user62193,
Please reach out to the Jetson or Triton forums for better assistance with this topic.

Thanks

Thanks. I actually just managed to solve this by using an iGPU image!

docker run --gpus=1 --runtime=nvidia --rm -p8000:8000 -p8001:8001 -p8002:8002 -v/home/nvidia/server/docs/examples/model_repository:/models nvcr.io/nvidia/tritonserver:24.05-py3-igpu tritonserver --model-repository=/models --backend-config=tensorrt,--version-compatible=true
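
In case it helps anyone else hitting this: once the server is up, you can confirm it is healthy and that densenet_onnx actually loaded via Triton's standard HTTP endpoints (port 8000 as mapped above):

curl -v localhost:8000/v2/health/ready
curl localhost:8000/v2/models/densenet_onnx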