Ultralytics on NVIDIA Docker - Jetson Orin NX

Hello, I am running NVIDIA Docker on a Jetson Orin NX.
I think I pulled this image: docker pull nvidia/cuda-arm64:11.4.0-base-ubuntu20.04

My host machine runs JetPack 5.1.1, which provides the following:
CUDA: 11.4.r11.4/compiler.31964100_0
VPI: 2.2.7
Vulkan: 1.3.204
OpenCV: 4.2.0 with CUDA: NO

The environment inside the Docker container is as follows:
CUDA: 11.4.r11.4/compiler.30521435_0
PyTorch: 2.0.0a0+fe05266f.nv23.04
torchvision: 0.15.1a0

In the Docker container, I installed ROS 2 Foxy and ultralytics. I am trying to run real-time pose detection on my USB camera feed with GPU acceleration.

I already checked that my code works perfectly on the CPU (that is, when I start the container without the --runtime nvidia option). But since the FPS is too low on the CPU, I am trying to use the Jetson Orin GPU.

I checked that torch.cuda.is_available() returns True, so it seems that the CUDA version and my PyTorch version are compatible.

However, my code gets stuck when executing model.predict(frame)[0], where model = YOLO("yolov8n-pose.pt").
No error message appears at first; after about 5 minutes the code raises RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm

I followed this link for my code: Run Yolo8 in GPU · Issue #3084 · ultralytics/ultralytics · GitHub
I am using these lines:
device: str = 'cuda' if torch.cuda.is_available() else 'cpu'
model = YOLO("yolov8n-pose.pt")
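Note that the two lines above compute device but never hand it to the model. A minimal sketch of how the selection could be applied, assuming the to() method that Ultralytics models expose for moving weights to a device; pick_device and load_pose_model are hypothetical helper names, not part of any library:

```python
def pick_device(cuda_available: bool) -> str:
    """Return the torch device string to use for inference."""
    return "cuda" if cuda_available else "cpu"


def load_pose_model(weights: str = "yolov8n-pose.pt"):
    """Load the pose model once and move it to the selected device."""
    # Imported here so the helper above can be used without the GPU stack.
    import torch
    from ultralytics import YOLO

    device = pick_device(torch.cuda.is_available())
    model = YOLO(weights)
    model.to(device)  # assumption: moves the underlying weights to `device`
    return model, device
```

The model would then be loaded once before the camera loop and reused for every frame, for example model, device = load_pose_model() followed by model.predict(frame)[0] inside the loop.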

I cannot find any solution to resolve my issue… Can anyone help?


Please use a container with the l4t tag on Jetson.
The arm64 CUDA container is usually intended for SBSA servers.

You can try it with the l4t-pytorch:r35.2.1-pth2.0-py3 base image.
The container has PyTorch preinstalled.
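To check the GPU stack inside the container independently of Ultralytics, a single matrix multiplication can be run first, since that is the kind of operation the failing cublasSgemm call performs. The following is only a sketch: cublas_smoke_test is a hypothetical helper name, and it assumes a PyTorch build with CUDA support, such as the one in the l4t-pytorch image.

```python
def cublas_smoke_test(size: int = 256) -> str:
    """Run one float32 matmul on the GPU and report what happened."""
    try:
        import torch
    except ImportError:
        return "torch not installed"

    if not torch.cuda.is_available():
        return "cuda not available"

    try:
        a = torch.randn(size, size, device="cuda")
        b = torch.randn(size, size, device="cuda")
        _ = a @ b                 # float32 matmul, typically handled by cuBLAS
        torch.cuda.synchronize()  # wait for the GPU so any error surfaces here
        return "ok"
    except RuntimeError as exc:
        return f"failed: {exc}"


if __name__ == "__main__":
    print(cublas_smoke_test())
```

If this already fails or hangs, the problem lies in the container's CUDA libraries rather than in the YOLO code.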

