Hello, I am running NVIDIA Docker on a Jetson Orin NX.
I think I pulled this image: docker pull nvidia/cuda-arm64:11.4.0-base-ubuntu20.04
My host machine runs JetPack 5.1.1 with the following components:
cuda: 11.4.r11.4/compiler.31964100_0
cuDNN: 8.6.0.166
TensorRT: 8.5.2.2
VPI: 2.2.7
Vulkan: 1.3.204
OpenCV: 4.2.0 with CUDA: NO
The environment inside the Docker container is the following:
CUDA: 11.4.r11.4/compiler.30521435_0
pytorch: 2.0.0a0+fe05266f.nv23.04
torchvision: 0.15.1a0
In the Docker container, I installed ROS 2 Foxy and ultralytics. I am trying to run real-time pose detection with my USB camera using GPU acceleration.
I already checked that my code works perfectly on the CPU (i.e., when running the Docker image without the --runtime nvidia option). But since the FPS is too low on the CPU, I am trying to use the Orin's GPU.
I checked that torch.cuda.is_available() returns True, so the CUDA version and my PyTorch version seem to be compatible.
However, my code gets stuck when executing model.predict(frame)[0], where model = YOLO("yolov8n-pose.pt").
There is no error message at first; after about 5 minutes the code returns RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm.
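Since the error mentions cublasSgemm, I assume a plain float32 matrix multiply on the GPU would exercise the same cuBLAS path outside of ultralytics. This is just a minimal check I have in mind to isolate the problem, not code from my application:

import torch

# Small float32 GEMM on the GPU; matrix multiplies like this go through cublasSgemm
a = torch.randn(1024, 1024, device="cuda")
b = torch.randn(1024, 1024, device="cuda")
c = a @ b
torch.cuda.synchronize()  # wait for the kernel so any CUDA/cuBLAS error surfaces here
print(c.sum().item())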
I followed this link for my code: Run Yolo8 in GPU · Issue #3084 · ultralytics/ultralytics · GitHub
using these lines:
device: str = "cuda" if torch.cuda.is_available() else "cpu"
model = YOLO("yolov8n-pose.pt")
model.to(device)
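The per-frame part is roughly the following simplified sketch (the ROS 2 publishing is omitted, and cv2.VideoCapture(0) is just a stand-in for how I grab USB camera frames):

import cv2
import torch
from ultralytics import YOLO

device: str = "cuda" if torch.cuda.is_available() else "cpu"
model = YOLO("yolov8n-pose.pt")
model.to(device)

cap = cv2.VideoCapture(0)  # stand-in for my USB camera source
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    result = model.predict(frame)[0]  # this is the call that hangs on the GPU
    # ... draw/publish result.keypoints here ...
cap.release()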
I cannot find any solution to this issue… Can anyone help?