Orin AGX: running YOLOv5 detect.py fails with "RuntimeError: Couldn't load custom C++ ops..."

I cloned the yolov5 source code on an AGX Orin running Ubuntu (r35.1), installed PyTorch [PyTorch for Jetson] and torchvision, and ran “pip install -r requirements.txt” to install the yolov5 dependencies. When I then run “python detect.py”, I get the error message shown in the attached pictures.


Based on the document below, you will need to use TorchVision 0.12.0 with PyTorch 1.11.0.
Please downgrade TorchVision and try again.
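
To confirm whether an installed pair lines up before downgrading, a small check like the one below may help. It is only a sketch: the helper names and the table are made up for illustration, only the 1.11 / 0.12 pairing comes from this thread, and the other rows are the commonly documented pairings added as an assumption.

```python
import re

# Minor-version pairings of PyTorch -> TorchVision.
# Only (1, 11) -> (0, 12) is stated in this thread; the other rows are
# included for illustration and should be checked against the official table.
EXPECTED_TORCHVISION = {
    (1, 10): (0, 11),
    (1, 11): (0, 12),
    (1, 12): (0, 13),
    (1, 13): (0, 14),
}


def major_minor(version):
    """Extract (major, minor) from strings such as '0.13.0a0+da3794e'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_match(torch_version, torchvision_version):
    """Return True or False, or None when the torch release is not in the table."""
    expected = EXPECTED_TORCHVISION.get(major_minor(torch_version))
    if expected is None:
        return None
    return major_minor(torchvision_version) == expected
```

The two arguments can be taken from `torch.__version__` and `torchvision.__version__` on the device. A result of `False` suggests the mismatch described above; `None` only means the release is missing from this illustrative table.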


I got another error on AGX Orin r35.1. I’m running inside the default containers (GitHub - dusty-nv/jetson-containers: Machine Learning Containers for NVIDIA Jetson and JetPack-L4T), using r35.1.0-pth1.11-py3. I also tried r35.1.0-pth1.12-py3, without success.

What’s wrong? Any tips?

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients
Adding AutoShape... 
Traceback (most recent call last):
  File "test_yolo.py", line 10, in <module>
    results = model(img)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1129, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/root/.cache/torch/hub/ultralytics_yolov5_master/models/common.py", line 642, in forward
    y = non_max_suppression(y if self.dmb else y[0],
  File "/root/.cache/torch/hub/ultralytics_yolov5_master/utils/general.py", line 885, in non_max_suppression
    i = torchvision.ops.nms(boxes, scores, iou_thres)  # NMS
  File "/usr/local/lib/python3.8/dist-packages/torchvision-0.13.0a0+da3794e-py3.8-linux-aarch64.egg/torchvision/ops/boxes.py", line 41, in nms
    return torch.ops.torchvision.nms(boxes, scores, iou_threshold)
  File "/usr/local/lib/python3.8/dist-packages/torch/_ops.py", line 142, in __call__
    return self._op(*args, **kwargs or {})
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
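
The "no kernel image is available" message generally means the binary was not compiled for the GPU's compute capability (8.7 on AGX Orin). The sketch below shows one way to compare a device capability against a list of compiled architectures. The helper names are hypothetical, and the exact-match rule is a deliberately conservative simplification that ignores PTX entries such as `compute_87`.

```python
import re


def arch_to_capability(arch):
    """Convert an arch tag such as 'sm_87' to (8, 7); return None for other tags."""
    match = re.fullmatch(r"sm_(\d+)(\d)", arch)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def has_kernel_for(capability, arch_list):
    """True if some sm_XY entry in arch_list exactly matches the device capability."""
    return any(arch_to_capability(arch) == capability for arch in arch_list)
```

On the device, the inputs can be obtained from `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`. Note that this describes the PyTorch build only; the traceback above fails inside `torchvision.ops.nms`, and a separately built torchvision may target a different set of architectures.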

Hi @tnferreira, please see my reply to your other post here:

I solved my problem with the solution described here: Can I execute yolov5 on the GPU of JETSON AGX XAVIER? - #5 by k-hamada