Slow YOLOv8 inference speed in Jetson Orin Nano (had trouble with pytorch and torchvision installation)

Hi,

The tutorial is for JetPack 4, which uses CUDA 10.2.
But the Orin Nano uses JetPack 5 and CUDA 11.4.

Would you mind trying the container below (l4t-pytorch:r35.2.1-pth2.0-py3) to see if it works on the GPU?
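For reference, a typical way to launch that container and check GPU access looks like the following (the nvcr.io/nvidia/ registry path is assumed from NVIDIA's NGC catalog; adjust if you pull the image from elsewhere):

```shell
# Pull and start the L4T PyTorch container with GPU access on JetPack 5.
# --runtime nvidia exposes the Jetson's GPU to the container.
sudo docker run -it --rm --runtime nvidia --network host \
    nvcr.io/nvidia/l4t-pytorch:r35.2.1-pth2.0-py3

# Inside the container, confirm PyTorch can see the GPU:
python3 -c "import torch; print(torch.cuda.is_available())"
```

If `torch.cuda.is_available()` prints `True`, YOLOv8 should be able to run on the GPU inside that container.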

Thanks.