Hardware - GPU (A6000)
Hardware - CPU
Operating System - custom Yocto build, architecture: aarch64-poky-linux
I am unable to access CUDA from PyTorch. When I check torch.cuda.is_available(), I always get "False", even though I have CUDA installed.
I am also unable to install libtorch on this machine; I get an error saying the file format is wrong.
Hi there, for aarch64 + discrete GPU there is no prebuilt PyTorch wheel available. You could potentially try the NGC PyTorch container, although it has not been fully tested on Poky; the Frameworks Support Matrix - NVIDIA Docs shows it has been tested on Ubuntu.
If that doesn't work, then the way to go is to build PyTorch from source.
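On the "wrong file format" libtorch error: that message typically means the linker was handed a library built for a different architecture, e.g. an x86_64 libtorch download on an aarch64 board. As a quick sanity check you can read a shared library's ELF header to see which architecture it was built for. A minimal sketch (offsets and machine codes are from the ELF specification; the path you pass is up to you):

```python
import struct

# e_machine codes from the ELF specification (only the two we care about here).
ELF_MACHINES = {0x3E: "x86_64", 0xB7: "aarch64"}

def elf_machine(path):
    """Return the target architecture recorded in an ELF file's header."""
    with open(path, "rb") as f:
        header = f.read(20)
    if header[:4] != b"\x7fELF":
        return "not an ELF file"
    # EI_DATA (byte 5) gives endianness: 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_machine is a 2-byte field at offset 18 of the ELF header.
    machine = struct.unpack_from(endian + "H", header, 18)[0]
    return ELF_MACHINES.get(machine, hex(machine))
```

If this reports x86_64 for the libtorch.so on your aarch64 board, you have the wrong build of libtorch, and the "file format is wrong" error is expected.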
The 0.6.0 release includes PyTorch support via libtorch, but it does not include the Python components. In your use case, are you using the Torch backend for InferenceOp, or do you have native Holoscan Python operators that use PyTorch functionality?
I am using the Torch backend for inference. I am running NVIDIA NeMo models, but those models cannot use the GPU (CUDA isn't available), so they run on the CPU, which is very slow.
No, the CPU inference is not a limitation of the model; it is a consequence of running on aarch64-poky-linux with a dGPU. I am able to run the same model on the GPU of an x86_64 machine.
I have not yet added the PyTorch recipe to my Yocto image. I am still using 0.5.1 without the PyTorch recipe. I should have framed it better, my bad!
The PyTorch Python support is not tied to the SDK release the way the libtorch support for InferenceOp is tied to v0.6.0. Our ssd_detection_endoscopy_tools app on HoloHub contains examples of using PyTorch functionality in native Python Holoscan operators.
The difficulty here is likely platform PyTorch support itself, not anything related to the Holoscan SDK: if you can use PyTorch in regular Python on your platform, you can use PyTorch in native Python Holoscan operators.
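To narrow down where plain-Python PyTorch is failing, a small diagnostic like the sketch below can help (the function name is mine, not part of any SDK). It distinguishes three different situations that all end in torch.cuda.is_available() returning False: PyTorch not installed, a CPU-only PyTorch build, and a CUDA build that cannot see a GPU at runtime.

```python
def cuda_diagnostic():
    """Report why torch.cuda.is_available() might be returning False."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed at all in this Python environment.
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # torch.version.cuda is None when the wheel was built without CUDA,
        # which is the usual case for generic aarch64 wheels.
        "built_with_cuda": torch.version.cuda is not None,
        # True only when a driver-visible GPU is present at runtime.
        "cuda_available": torch.cuda.is_available(),
    }

print(cuda_diagnostic())
```

If built_with_cuda is False, no amount of driver setup will help; you need a CUDA-enabled build (container or source build). If it is True but cuda_available is False, the problem is on the driver/runtime side.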
Since there are no prebuilt PyTorch wheels or packages for aarch64 + dGPU, NVIDIA has released an NGC PyTorch container for that platform. The Frameworks Support Matrix - NVIDIA Docs states the "Container OS" is Ubuntu; it should work on Poky, but it hasn't been fully tested there. As a side note, if you were to use the devkit with L4T Ubuntu as a development machine instead of with the deployment stack, you could definitely use this container for PyTorch support in Python.
Edited for my answers above: the NGC PyTorch container has been tested on Ubuntu but will likely work on other OSes as well; it just hasn't been fully tested. Please give it a try!