[Yocto OpenEmbedded OS] NVIDIA Container Toolkit for Podman container runtime

Please provide the following info (tick the boxes after creating this topic):
Software Version
DRIVE OS 6.0.10.0
DRIVE OS 6.0.8.1
DRIVE OS 6.0.6
DRIVE OS 6.0.5
DRIVE OS 6.0.4 (rev. 1)
DRIVE OS 6.0.4 SDK
other

Target Operating System
Linux
QNX
other Yocto OpenEmbedded

Hardware Platform
DRIVE AGX Orin Developer Kit (940-63710-0010-300)
DRIVE AGX Orin Developer Kit (940-63710-0010-200)
DRIVE AGX Orin Developer Kit (940-63710-0010-100)
DRIVE AGX Orin Developer Kit (940-63710-0010-D00)
DRIVE AGX Orin Developer Kit (940-63710-0010-C00)
DRIVE AGX Orin Developer Kit (not sure its number)
other p3710-10-s05

SDK Manager Version
2.1.0
other don’t know

Host Machine Version
native Ubuntu Linux 20.04 Host installed with SDK Manager
[ X ] native Ubuntu Linux 20.04 Host installed with DRIVE OS Docker Containers
native Ubuntu Linux 18.04 Host installed with DRIVE OS Docker Containers
other

Issue Description
Hi NVIDIA friends,

We’re from Amazon, and we’re working on building an edge computing system using the ORIN Drive box to run ML-based workloads within Podman containers.

We successfully completed a POC (reference) with an Ubuntu-based ORIN Drive Box, where we managed to run sample CUDA applications inside a --privileged Podman container. We verified that these CUDA samples in isolated containers could access the SOC GPU on the host by using the NVIDIA Container Toolkit to mount the NVIDIA drivers from the host into the Podman container.

However, we’ve encountered a challenge: our production ORIN Drive Box will run a custom Yocto OpenEmbedded OS rather than Ubuntu.

According to this NVIDIA Container Toolkit documentation, version 1.12 or later of the toolkit is required to patch the container runtime for creating GPU-accelerated Podman containers. Unfortunately, the available documentation starts at version 1.14, and the latest NVIDIA Container Toolkit version available in the OpenEmbedded meta-tegra layer is v1.11.
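As a rough illustration of the version gate described above, here is a minimal Python sketch that compares toolkit version strings numerically rather than as text (a plain string comparison would wrongly rank "1.9" above "1.12"). The 1.12 minimum is taken from this post, not verified independently, and the names `parse_version`, `meets_minimum`, and `MIN_PODMAN_TOOLKIT` are hypothetical helpers, not part of any NVIDIA or Podman tooling.

```python
import re

# Minimum toolkit version for Podman as quoted in this post (an assumption here).
MIN_PODMAN_TOOLKIT = (1, 12)


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version_text, minimum=MIN_PODMAN_TOOLKIT):
    """True when the parsed version is at least the given minimum."""
    return parse_version(version_text) >= minimum


# Versions mentioned in this thread: meta-tegra's v1.11 and the 1.12 / 1.14 lines.
for candidate in ("v1.11", "1.12.0", "1.14.3"):
    print(candidate, meets_minimum(candidate))
```

A check like this could run in a build or provisioning script to flag a layer that ships a toolkit older than the required minimum.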

This is problematic for us, as we need GPU-accelerated Podman containers to productionize our Yocto OS-based edge system. Is there a way to enable GPU acceleration in Podman containers given these constraints?

Any insights or guidance would be greatly appreciated.

Thank you!

Dear @xinronch,
As clarified in the previous topic, the Docker feature is provided only for experimentation, not for production.
Given the version limitation, I doubt it can be supported.
I will check internally and get back to you.

So, based on my previous POC and our prior correspondence in the other thread, we confirmed that running a GPU-accelerated Podman container is an experimental feature, achievable only by installing the NVIDIA Container Toolkit and running the container in --privileged mode.

While this approach does not align with security best practices, we are willing to proceed with it in the short term.

However, if it turns out that running a GPU-accelerated Podman container is not feasible on a custom-built Yocto OpenEmbedded operating system (the newest available nvidia-container-toolkit version in the OpenEmbedded meta-tegra layer is 1.11, while Podman requires version 1.12 or later), this would be a significant blocker for us.

Could you please check with your internal team regarding this?

Thank you very much for your help.

Hi NVIDIA friends,
any update on this?

Dear @xinronch,
It is out of POR scope and cannot be supported.