Enabling GPUs in the Container Runtime Ecosystem

Originally published at: https://developer.nvidia.com/blog/gpu-containers-runtime/

NVIDIA uses containers to develop, test, benchmark, and deploy deep learning (DL) frameworks and HPC applications. We wrote about building and deploying GPU containers at scale using NVIDIA-Docker roughly two years ago. Since then, NVIDIA-Docker has been downloaded close to 2 million times. A variety of customers have used NVIDIA-Docker to containerize and run GPU-accelerated workloads. NVIDIA…

I wish you'd talk more about Singularity.

I really liked it. [translated from Portuguese]

This is nice! But the GPUs required, as documented elsewhere on the NVIDIA site, are:

NVIDIA TITAN V (Volta)
NVIDIA TITAN X (Pascal)
NVIDIA TITAN Xp (Pascal)
NVIDIA Quadro GV100 (Volta)
NVIDIA Quadro GP100 (Pascal)
NVIDIA Quadro P6000 (Pascal)

The website https://docs.nvidia.com/ngc... has no information about the RTX cards as yet. Your documentation is all over the place, not up to date, and seems to confuse issues for the sake of clearing inventory...

I want to run my cloud hybrid or on-premises with commodity(-ish) hardware, so my old Titan or new GTX 1080 Ti doesn't get a look in, irrespective of its capabilities. I could try to flash the BIOS so that my 1080 Ti looks like something acceptable, but reliable sources tell me that NVIDIA has precluded this with watchdogs that will brick my new card. Nice. So it looks as if I should try for second-hand Titan X (Pascal) or Titan Xp cards.

The trouble with this is that I expect the next round of iterations to jerk the rug out from under my feet again, forcing me to submit to the scrutiny and rental costs of the cloud, or go the extremely expensive route. Or go with Intel/AMD. It may be that I just freeze in time with a Titan X (Pascal) or two, after the RTX 2080 and 2080 Ti (and god knows what other jack-in-the-boxes are about to be foisted on me) depress the market for lesser cards.

This makes it very difficult for sole operators like me, who are being frozen out of developing their ideas inexpensively. And I know that if I do go cloud/hybrid with some of the juicier ones, it won't be too long before some bright-eyed Ivy Leaguer launches another billion-dollar company... Thanks, NVIDIA!

This works fine for me using a GTX 1080Ti?

Wow, really? You can spin up on-premises (aka on your home box) GPU cloud instances on a GTX 1080 Ti? Can you detail what you did, please? This is very good news and will save me a LOT of stuffing around. CHEERS :))

I'm not sure if this is exactly the same as GPU Cloud, but you can definitely spin up nvidia-docker containers on a home/on-premises machine with GTX 1080 Tis. If you're not familiar with Docker, a Docker container is, roughly speaking, like a small headless virtual machine with low overhead and a scripted build process, so it's repeatable.

I installed Ubuntu 18.04 - I tried Debian and Fedora but it's far less painful with Ubuntu.

You then need to install CUDA 10.0 with the driver that ships with it (410.48); if you try to download the driver by itself you get 396.xx, which doesn't support CUDA 10.0. You may also need to disable the open-source nouveau driver. I first disabled nouveau by following https://linuxconfig.org/how... (I'm not sure if this step is required or not).
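
For what it's worth, guides like the one linked generally boil down to blacklisting nouveau via modprobe and rebuilding the initramfs. On Ubuntu the blacklist file typically looks like this (a sketch of the standard approach, not necessarily the exact contents from that link):

# /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0

After creating it, run sudo update-initramfs -u and reboot so the blacklist takes effect.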

Then install CUDA 10.0. First the prerequisites:

$ sudo apt-get install build-essential dkms
$ sudo apt-get install freeglut3 freeglut3-dev libxi-dev libxmu-dev

Now download the .deb installer and follow the instructions here:

https://developer.nvidia.co...

$ sudo dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
$ sudo apt-key adv --fetch-keys https://developer.download....
$ sudo apt-get update
$ sudo apt-get install cuda
$ reboot

Then install docker-ce by following the instructions here:

https://docs.docker.com/ins...

Then the nvidia docker runtime following the instructions here:

https://github.com/nvidia/n...
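
If the runtime registered correctly, docker info should list nvidia under Runtimes. The nvidia-docker2 package does this by adding an entry to Docker's daemon config; to my understanding the result looks roughly like this (a sketch; the installer writes the real file):

# /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

Restart the daemon afterwards with sudo systemctl restart docker so Docker picks up the new runtime.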

Now you should be able to bring up an nvidia-docker container:

# Test nvidia-smi with CUDA 9.0
$ docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi
# Test nvidia-smi with CUDA 10.0
$ docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi

There are a ton of different base images you can use, e.g. https://hub.docker.com/r/nv...

I've been using nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04, as it matches the requirements for tensorflow-gpu, for example.
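
As a sketch of how that image can serve as a base (the pip package name is real, but the version pin and Python setup here are my assumptions; check the TensorFlow build compatibility table for your setup):

# Dockerfile: tensorflow-gpu on top of the CUDA 9.0 / cuDNN 7 devel image
FROM nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04

# Install Python and pip, then clean the apt cache to keep the image small
RUN apt-get update && apt-get install -y python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*

# tensorflow-gpu 1.12.x was built against CUDA 9.0 + cuDNN 7
RUN pip3 install tensorflow-gpu==1.12.0

Build with docker build -t tf-gpu . and check GPU visibility with docker run --runtime=nvidia --rm tf-gpu python3 -c "import tensorflow as tf; print(tf.test.is_gpu_available())".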

Hey, thanks for this. I'm about to embark on another build quest, so it will be interesting to compare your instructions with what I've already done. I'm still not holding my breath on the NVIDIA GPU Cloud images working, though...

What about Windows?
Windows support is actually much needed.
I need it to run DirectX 12.

I have a question: if I have a server with 4 Tesla M10s, can I only launch 4 containers?

Docker can run on both the Windows and macOS operating systems. This is enabled by the Docker architecture…