Running a X Server with Nvidia GPU in Docker

nvidia-chris-159 · October 1, 2021, 10:48am

Hello, I need some help and hope this is the right place for my question.

Context
We need to run some performance-critical tests within our CI pipeline. Therefore, we use a dedicated GitLab CI runner with an attached Nvidia GPU (Tesla T4). The CI job runs in Kubernetes on Azure. The setup with the NVIDIA Container Toolkit is already done.

Current situation
We use a Cypress Docker image as the base and set the environment variables required by the container runtime:

FROM cypress/included:8.3.1

# Expose all GPUs and request the driver capabilities needed for rendering
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility,display,graphics

The nvidia-smi command shows the following output:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000001:00:00.0 Off |                  Off |
| N/A   25C    P8     8W /  70W |      0MiB / 16127MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Problem
Cypress needs an X server when running on Linux. By default it uses xvfb, but xvfb does not use the GPU. That's why I am trying to set up Xorg, which fails with the following error:

Fatal server error: (EE) no screens found(EE)

I created a configuration at /etc/X11/xorg.conf and tried some options, but had no success so far:

Option "AllowEmptyInitialConfiguration"
Option "IgnoreEDID"
Option "UseDisplayDevice" "none"
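
For reference, here is a minimal sketch of where such options usually live in an xorg.conf, written as a small shell helper so it can be dropped into an image build. The function name write_headless_xorg_conf is hypothetical, and putting the options in the "Screen" section is an assumption, not something confirmed in this thread. It also presumes the nvidia Xorg driver module can actually be loaded inside the container. IgnoreEDID is omitted here; whether the 470 driver still honors it is not confirmed in this thread.

```shell
# Sketch only: write a minimal headless xorg.conf to the path given as $1.
# Assumption: the options are placed in the "Screen" section; adjust if your
# driver version expects them elsewhere.
write_headless_xorg_conf() {
  cat > "$1" <<'EOF'
Section "Device"
    Identifier "Card0"
    Driver     "nvidia"
EndSection

Section "Screen"
    Identifier "Screen0"
    Device     "Card0"
    Option     "AllowEmptyInitialConfiguration" "true"
    Option     "UseDisplayDevice" "none"
    SubSection "Display"
        Depth  24
    EndSubSection
EndSection
EOF
}
```

A call such as write_headless_xorg_conf /etc/X11/xorg.conf would overwrite the existing file, so it may be safer to write to a separate path first and inspect the result.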

nvidia-xconfig seems to be deprecated and can no longer be installed. It states: “This tool is deprecated. The NVIDIA drivers now automatically integrate with the Xorg Xserver configuration. Creating an xorg.conf is no longer needed for normal setups.” But without this config, Xorg shows the same error message as above.

This is the xorg.conf generated by Xorg :0 -configure:

Section "ServerLayout"
	Identifier     "X.org Configured"
	Screen      0  "Screen0" 0 0
	InputDevice    "Mouse0" "CorePointer"
	InputDevice    "Keyboard0" "CoreKeyboard"
EndSection

Section "Files"
	ModulePath   "/usr/lib/xorg/modules"
	FontPath     "/usr/share/fonts/X11/misc"
	FontPath     "/usr/share/fonts/X11/cyrillic"
	FontPath     "/usr/share/fonts/X11/100dpi/:unscaled"
	FontPath     "/usr/share/fonts/X11/75dpi/:unscaled"
	FontPath     "/usr/share/fonts/X11/Type1"
	FontPath     "/usr/share/fonts/X11/100dpi"
	FontPath     "/usr/share/fonts/X11/75dpi"
	FontPath     "built-ins"
EndSection

Section "Module"
	Load  "glx"
EndSection

Section "InputDevice"
	Identifier  "Keyboard0"
	Driver      "kbd"
EndSection

Section "InputDevice"
	Identifier  "Mouse0"
	Driver      "mouse"
	Option	    "Protocol" "auto"
	Option	    "Device" "/dev/input/mice"
	Option	    "ZAxisMapping" "4 5 6 7"
EndSection

Section "Monitor"
	Identifier   "Monitor0"
	VendorName   "Monitor Vendor"
	ModelName    "Monitor Model"
EndSection

Section "Device"
        ### Available Driver options are:-
        ### Values: <i>: integer, <f>: float, <bool>: "True"/"False",
        ### <string>: "String", <freq>: "<f> Hz/kHz/MHz",
        ### <percent>: "<f>%"
        ### [arg]: arg optional
        #Option     "SWcursor"           	# [<bool>]
        #Option     "kmsdev"             	# <str>
        #Option     "ShadowFB"           	# [<bool>]
        #Option     "AccelMethod"        	# <str>
        #Option     "PageFlip"           	# [<bool>]
        #Option     "ZaphodHeads"        	# <str>
        #Option     "DoubleShadow"       	# [<bool>]
	Identifier  "Card0"
	Driver      "modesetting"
	BusID       "PCI:0:0:0"
EndSection

Section "Screen"
	Identifier "Screen0"
	Device     "Card0"
	Monitor    "Monitor0"
	SubSection "Display"
		Viewport   0 0
		Depth     1
	EndSubSection
	SubSection "Display"
		Viewport   0 0
		Depth     4
	EndSubSection
	SubSection "Display"
		Viewport   0 0
		Depth     8
	EndSubSection
	SubSection "Display"
		Viewport   0 0
		Depth     15
	EndSubSection
	SubSection "Display"
		Viewport   0 0
		Depth     16
	EndSubSection
	SubSection "Display"
		Viewport   0 0
		Depth     24
	EndSubSection
EndSection
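
One detail worth checking in the generated file: the Device section uses BusID "PCI:0:0:0", while nvidia-smi reports the Tesla T4 at 00000001:00:00.0, which is PCI domain 1. If the Device section is meant to point at the T4, the BusID probably has to match that address. The sketch below converts the nvidia-smi notation (hexadecimal domain:bus:device.function) into the PCI:bus@domain:device:function form with decimal values described in xorg.conf(5). The function name smi_to_xorg_busid is hypothetical, and whether this mismatch contributes to the "no screens found" error is an assumption.

```shell
# Sketch only: convert an nvidia-smi Bus-Id (hex domain:bus:device.function)
# into an xorg.conf BusID string (decimal PCI:bus@domain:device:function).
smi_to_xorg_busid() (
  # Split the input on ':' and '.' into its four hex fields.
  IFS=':.' read -r domain bus dev fn <<EOF
$1
EOF
  # Shell arithmetic converts each hex field to decimal.
  printf 'PCI:%d@%d:%d:%d\n' "$((0x$bus))" "$((0x$domain))" "$((0x$dev))" "$((0x$fn))"
)
```

For the address shown in the nvidia-smi output, smi_to_xorg_busid 00000001:00:00.0 prints PCI:0@1:0:0.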

I am not sure why the driver is set to modesetting, but changing it to nvidia results in the error: Failed to load module "nvidia" (module does not exist, 0).
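
The "module does not exist" message generally means Xorg could not find nvidia_drv.so under its ModulePath. A quick way to confirm this is to search for the file inside the container. The sketch below assumes the ModulePath from the generated config (/usr/lib/xorg/modules) as the default; the function name find_nvidia_xorg_driver is hypothetical. One possible cause, assumed here rather than confirmed, is that the image ships no NVIDIA Xorg driver and the container runtime did not inject one.

```shell
# Sketch only: look for the NVIDIA Xorg driver module under a module path.
# Prints the path(s) found and returns non-zero if nothing is found.
find_nvidia_xorg_driver() (
  module_path="${1:-/usr/lib/xorg/modules}"
  # grep . passes the matches through and fails when find printed nothing.
  find "$module_path" -name 'nvidia_drv.so' 2>/dev/null | grep .
)
```

Running find_nvidia_xorg_driver without arguments checks the default path. A non-zero exit status means the X driver is missing from that path, which would explain the load failure regardless of what xorg.conf contains.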

Question
Do you have any hints on how I can get Xorg (or any other X server) running within a container using the Nvidia Tesla T4 GPU, but without a display attached?

ehfd · November 2, 2023, 2:06pm

I think it’s 2 years late, but I made it work when you asked this question:
