Running an X Server with Nvidia GPU in Docker

Hello, I need some help and hope this is the right place for my question.

Context
We need to run some performance-critical tests within our CI pipeline. For this we use a dedicated GitLab CI Runner with an attached Nvidia GPU (Tesla T4). The CI job runs in Kubernetes on Azure. The setup with the NVIDIA Container Toolkit is already done.

Current situation
We use a Cypress Docker image as the base and set the required environment variables for the container runtime:

FROM cypress/included:8.3.1

ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility,display,graphics

The nvidia-smi command shows the following output:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000001:00:00.0 Off |                  Off |
| N/A   25C    P8     8W /  70W |      0MiB / 16127MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Problem
Cypress needs an X Server when running on Linux. By default it uses Xvfb, but Xvfb does not use the GPU. That's why I am trying to set up Xorg, which fails with the following error:

Fatal server error: (EE) no screens found(EE)

I created a configuration at /etc/X11/xorg.conf and tried some options, but had no success so far:

  • Option "AllowEmptyInitialConfiguration"
  • Option "IgnoreEDID"
  • Option "UseDisplayDevice" "none"
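
For reference, the options above would typically be placed roughly as in the sketch below. The identifiers are taken from the generated configuration further down. The placement in the Screen section and the "True" values are assumptions for illustration, not the exact file from the failing setup. Note that these options are specific to the NVIDIA X driver, so they only take effect when the Device section uses Driver "nvidia".

```conf
# Sketch only: assumed placement of the options listed above.
Section "Device"
	Identifier  "Card0"
	Driver      "nvidia"    # the options below are NVIDIA X driver options
EndSection

Section "Screen"
	Identifier  "Screen0"
	Device      "Card0"
	# Allow the X server to start even if no display device is detected
	Option      "AllowEmptyInitialConfiguration" "True"
	Option      "IgnoreEDID" "True"
	# Do not require any display device for this X screen
	Option      "UseDisplayDevice" "none"
EndSection
```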

The nvidia-xconfig tool seems to be deprecated and can no longer be installed. It states: “This tool is deprecated. The NVIDIA drivers now automatically integrate with the Xorg Xserver configuration. Creating an xorg.conf is no longer needed for normal setups.” But without this config, Xorg shows the above error message as well.

This is the xorg.conf generated by Xorg :0 -configure:

Section "ServerLayout"
	Identifier     "X.org Configured"
	Screen      0  "Screen0" 0 0
	InputDevice    "Mouse0" "CorePointer"
	InputDevice    "Keyboard0" "CoreKeyboard"
EndSection

Section "Files"
	ModulePath   "/usr/lib/xorg/modules"
	FontPath     "/usr/share/fonts/X11/misc"
	FontPath     "/usr/share/fonts/X11/cyrillic"
	FontPath     "/usr/share/fonts/X11/100dpi/:unscaled"
	FontPath     "/usr/share/fonts/X11/75dpi/:unscaled"
	FontPath     "/usr/share/fonts/X11/Type1"
	FontPath     "/usr/share/fonts/X11/100dpi"
	FontPath     "/usr/share/fonts/X11/75dpi"
	FontPath     "built-ins"
EndSection

Section "Module"
	Load  "glx"
EndSection

Section "InputDevice"
	Identifier  "Keyboard0"
	Driver      "kbd"
EndSection

Section "InputDevice"
	Identifier  "Mouse0"
	Driver      "mouse"
	Option	    "Protocol" "auto"
	Option	    "Device" "/dev/input/mice"
	Option	    "ZAxisMapping" "4 5 6 7"
EndSection

Section "Monitor"
	Identifier   "Monitor0"
	VendorName   "Monitor Vendor"
	ModelName    "Monitor Model"
EndSection

Section "Device"
        ### Available Driver options are:-
        ### Values: <i>: integer, <f>: float, <bool>: "True"/"False",
        ### <string>: "String", <freq>: "<f> Hz/kHz/MHz",
        ### <percent>: "<f>%"
        ### [arg]: arg optional
        #Option     "SWcursor"           	# [<bool>]
        #Option     "kmsdev"             	# <str>
        #Option     "ShadowFB"           	# [<bool>]
        #Option     "AccelMethod"        	# <str>
        #Option     "PageFlip"           	# [<bool>]
        #Option     "ZaphodHeads"        	# <str>
        #Option     "DoubleShadow"       	# [<bool>]
	Identifier  "Card0"
	Driver      "modesetting"
	BusID       "PCI:0:0:0"
EndSection

Section "Screen"
	Identifier "Screen0"
	Device     "Card0"
	Monitor    "Monitor0"
	SubSection "Display"
		Viewport   0 0
		Depth     1
	EndSubSection
	SubSection "Display"
		Viewport   0 0
		Depth     4
	EndSubSection
	SubSection "Display"
		Viewport   0 0
		Depth     8
	EndSubSection
	SubSection "Display"
		Viewport   0 0
		Depth     15
	EndSubSection
	SubSection "Display"
		Viewport   0 0
		Depth     16
	EndSubSection
	SubSection "Display"
		Viewport   0 0
		Depth     24
	EndSubSection
EndSection

I am not sure why the driver is set to modesetting, but changing it to nvidia results in Failed to load module "nvidia" (module does not exist, 0).

Question
Do you have any hints on how I can get Xorg (or any other X Server) running within a container while using the Nvidia Tesla T4 GPU, but without a display?

I think it’s 2 years late, but I made it work when you asked this question: