Running an X Server with an Nvidia GPU in Docker

Hello, I need some help and hope this is the right place for my question.

We need to run some performance-critical tests within our CI pipeline. For this we use a dedicated GitLab CI Runner with an attached Nvidia GPU (Tesla T4). The CI job runs in Kubernetes on Azure. The NVIDIA Container Toolkit is already set up.

Current situation
We use a Cypress Docker image as the base and set the required environment variable for the container runtime:

FROM cypress/included:8.3.1

ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility,display,graphics

The nvidia-smi command shows the following output:

| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000001:00:00.0 Off |                  Off |
| N/A   25C    P8     8W /  70W |      0MiB / 16127MiB |      0%      Default |
|                               |                      |                  N/A |
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|  No running processes found                                                 |

Cypress needs an X Server when running on Linux. By default it uses Xvfb, but Xvfb does not use the GPU. That's why I am trying to set up Xorg, which fails with the following error:

Fatal server error: (EE) no screens found(EE)

I created a configuration at /etc/X11/xorg.conf and tried some options, but had no success so far:

  • Option "AllowEmptyInitialConfiguration"
  • Option "IgnoreEDID"
  • Option "UseDisplayDevice" "none"
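For illustration, this is roughly how such options could be arranged in a minimal headless config. It is only a sketch under assumptions: it presumes the nvidia Xorg driver module is actually loadable (which it currently is not in this container), the placement of the options in the Screen section and the explicit "true" values are my guess, and the scratch file name is made up so that /etc/X11/xorg.conf is not overwritten.

```shell
# Write a minimal headless config sketch to a scratch file.
# The file name is hypothetical; it is NOT /etc/X11/xorg.conf.
conf_file="${TMPDIR:-/tmp}/xorg-headless-sketch.conf"

cat > "$conf_file" <<'EOF'
Section "Device"
    Identifier "Card0"
    Driver     "nvidia"
EndSection

Section "Screen"
    Identifier "Screen0"
    Device     "Card0"
    # The three options tried so far (placement is an assumption)
    Option     "AllowEmptyInitialConfiguration" "true"
    Option     "IgnoreEDID" "true"
    Option     "UseDisplayDevice" "none"
EndSection
EOF

echo "wrote $conf_file"
```

Such a file could then be handed to Xorg through its -config option instead of replacing the system-wide configuration.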

The nvidia-xconfig tool seems to be deprecated and can no longer be installed. It states: “This tool is deprecated. The NVIDIA drivers now automatically integrate with the Xorg Xserver configuration. Creating an xorg.conf is no longer needed for normal setups.” However, without an xorg.conf, Xorg fails with the same error message as above.

This is the xorg.conf generated by `Xorg :0 -configure`:

Section "ServerLayout"
	Identifier     " Configured"
	Screen      0  "Screen0" 0 0
	InputDevice    "Mouse0" "CorePointer"
	InputDevice    "Keyboard0" "CoreKeyboard"
EndSection

Section "Files"
	ModulePath   "/usr/lib/xorg/modules"
	FontPath     "/usr/share/fonts/X11/misc"
	FontPath     "/usr/share/fonts/X11/cyrillic"
	FontPath     "/usr/share/fonts/X11/100dpi/:unscaled"
	FontPath     "/usr/share/fonts/X11/75dpi/:unscaled"
	FontPath     "/usr/share/fonts/X11/Type1"
	FontPath     "/usr/share/fonts/X11/100dpi"
	FontPath     "/usr/share/fonts/X11/75dpi"
	FontPath     "built-ins"
EndSection

Section "Module"
	Load  "glx"
EndSection

Section "InputDevice"
	Identifier  "Keyboard0"
	Driver      "kbd"
EndSection

Section "InputDevice"
	Identifier  "Mouse0"
	Driver      "mouse"
	Option	    "Protocol" "auto"
	Option	    "Device" "/dev/input/mice"
	Option	    "ZAxisMapping" "4 5 6 7"
EndSection

Section "Monitor"
	Identifier   "Monitor0"
	VendorName   "Monitor Vendor"
	ModelName    "Monitor Model"
EndSection

Section "Device"
        ### Available Driver options are:-
        ### Values: <i>: integer, <f>: float, <bool>: "True"/"False",
        ### <string>: "String", <freq>: "<f> Hz/kHz/MHz",
        ### <percent>: "<f>%"
        ### [arg]: arg optional
        #Option     "SWcursor"           	# [<bool>]
        #Option     "kmsdev"             	# <str>
        #Option     "ShadowFB"           	# [<bool>]
        #Option     "AccelMethod"        	# <str>
        #Option     "PageFlip"           	# [<bool>]
        #Option     "ZaphodHeads"        	# <str>
        #Option     "DoubleShadow"       	# [<bool>]
	Identifier  "Card0"
	Driver      "modesetting"
	BusID       "PCI:0:0:0"
EndSection

Section "Screen"
	Identifier "Screen0"
	Device     "Card0"
	Monitor    "Monitor0"
	SubSection "Display"
		Viewport   0 0
		Depth     1
	EndSubSection
	SubSection "Display"
		Viewport   0 0
		Depth     4
	EndSubSection
	SubSection "Display"
		Viewport   0 0
		Depth     8
	EndSubSection
	SubSection "Display"
		Viewport   0 0
		Depth     15
	EndSubSection
	SubSection "Display"
		Viewport   0 0
		Depth     16
	EndSubSection
	SubSection "Display"
		Viewport   0 0
		Depth     24
	EndSubSection
EndSection
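
One detail that may be worth checking: nvidia-smi reports the GPU at Bus-Id 00000001:00:00.0 (hexadecimal, PCI domain 1), while the generated config uses BusID "PCI:0:0:0". As far as I understand the xorg.conf format, BusID takes decimal values in the form PCI:bus@domain:device:function, where the @domain part can be omitted for domain 0. The helper below is a hypothetical sketch of that conversion (the function name is made up); whether a corrected BusID is sufficient to resolve the error is not something I can confirm.

```shell
# Convert an nvidia-smi style PCI id (hex, domain:bus:device.function)
# into the decimal notation assumed for the xorg.conf BusID keyword.
# The body runs in a subshell so its variables stay local.
smi_to_xorg_busid() (
    id="$1"
    domain=${id%%:*}     # text before the first ':'
    rest=${id#*:}
    bus=${rest%%:*}      # text between the first and second ':'
    rest=${rest#*:}
    dev=${rest%%.*}      # text before the '.'
    func=${rest#*.}      # text after the '.'

    # The 0x prefix makes the shell read each field as hexadecimal
    domain=$((0x$domain))
    bus=$((0x$bus))
    dev=$((0x$dev))
    func=$((0x$func))

    if [ "$domain" -eq 0 ]; then
        printf 'PCI:%d:%d:%d\n' "$bus" "$dev" "$func"
    else
        printf 'PCI:%d@%d:%d:%d\n' "$bus" "$domain" "$dev" "$func"
    fi
)

smi_to_xorg_busid "00000001:00:00.0"   # prints PCI:0@1:0:0
```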

I am not sure why the driver is set to modesetting, but changing it to nvidia results in: Failed to load module "nvidia" (module does not exist, 0). This suggests the nvidia Xorg driver module is not present in the container.
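
To narrow down the "module does not exist" message, a check along these lines can show whether the nvidia Xorg driver file is visible inside the container at all. This is a sketch, not a verified diagnosis: the function name is hypothetical, the first directory is the ModulePath from the config above, and the second directory is an assumption about where a distribution package might place the module.

```shell
# Print every nvidia_drv.so found below the given directories.
# Directories that do not exist are skipped silently.
find_nvidia_xorg_driver() (
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            find "$dir" -name 'nvidia_drv.so' 2>/dev/null
        fi
    done
    return 0
)

# First path: ModulePath from the generated xorg.conf.
# Second path: assumed packaging location, may differ per distribution.
found=$(find_nvidia_xorg_driver /usr/lib/xorg/modules /usr/lib/x86_64-linux-gnu/nvidia/xorg)

if [ -n "$found" ]; then
    echo "nvidia Xorg driver found: $found"
else
    echo "nvidia_drv.so not found in the searched directories"
fi
```

If nothing is found, the driver module would have to come from somewhere else (an installed package or a mount from the host) before Driver "nvidia" can load.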

Do you have any hints on how I can get Xorg (or any other X Server) running within a container, using the Nvidia Tesla T4 GPU but without an attached display?