Unable to Detect NVIDIA GPU with VirtualGL in EKS Cluster

I am running an Amazon EKS (Elastic Kubernetes Service) cluster with g4dn.xlarge instances, each of which has a Tesla T4 GPU. I have installed the NVIDIA GPU operator and confirmed with nvidia-smi, run inside the container, that the GPU is recognized and functional. However, when I use VirtualGL (vglrun) to run graphical applications such as Firefox with GPU acceleration, I get the following error:

vglrun -d "/dev/nvidia0" firefox
[GFX1-]: glxtest: ManageChildProcess failed
[GFX1-]: No GPUs detected via PCI

Details:

  • EKS Instance Type: g4dn.xlarge
  • GPU: Tesla T4
  • NVIDIA gpu-operator helm chart: v23.9.2
  • Container Environment: Kubernetes with NVIDIA GPU operator
  • Command Used: vglrun -d "/dev/nvidia0" firefox

Issue:

  • VirtualGL (vglrun) fails to detect GPUs via PCI when launching applications like Firefox, preventing GPU acceleration.

Questions:

  1. How can I troubleshoot and resolve the issue of VirtualGL not detecting GPUs within my container environment?
  2. Are there additional configurations or dependencies required to enable GPU acceleration with VirtualGL on EKS using the NVIDIA GPU operator?
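
As a possible starting point for question 1, here is a minimal diagnostic sketch to run inside the container. It rests on two assumptions that have not been confirmed for this setup: that the NVIDIA container runtime only mounts the GL/EGL libraries when NVIDIA_DRIVER_CAPABILITIES contains "graphics" or "all", and that VirtualGL's EGL back end expects `-d` to be a DRI device such as /dev/dri/card0 (or `egl`) rather than /dev/nvidia0. The function names `has_graphics_capability` and `list_dri_devices` are hypothetical helpers, not part of any tool.

```shell
#!/bin/sh
# Diagnostic sketch only: reports two things about the container and changes nothing.

# Returns 0 if a comma-separated NVIDIA_DRIVER_CAPABILITIES value
# contains "graphics" or "all" (assumed to be required for GL/EGL libraries).
has_graphics_capability() {
    case ",$1," in
        *,graphics,*|*,all,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the DRI device nodes visible in the container, one per line.
# Prints nothing if /dev/dri is absent or empty.
list_dri_devices() {
    for dev in /dev/dri/card* /dev/dri/renderD*; do
        [ -e "$dev" ] && echo "$dev"
    done
    return 0
}

caps="${NVIDIA_DRIVER_CAPABILITIES:-}"
if has_graphics_capability "$caps"; then
    echo "capabilities: graphics enabled ($caps)"
else
    echo "capabilities: graphics NOT enabled (value: '${caps:-unset}')"
fi

devices="$(list_dri_devices)"
if [ -n "$devices" ]; then
    echo "DRI devices:"
    echo "$devices"
else
    echo "DRI devices: none found under /dev/dri"
fi
```

If the script lists a DRI device, one thing to try is passing that path (or `egl`) to `vglrun -d` in place of /dev/nvidia0. If it reports that graphics is not enabled, setting NVIDIA_DRIVER_CAPABILITIES in the pod spec may be worth testing. Neither outcome is guaranteed to resolve the error.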

Additional Information:

  • Output of nvidia-smi within the container confirms GPU presence and functionality.
    @ubuntu-fk5a8-a3b8006etmtlb:~$ nvidia-smi
    Wed May 1 08:45:59 2024
    +-----------------------------------------------------------------------------------------+
    | NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
    |-----------------------------------------+------------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
    |                                         |                        |               MIG M. |
    |=========================================+========================+======================|
    |   0  Tesla T4                       On  |   00000000:00:1E.0 Off |                    0 |
    | N/A   25C    P8             14W /   70W |       0MiB /  15360MiB |      0%      Default |
    |                                         |                        |                  N/A |
    +-----------------------------------------+------------------------+----------------------+

    +-----------------------------------------------------------------------------------------+
    | Processes:                                                                              |
    |  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
    |        ID   ID                                                               Usage      |
    |=========================================================================================|
    |  No running processes found                                                             |
    +-----------------------------------------------------------------------------------------+

  • Any insights or recommendations for setting up VirtualGL with Kubernetes and NVIDIA GPU operator would be greatly appreciated.

Thank you for your assistance!