I am running an Amazon EKS (Elastic Kubernetes Service) cluster with g4dn.xlarge instances, each of which includes a Tesla T4 GPU. Within my container environment, I have installed the NVIDIA GPU operator and confirmed with nvidia-smi that the GPU is recognized and functional. However, when I use VirtualGL (vglrun) to run graphical applications such as Firefox with GPU acceleration, I get the following error:
vglrun -d "/dev/nvidia0" firefox
[GFX1-]: glxtest: ManageChildProcess failed
[GFX1-]: No GPUs detected via PCI
Details:
- EKS Instance Type: g4dn.xlarge
- GPU: Tesla T4
- NVIDIA gpu-operator Helm chart: v23.9.2
- Container Environment: Kubernetes with NVIDIA GPU operator
- Command Used: vglrun -d "/dev/nvidia0" firefox
Issue:
- Launching Firefox through VirtualGL (vglrun) fails with "No GPUs detected via PCI", so the application runs without GPU acceleration.
Questions:
- How can I troubleshoot and resolve the issue of VirtualGL not detecting GPUs within my container environment?
- Are there additional configurations or dependencies required to enable GPU acceleration with VirtualGL on EKS using the NVIDIA GPU operator?
Additional Information:
- Output of nvidia-smi within the container confirms GPU presence and functionality:
@ubuntu-fk5a8-a3b8006etmtlb:~$ nvidia-smi
Wed May 1 08:45:59 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla T4                       On  |   00000000:00:1E.0 Off |                    0 |
| N/A   25C    P8             14W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
- Any insights or recommendations for setting up VirtualGL with Kubernetes and NVIDIA GPU operator would be greatly appreciated.
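To help narrow this down, here is a minimal diagnostic sketch that can be run inside the container to collect more detail. It rests on two assumptions I have not verified on this cluster: that the NVIDIA container toolkit only exposes graphics libraries when NVIDIA_DRIVER_CAPABILITIES includes graphics or all, and that VirtualGL's EGL back end expects a DRI device such as /dev/dri/card0 rather than /dev/nvidia0. The function name has_graphics_cap and the list of device paths are my own, chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a comma-separated capabilities
# string contains "graphics" or "all".
has_graphics_cap() {
    case ",$1," in
        *,graphics,*|*,all,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report which driver capabilities were requested for this container.
# Assumption: an unset or empty value means graphics was not requested.
caps="${NVIDIA_DRIVER_CAPABILITIES:-}"
if has_graphics_cap "$caps"; then
    echo "capabilities: '$caps' (graphics requested)"
else
    echo "capabilities: '$caps' (graphics NOT requested)"
fi

# Show which device nodes are visible inside the container.
# The paths below are assumed typical names, not confirmed for this setup.
for dev in /dev/nvidia0 /dev/nvidiactl /dev/dri/card0 /dev/dri/renderD128; do
    if [ -e "$dev" ]; then
        echo "present: $dev"
    else
        echo "missing: $dev"
    fi
done
```

If the script reports that graphics was not requested, or that the /dev/dri nodes are missing, that would suggest the container only has compute access to the GPU, which may explain why nvidia-smi works while Firefox finds no GPU.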
Thank you for your assistance!