Vulkan GPU Access in AWS GitHub Actions Runner with NVIDIA L4 - Setup Guidance Needed

mrobertseidowsky-gopro · October 30, 2025, 1:59pm

Hello NVIDIA Community,

I got Vulkan working with an NVIDIA L4 GPU in a test environment, but I need guidance on setting it up properly for production use on GitHub Actions runners.

Working Test Environment

I managed to get Vulkan + GPU working in a container on an EC2 GPU instance using this approach:

Host setup (EC2):

- Installed NVIDIA driver (580.95.05)
- Installed NVIDIA Container Toolkit
- Followed the official NVIDIA docs:
  - Driver Installation Guide
  - Container Toolkit Install Guide

Container setup:

FROM ubuntu:22.04
# Install nvidia-utils-580-server
# Install nvidia-container-toolkit 1.18.0-1
# Install Vulkan SDK 1.4.328.1
# Install Vulkan dependencies (libxcb, libwayland, etc.)

Result: vulkaninfo successfully detects the NVIDIA L4 GPU with Vulkan 1.4.312 support ✅
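For a CI job, the same check can be automated instead of read by eye. The following is a minimal sketch, assuming the `vulkaninfo` on the runner supports the `--summary` option and prints one `deviceName = ...` line per device. The function names are illustrative and not part of any NVIDIA or Khronos tool.

```python
import re
import shutil
import subprocess

# One "deviceName = <name>" line per physical device in the summary output.
DEVICE_NAME_RE = re.compile(r"^[ \t]*deviceName[ \t]*=[ \t]*(.+)$", re.MULTILINE)


def device_names(summary_text):
    """Return every deviceName value found in `vulkaninfo --summary` output."""
    return [name.strip() for name in DEVICE_NAME_RE.findall(summary_text)]


def has_nvidia_device(summary_text):
    """True if any listed device mentions NVIDIA (a llvmpipe-only list is False)."""
    return any("nvidia" in name.lower() for name in device_names(summary_text))


def check_runner():
    """Run vulkaninfo on this machine and report whether an NVIDIA device shows up."""
    if shutil.which("vulkaninfo") is None:
        print("vulkaninfo not found on PATH")
        return False
    result = subprocess.run(
        ["vulkaninfo", "--summary"], capture_output=True, text=True
    )
    print("Devices:", device_names(result.stdout))
    return has_nvidia_device(result.stdout)
```

Calling `check_runner()` as an early workflow step would make the job fail fast when only a software rasterizer is visible.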

Current Blocker

I need to reproduce this setup on AWS self-hosted GitHub Actions runners with these specs:

- Pattern: github-actions-runner-xlarge-gpu-*
- Labels: self-hosted, x64, stable, infra-eks-general, us-west-2, dind, xlarge, gpu
- GPU: NVIDIA L4 (driver 535.230.02)
- Platform: EKS with Bottlerocket OS
- Use case: CI workflow tests that stress the GPU via the Vulkan backend

Current issue: Vulkan cannot see the GPU device in the runner, even though nvidia-smi works fine.
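One thing worth ruling out when `nvidia-smi` works but Vulkan does not is the set of driver capabilities exposed to the container. The NVIDIA Container Toolkit reads `NVIDIA_DRIVER_CAPABILITIES`, `nvidia-smi` only needs `utility`, and the Vulkan libraries are tied to `graphics`. The sketch below assumes the toolkit's documented default of `utility,compute` when the variable is unset. Whether this is the actual cause on the runner is an assumption, not something the symptoms confirm.

```python
import os

# Fallback used when the variable is unset or empty (assumed toolkit default).
ASSUMED_DEFAULT = "utility,compute"


def missing_capabilities(value, required=("graphics",)):
    """Return the required capabilities not granted by an NVIDIA_DRIVER_CAPABILITIES value."""
    granted = {c.strip() for c in (value or ASSUMED_DEFAULT).split(",") if c.strip()}
    if "all" in granted:
        return []
    return [c for c in required if c not in granted]


def report():
    """Print what the current process environment grants."""
    value = os.environ.get("NVIDIA_DRIVER_CAPABILITIES")
    missing = missing_capabilities(value)
    print("NVIDIA_DRIVER_CAPABILITIES =", repr(value))
    print("missing for Vulkan:", missing or "none")
    return missing
```

Running `report()` inside the job container shows whether `graphics` was requested. If it is missing, setting the variable to `all` or to a list that includes `graphics` in the pod or container spec is the usual adjustment.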

Questions

- Driver version mismatch? My test env uses driver 580.95.05, but the runner has 535.230.02. Could this cause Vulkan detection issues?
- Missing libraries in container? Are there specific NVIDIA Vulkan libraries that need to be mounted from the host that I might be missing?
- Bottlerocket OS specifics? Does Bottlerocket require special configuration for NVIDIA Container Toolkit or Vulkan ICD mounting?
- Best practices for EKS runners? What’s the recommended way to enable Vulkan GPU access in containerized GitHub Actions runners on EKS?

What I’ve Tried

- Installing nvidia-utils-580-server in the container
- Installing NVIDIA Container Toolkit inside the container
- Installing Vulkan SDK 1.4.328.1
- Setting proper environment variables (VULKAN_SDK, VK_ADD_LAYER_PATH, etc.)
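Since the container installs 580-series user-space packages while the runner's kernel driver is 535.230.02, a quick comparison of the two versions can help. NVIDIA user-space libraries generally need to match the loaded kernel module exactly. The sketch below assumes `/proc/driver/nvidia/version` is readable inside the container, which depends on how the GPU is injected. The helper names are hypothetical.

```python
import re
from pathlib import Path

# Matches driver versions such as 535.230.02 or 580.95.05.
VERSION_RE = re.compile(r"\b(\d{3})\.(\d+)(?:\.(\d+))?\b")


def parse_driver_version(text):
    """Extract the first driver-style version in text as a tuple of ints, or None."""
    match = VERSION_RE.search(text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups() if part is not None)


def kernel_driver_version(path="/proc/driver/nvidia/version"):
    """Version of the loaded NVIDIA kernel module, or None if the file is absent."""
    version_file = Path(path)
    if not version_file.exists():
        return None
    return parse_driver_version(version_file.read_text())


def userspace_matches_kernel(userspace, kernel):
    """True only when both version strings parse and are identical."""
    user_version = parse_driver_version(userspace)
    kernel_version = parse_driver_version(kernel)
    if user_version is None or kernel_version is None:
        return False
    return user_version == kernel_version
```

With the versions from this post, `userspace_matches_kernel("580.95.05", "535.230.02")` is `False`, which suggests the driver packages baked into the image cannot be the ones serving the runner's GPU.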

Request

Could someone provide guidance on:

- Proper host/container setup for Vulkan on EKS Bottlerocket runners
- Whether the driver version mismatch could be the root cause
- Any missing libraries or configuration needed for Vulkan ICD detection
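To narrow down the ICD question, it helps to list which Vulkan driver manifests the container actually contains. The sketch below scans a few of the standard Linux loader directories for `*.json` manifests and reports the `library_path` each one names. The NVIDIA manifest is normally `nvidia_icd.json` pointing at `libGLX_nvidia.so.0`. The directory list is a subset chosen for illustration, and it does not cover overrides such as `VK_ICD_FILENAMES` or `VK_DRIVER_FILES`.

```python
import json
from pathlib import Path

# Subset of the Vulkan loader's default manifest directories on Linux.
DEFAULT_ICD_DIRS = (
    "/etc/vulkan/icd.d",
    "/usr/share/vulkan/icd.d",
    "/usr/local/share/vulkan/icd.d",
)


def find_icd_manifests(search_dirs=DEFAULT_ICD_DIRS):
    """Map each readable ICD manifest path to the driver library it names."""
    found = {}
    for directory in search_dirs:
        base = Path(directory)
        if not base.is_dir():
            continue
        for manifest in sorted(base.glob("*.json")):
            try:
                data = json.loads(manifest.read_text())
            except (OSError, json.JSONDecodeError):
                continue  # skip unreadable or malformed manifests
            icd = data.get("ICD") if isinstance(data, dict) else None
            library = icd.get("library_path") if isinstance(icd, dict) else None
            if library:
                found[str(manifest)] = library
    return found


def nvidia_manifests(manifests):
    """Keep only the manifests whose library name mentions NVIDIA."""
    return {path: lib for path, lib in manifests.items() if "nvidia" in lib.lower()}
```

If `nvidia_manifests(find_icd_manifests())` comes back empty inside the runner container, the loader has no NVIDIA driver to enumerate, regardless of which Vulkan SDK version is installed.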

I’m happy to provide additional logs, vulkaninfo output, or Docker/runner configuration details if needed.

Thanks in advance for your help!

Tags: vulkan, aws, eks, githubaction, container
