Subject: TensorFlow GPU Failure on RTX 5090 Laptop GPU in WSL2/Docker with Latest NVIDIA Drivers (576.xx)

Environment Summary:

  • Operating System: Windows 11 Pro (Up-to-date)
  • WSL Version: WSL2
  • WSL Distribution: Ubuntu 24.04.2 LTS (Kernel 5.15.167.4-microsoft-standard-WSL2)
  • GPU: NVIDIA GeForce RTX 5090 Laptop GPU (Blackwell Architecture)
  • NVIDIA Driver: 576.28 Game Ready (Clean Install - also tested 576.02 Studio with same results)
  • Docker: Docker Desktop (Latest version) with WSL2 backend enabled and Ubuntu integration active.
  • ML Framework Setup Attempted: TensorFlow within Docker containers via WSL2.
  • Python Environment Management (inside WSL): Miniconda3

Goal:

  • To run TensorFlow projects with NVIDIA GPU acceleration using Docker containers within the WSL2 environment on Windows 11.

Problem Description:

Attempts to utilize the RTX 5090 Laptop GPU with TensorFlow inside Docker containers running under WSL2 have consistently failed, despite successful GPU detection and utilization by PyTorch in the same WSL2 environment (outside Docker). Two primary approaches using pre-built TensorFlow containers were tested with the latest NVIDIA drivers (576.xx):

  1. Using NVIDIA NGC TensorFlow Container (nvcr.io/nvidia/tensorflow:25.02-tf2-py3 as base image):
     • NVIDIA notes that these container releases are optimized for Blackwell architectures starting from 25.01.
     • Result: When running a container based on this image (docker run --gpus all ...), TensorFlow detects the physical GPU hardware but fails to initialize it, logging:
         WARNING: Detected NVIDIA GeForce RTX 5090 Laptop GPU, which is not yet supported in this version of the container
         ERROR: No supported GPU(s) detected to run this container
     • Subsequent checks (tf.config.list_physical_devices('GPU')) confirm TensorFlow sees 0 available GPUs.
  2. Using Official TensorFlow Nightly Container (tensorflow/tensorflow:nightly-gpu as base image):
     • This image (which reported TF 2.12.0 internally during tests) fails at an earlier stage.
     • Result: When running a container based on this image (docker run --gpus all ...), TensorFlow fails to link with the underlying CUDA driver provided by the host/runtime, logging:
         I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
         W tensorflow/core/common_runtime/gpu/gpu_device.cc:1956] Cannot dlopen some GPU libraries... Skipping registering GPU devices...
     • Subsequent checks (tf.config.list_physical_devices('GPU')) confirm TensorFlow sees 0 available GPUs.
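
The two failure modes above leave distinct signatures in the container output. A minimal Python sketch for telling them apart when triaging saved output (for example, text captured from docker logs) might look like this. The function name and labels are hypothetical, and the patterns are copied only from the log excerpts quoted in this post, so other TensorFlow or container versions may word these messages differently:

```python
import re

# Patterns taken verbatim from the log excerpts in this post.
# The labels on the left are illustrative names, not official error codes.
SIGNATURES = [
    ("unsupported_gpu_in_container",
     re.compile(r"No supported GPU\(s\) detected to run this container")),
    ("cuda_driver_not_found",
     re.compile(r"Could not find cuda drivers on your machine")),
    ("gpu_libraries_not_loaded",
     re.compile(r"Cannot dlopen some GPU libraries")),
]


def classify_tf_gpu_failure(log_text: str) -> list[str]:
    """Return the label of every known signature that appears in log_text."""
    return [label for label, pattern in SIGNATURES if pattern.search(log_text)]


if __name__ == "__main__":
    sample = (
        "WARNING: Detected NVIDIA GeForce RTX 5090 Laptop GPU, which is not "
        "yet supported in this version of the container\n"
        "ERROR: No supported GPU(s) detected to run this container\n"
    )
    print(classify_tf_gpu_failure(sample))
```

An empty result only means that none of these three messages appeared. It does not by itself show that the GPU is usable.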

Working Comparison:

  • PyTorch: Using a standard Conda environment directly within WSL2 (Ubuntu), PyTorch (2.7.0+cu128 installed via pip with --extra-index-url .../whl/cu128) successfully detects and utilizes the RTX 5090 Laptop GPU with the same NVIDIA driver (576.28). This indicates that the basic WSL2 GPU passthrough mechanism is functional with this driver/hardware.

Troubleshooting Steps Attempted:

  • Verified Docker Desktop WSL2 integration is correctly configured.
  • Used --gpus all flag for docker run.
  • Updated to the latest available NVIDIA Game Ready Driver (576.28).
  • Attempted to downgrade NVIDIA driver to the earliest version listed for RTX 5090 Laptop (572.83), but the installer reported “incompatible with your device”, preventing downgrade. Cannot test older drivers.
  • Confirmed via GitHub Issue #89272 that others face similar PTX JIT compilation errors (CUDA_ERROR_INVALID_PTX, CUDA_ERROR_INVALID_HANDLE) with RTX 5090 + TF + WSL2, suggesting issues with pre-built binaries for this architecture. Successful workarounds in that thread involved building TF from source or using the NGC 25.02 container with an older (572.xx) driver (which is not installable on this specific hardware).
  • Noted the discontinuation of NVIDIA Optimized TensorFlow containers after release 25.02.
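
One further check that may help separate a container problem from a passthrough problem is whether the WSL2 GPU plumbing is visible where TensorFlow runs. The sketch below is an assumption-laden example, not an official procedure. It assumes the paths WSL2 normally exposes for GPU support (/dev/dxg and /usr/lib/wsl/lib/libcuda.so.1), and it assumes an nvidia-smi recent enough to accept the compute_cap query field. The function names are hypothetical:

```python
import os
import shutil
import subprocess

# Paths WSL2 typically exposes for GPU support (assumed; verify on your system).
WSL_GPU_PATHS = {
    "dxg_device": "/dev/dxg",
    "wsl_libcuda": "/usr/lib/wsl/lib/libcuda.so.1",
}


def wsl_gpu_passthrough_status():
    """Report which of the expected WSL2 GPU files exist in this environment."""
    status = {name: os.path.exists(path) for name, path in WSL_GPU_PATHS.items()}
    status["nvidia_smi_on_path"] = shutil.which("nvidia-smi") is not None
    return status


def query_gpu():
    """Return nvidia-smi's name, driver version, and compute capability line(s).

    Returns None if nvidia-smi is missing, times out, or exits with an error
    (for example because this nvidia-smi does not know the compute_cap field).
    """
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi",
             "--query-gpu=name,driver_version,compute_cap",
             "--format=csv,noheader"],
            capture_output=True, text=True, timeout=30, check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


if __name__ == "__main__":
    print(wsl_gpu_passthrough_status())
    print(query_gpu())
```

Running it once directly in the Ubuntu distribution and once inside a container started with --gpus all shows whether the two environments differ. The reported compute capability can then be compared against the GPU architectures a given TensorFlow build was compiled for.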

Core Issue Summary:

Current pre-built TensorFlow solutions available via standard distribution channels (NVIDIA NGC containers up to 25.02, official TensorFlow Docker Hub nightly-gpu images, and likely pip installs of tensorflow[and-cuda] for versions above 2.10) appear incompatible with the combination of the RTX 5090 Laptop GPU and the latest required NVIDIA drivers (576.xx) when run within a Docker container in WSL2. The failure manifests either as the container explicitly not supporting the detected hardware (NGC 25.02) or as a failure to link with the host CUDA drivers (TF nightly-gpu).

Request:

Seeking guidance on a compatible configuration or pre-built solution (container or pip package) for utilizing TensorFlow with GPU acceleration on an RTX 5090 Laptop GPU using current NVIDIA drivers within a WSL2/Docker environment. Alternatively, please confirm whether building TensorFlow from source is currently the only viable path for this hardware configuration.

Ubuntu 24.04 may not be supported. Consider using Ubuntu 22.04 instead.