Ray Tracing support unavailable on Windows Server 2022 with 3 NVidia A40 GPUs

liordino · September 14, 2023, 5:25pm

Hi all,

I’m trying to troubleshoot the setup of a new rendering server at work. It has 3 NVidia A40 GPUs, and no matter what I’ve done so far, all of them show ray-tracing support as unavailable. I checked in the following ways:

GPU-Z shows the ray tracing box unticked;

When using the D3D12 CheckFeatureSupport function, RaytracingTier comes back as D3D12_RAYTRACING_TIER_NOT_SUPPORTED.

Some context:

Drivers are updated using the GeForce Experience app. Currently on version 537.13;
Here’s the output from nvidia-smi:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 537.13                 Driver Version: 537.13       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A40                   TCC   | 00000000:2B:00.0 Off |                    0 |
|  0%   28C    P8              11W / 300W |      1MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA A40                   TCC   | 00000000:A2:00.0 Off |                    0 |
|  0%   26C    P8              12W / 300W |      1MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA A40                   TCC   | 00000000:C0:00.0 Off |                    0 |
|  0%   27C    P8              12W / 300W |      1MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

I’m accessing the server with Remote Desktop, and I’m not sure about this, but I think that dxdiag doesn’t show info from the GPUs themselves, only from the MS Remote Display Adapter, which shows DirectDraw and AGP Texture acceleration as unavailable and DirectX 12 Ultimate as disabled. I’m not sure why.

Everything runs correctly on the current server, and also on my own laptop with Windows 11, both of which have RTX GPUs. The new server runs Windows Server 2022. I suspect that some configuration on it is preventing the GPUs from running everything they can, but I’m not sure what that would be. I checked that the following capabilities are installed:

DirectX.Configuration.Database~~~~0.0.1.0
Tools.Graphics.DirectX~~~~0.0.1.0

Any ideas? Thanks in advance!

droettger · September 18, 2023, 8:26am

There are multiple issues with that setup:
(The following assumes a bare-metal machine setup, not one using virtual machines, hypervisors, or an NVIDIA GRID setup, which behave differently.)

1.) The main problem:
According to your nvidia-smi output, all three of your A40 GPUs are running in Tesla Compute Cluster (TCC) driver mode and display adapters are all off.
That means no graphics driver is running in Windows Display Driver Mode (WDDM) on those GPUs and none of the graphics APIs like OpenGL, Direct3D or Vulkan can be used in that state. But the GPUs are all accessible as CUDA compute GPUs and ray tracing with OptiX would work on them.
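
As a rough illustration of how to read that state out of the nvidia-smi table, here is a minimal Python sketch. It assumes the Windows table layout shown in the original post (a `TCC/WDDM` column followed by the bus ID and the `Disp.A` flag). The names `GPU_ROW` and `parse_driver_models` are made up for this example, and the layout may differ between driver versions.

```python
import re

# Matches a GPU summary row of the Windows nvidia-smi table, e.g.
# "|   0  NVIDIA A40                   TCC   | 00000000:2B:00.0 Off | ... |"
# Assumption: the column order is index, name, driver model, bus ID, Disp.A.
GPU_ROW = re.compile(
    r"^\|\s+(?P<index>\d+)\s+(?P<name>.+?)\s+(?P<model>TCC|WDDM)\s+\|"
    r"\s+(?P<bus>[0-9A-Fa-f:.]+)\s+(?P<disp>On|Off)\s+\|"
)


def parse_driver_models(smi_text):
    """Return one dict per GPU row found in the nvidia-smi table text."""
    gpus = []
    for line in smi_text.splitlines():
        match = GPU_ROW.match(line)
        if match is None:
            continue  # header, separator, or second/third row of a GPU entry
        gpus.append(
            {
                "index": int(match.group("index")),
                "name": match.group("name"),
                "driver_model": match.group("model"),
                "display_active": match.group("disp") == "On",
            }
        )
    return gpus


if __name__ == "__main__":
    sample = (
        "|   0  NVIDIA A40                   TCC   | 00000000:2B:00.0 Off |                    0 |\n"
        "|  0%   28C    P8              11W / 300W |      1MiB / 46068MiB |      0%      Default |\n"
    )
    for gpu in parse_driver_models(sample):
        usable = gpu["driver_model"] == "WDDM"
        print(gpu["index"], gpu["name"], gpu["driver_model"], "graphics APIs usable:", usable)
```

Any GPU reported with `driver_model` equal to `TCC` here is compute-only as far as Direct3D, OpenGL and Vulkan are concerned.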

nvidia-smi is able to switch between WDDM and TCC mode on professional workstation and compute graphics boards only (not GeForce).
Please read the nvidia-smi -h output and look for

 -dm,  --driver-model=       Enable or disable TCC mode: 0/WDDM, 1/TCC
 -fdm, --force-driver-model= Enable or disable TCC mode: 0/WDDM, 1/TCC

This needs an elevated (administrator) command prompt.

Check whether you can switch the A40 boards into WDDM mode.
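
A hedged sketch of what that switch could look like when scripted from Python. It only uses the `-dm` option quoted above plus nvidia-smi's `-i` option for selecting a GPU by index. The helper names `build_switch_command` and `switch_all` are invented for this example. It defaults to a dry run that only returns the commands, and a real run is assumed to happen in an elevated prompt and to need a reboot before the new driver model is active.

```python
import subprocess

# Driver model codes as listed in the nvidia-smi help text: 0/WDDM, 1/TCC.
MODEL_CODES = {"WDDM": "0", "TCC": "1"}


def build_switch_command(gpu_index, model="WDDM"):
    """Build the nvidia-smi argument list that requests a driver model for one GPU."""
    if model not in MODEL_CODES:
        raise ValueError(f"unknown driver model: {model!r}")
    return ["nvidia-smi", "-i", str(gpu_index), "-dm", MODEL_CODES[model]]


def switch_all(gpu_indices, model="WDDM", dry_run=True):
    """Return the commands for all GPUs; execute them only when dry_run is False."""
    commands = [build_switch_command(index, model) for index in gpu_indices]
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError if nvidia-smi refuses the change,
            # which is what one would expect on boards still in compute-only firmware mode.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Dry run for the three A40 boards from the original post (indices 0, 1, 2).
    for command in switch_all([0, 1, 2]):
        print(" ".join(command))
```

If nvidia-smi rejects the request, that points to case 2.) below rather than to a scripting problem.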

2.) When you cannot switch to WDDM mode:
A40 boards are Data Center products set up for compute workloads. They are not enabled to run graphics by default. The firmware itself would need to be switched to graphics mode first.
You must contact your board vendor first to make sure that this is a supported configuration on your boards!
Otherwise that has the potential to brick your GPUs.

Read everything here! https://developer.nvidia.com/displaymodeselector

If that worked, go to 1.) and try switching the GPUs to WDDM mode with nvidia-smi again.

3.)
Windows Server systems might not be set up to run full graphics. At least that was the case in the past. I haven’t used more recent Windows Server versions. You might need to change some group policies on the server system in case this isn’t working. The Microsoft Windows Server documentation should have information about that.

4.)
After all that, accessing a system with Remote Desktop might not support full graphics capabilities.
That is running a separate desktop and in the past Microsoft didn’t render RDP connections using the GPU.
That’s exactly what happens in your current setup because there is no GPU running in WDDM.

That should work with professional workstation boards but I have not tried that on Windows Server systems. It works with my Windows 10 systems.
There might also be a frame rate cap of 30 fps under RDP, which can be configured.
Again, search the Microsoft Windows Server documentation on that if you encounter issues with RDP.

5.)
I find it weird to use GeForce Experience for installing graphics drivers for Data Center boards on Windows Server configurations. I’m not sure if there is anything different between the driver packages, but I would recommend installing the Data Center drivers explicitly. Find them here: https://www.nvidia.com/Download/Find.aspx?lang=en-us

liordino · September 20, 2023, 3:12pm

Thanks, droettger! I’ll contact the board vendor to make sure that WDDM is supported on the boards. I’ll get back with the results.
