Hi everyone!
We have an issue with our Tesla P40.
We bought a used Dell R7525 with a Tesla P40, installed Proxmox, and configured it for PCI passthrough.
We configured a VM with the Tesla P40 and installed Windows Server 2022 Standard Evaluation with Remote Desktop Session Host (RDSH).
In our test with 15 concurrent users, every user was able to use the Tesla for 3D or video decoding, for example YouTube.
After that, we moved our production VM, a Windows Server 2022 Standard with RDSH, from an old ESXi server to Proxmox, but the Tesla P40 didn’t work as expected.
The Tesla P40 is detected by Windows, the GRID drivers are installed correctly, and there are no errors in Device Manager.
However, the Tesla P40 is not used by default.
If we tell Windows to use the Tesla P40 for a specific application (in Graphics settings), it works.
We tried disabling the default GPU and setting the display to none in Proxmox, but nothing changed.
Is there a way to use the Tesla P40 as the default GPU for all applications and all users?
Thank you
Here is some more info from the command line: nvidia-smi -q
On the working VM, the GPU Virtualization Mode is Pass-Through.
On the non-working VM, the GPU Virtualization Mode is None.
Both VMs run on the same Proxmox host; there are two NVIDIA Tesla P40 cards in the host.
This is the output from a TS server where an NVIDIA Tesla P40 works:
==============NVSMI LOG==============
Timestamp : Thu Jun 20 10:14:03 2024
Driver Version : 537.13
CUDA Version : 12.2
Attached GPUs : 1
GPU 00000000:01:00.0
Product Name : Tesla P40
Product Brand : Tesla
Product Architecture : Pascal
Display Mode : Enabled
Display Active : Disabled
Persistence Mode : N/A
Addressing Mode : N/A
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : WDDM
Pending : WDDM
Serial Number : ...
GPU UUID : ...
Minor Number : N/A
VBIOS Version : 86.02.23.00.00
MultiGPU Board : No
Board ID : 0x100
Board Part Number : 900-1G610-0030-000
GPU Part Number : 1B38-895-A1
FRU Part Number : N/A
Module ID : 1
Inforom Version
Image Version : G610.0300.00.01
OEM Object : 1.1
ECC Object : 4.1
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GSP Firmware Version : N/A
GPU Virtualization Mode
Virtualization Mode : **Pass-Through**
Host VGPU Mode : N/A
vGPU Software Licensed Product
Product Name : NVIDIA Virtual Applications
License Status : Licensed
GPU Reset Status
Reset Required : No
Drain and Reset Recommended : N/A
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x01
Device : 0x00
Domain : 0x0000
Device Id : 0x1B3810DE
Bus Id : 00000000:01:00.0
Sub System Id : 0x11D910DE
GPU Link Info
PCIe Generation
Max : 3
Current : 3
Device Current : 3
Device Max : 3
Host Max : **N/A**
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 0 KB/s
Rx Throughput : 24000 KB/s
Atomic Caps Inbound : N/A
Atomic Caps Outbound : N/A
Fan Speed : N/A
Performance State : P8
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
FB Memory Usage
Total : 24576 MiB
Reserved : 307 MiB
Used : 155 MiB
Free : 24113 MiB
BAR1 Memory Usage
Total : 32768 MiB
Used : 2 MiB
Free : 32766 MiB
Conf Compute Protected Memory Usage
Total : N/A
Used : N/A
Free : N/A
Compute Mode : Default
Utilization
Gpu : 2 %
Memory : 0 %
Encoder : 0 %
Decoder : 0 %
JPEG : N/A
OFA : N/A
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
ECC Mode
Current : Disabled
Pending : Disabled
ECC Errors
Volatile
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Aggregate
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Retired Pages
Single Bit ECC : 0
Double Bit ECC : 0
Pending Page Blacklist : No
Remapped Rows : N/A
Temperature
GPU Current Temp : 23 C
GPU T.Limit Temp : N/A
GPU Shutdown Temp : 95 C
GPU Slowdown Temp : 92 C
GPU Max Operating Temp : N/A
GPU Target Temperature : N/A
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
GPU Power Readings
Power Draw : 14.09 W
Current Power Limit : 250.00 W
Requested Power Limit : 250.00 W
Default Power Limit : 250.00 W
Min Power Limit : 125.00 W
Max Power Limit : 250.00 W
Module Power Readings
Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 544 MHz
SM : 544 MHz
Memory : 405 MHz
Video : 544 MHz
Applications Clocks
Graphics : 1303 MHz
Memory : 3615 MHz
Default Applications Clocks
Graphics : 1303 MHz
Memory : 3615 MHz
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 1531 MHz
SM : 1531 MHz
Memory : 3615 MHz
Video : 1379 MHz
Max Customer Boost Clocks
Graphics : 1531 MHz
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : N/A
Fabric
State : N/A
Status : N/A
Processes
...
And this is the output from a TS server where the card doesn’t work:
==============NVSMI LOG==============
Timestamp : Thu Jun 20 10:12:05 2024
Driver Version : 537.13
CUDA Version : 12.2
Attached GPUs : 1
GPU 00000000:01:00.0
Product Name : Tesla P40
Product Brand : Tesla
Product Architecture : Pascal
Display Mode : Enabled
Display Active : Disabled
Persistence Mode : N/A
Addressing Mode : N/A
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : WDDM
Pending : WDDM
Serial Number : ...
GPU UUID : ...
Minor Number : N/A
VBIOS Version : 86.02.23.00.00
MultiGPU Board : No
Board ID : 0x100
Board Part Number : 690-1G610-0300-000
GPU Part Number : 1B38-895-A1
FRU Part Number : N/A
Module ID : 1
Inforom Version
Image Version : G610.0300.00.01
OEM Object : 1.1
ECC Object : 4.1
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GSP Firmware Version : N/A
GPU Virtualization Mode
Virtualization Mode : **None**
Host VGPU Mode : N/A
vGPU Software Licensed Product
Product Name : NVIDIA Virtual Applications
License Status : Licensed
GPU Reset Status
Reset Required : No
Drain and Reset Recommended : N/A
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x01
Device : 0x00
Domain : 0x0000
Device Id : 0x1B3810DE
Bus Id : 00000000:01:00.0
Sub System Id : 0x11D910DE
GPU Link Info
PCIe Generation
Max : 3
Current : 3
Device Current : 3
Device Max : 3
Host Max : **4**
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 0 KB/s
Rx Throughput : 0 KB/s
Atomic Caps Inbound : N/A
Atomic Caps Outbound : N/A
Fan Speed : N/A
Performance State : P8
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
FB Memory Usage
Total : 24576 MiB
Reserved : 307 MiB
Used : 887 MiB
Free : 23380 MiB
BAR1 Memory Usage
Total : 32768 MiB
Used : 2 MiB
Free : 32766 MiB
Conf Compute Protected Memory Usage
Total : N/A
Used : N/A
Free : N/A
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Encoder : 0 %
Decoder : 0 %
JPEG : N/A
OFA : N/A
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
ECC Mode
Current : Disabled
Pending : Disabled
ECC Errors
Volatile
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Aggregate
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Retired Pages
Single Bit ECC : 0
Double Bit ECC : 0
Pending Page Blacklist : No
Remapped Rows : N/A
Temperature
GPU Current Temp : 24 C
GPU T.Limit Temp : N/A
GPU Shutdown Temp : 95 C
GPU Slowdown Temp : 92 C
GPU Max Operating Temp : N/A
GPU Target Temperature : N/A
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
GPU Power Readings
Power Draw : 9.89 W
Current Power Limit : 250.00 W
Requested Power Limit : 250.00 W
Default Power Limit : 250.00 W
Min Power Limit : 125.00 W
Max Power Limit : 250.00 W
Module Power Readings
Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 544 MHz
SM : 544 MHz
Memory : 405 MHz
Video : 544 MHz
Applications Clocks
Graphics : 1303 MHz
Memory : 3615 MHz
Default Applications Clocks
Graphics : 1303 MHz
Memory : 3615 MHz
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 1531 MHz
SM : 1531 MHz
Memory : 3615 MHz
Video : 1379 MHz
Max Customer Boost Clocks
Graphics : 1531 MHz
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : N/A
Fabric
State : N/A
Status : N/A
Processes
...
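For anyone who wants to compare the two logs without reading them line by line, here is a minimal Python sketch. It assumes both `nvidia-smi -q` outputs were saved as text; the function names `parse_pairs` and `diff_logs` are made up for this example, and the short sample strings only stand in for the full logs. It pairs up `Key : Value` lines by position, so it only makes sense when both logs come from the same driver version and have the same layout.

```python
def parse_pairs(text):
    """Return (key, value) pairs from 'Key : Value' lines, in order."""
    pairs = []
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            # strip("*") drops the ** emphasis markers added in the forum paste
            pairs.append((key.strip(), value.strip().strip("*")))
    return pairs


def diff_logs(working, failing):
    """List (key, working_value, failing_value) where values differ
    at the same position in both logs."""
    diffs = []
    for (k1, v1), (k2, v2) in zip(parse_pairs(working), parse_pairs(failing)):
        if k1 == k2 and v1 != v2:
            diffs.append((k1, v1, v2))
    return diffs


# Short excerpts standing in for the two full logs
working = """Virtualization Mode : **Pass-Through**
Host Max : **N/A**
Driver Version : 537.13"""
failing = """Virtualization Mode : **None**
Host Max : **4**
Driver Version : 537.13"""

for key, ok, bad in diff_logs(working, failing):
    print(f"{key}: working={ok!r} failing={bad!r}")
```

On the full logs this would also report fields that change from run to run (timestamp, temperature, power draw, memory used), so those need to be ignored by eye.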
Hi,
I don’t think there will be much help on Proxmox, as we don’t support Proxmox as a hypervisor yet. Nor would I expect the P40 to be supported in the given server! It is crucial to use certified hardware.
Best regards
Simon
As I said in my first post, the GPU works.
In my test VM, the P40 was used by default for all applications.
In my production VM, I need to tell Windows to use the P40 for each application:
[HKEY_CURRENT_USER\Software\Microsoft\DirectX\UserGpuPreferences]
"C:\\Program Files (x86)\\Microsoft\\Edge\\Application\\msedge.exe"="GpuPreference=2;"
Why do I need to set GpuPreference in the production VM, but not in the test VM?
Is there a global parameter to force all applications to use the P40?
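As a stopgap while that question is open, the per-application entry can at least be generated for a list of executables instead of being typed by hand. This is a minimal Python sketch, not a global switch: it only writes the same `UserGpuPreferences` value shown above into `.reg` text. The function name `build_reg` and the output file name are made up for this example, and because the key lives under HKEY_CURRENT_USER the result would have to be imported for each user.

```python
def build_reg(exe_paths, preference=2):
    """Build .reg file text that sets GpuPreference for each executable path."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_CURRENT_USER\Software\Microsoft\DirectX\UserGpuPreferences]",
    ]
    for path in exe_paths:
        escaped = path.replace("\\", "\\\\")  # .reg files need doubled backslashes
        lines.append(f'"{escaped}"="GpuPreference={preference};"')
    return "\r\n".join(lines) + "\r\n"


apps = [
    r"C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe",
]
reg_text = build_reg(apps)
print(reg_text)

# To save it for import (file name is an assumption):
# with open("gpu_prefs.reg", "w", encoding="utf-16", newline="") as f:
#     f.write(reg_text)
```

Other executables can be added to `apps`; the generated line for Edge matches the entry above.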