CUDA 9.2 and Windows Server 2012 R2 64-bit driver?

Hi,

The CUDA 9.2 Installation Guide lists support for Windows Server 2012 R2. Where can I find a Quadro driver for that CUDA/OS version (64-bit)?

The Download Drivers website doesn’t seem to list any driver, certified or beta, that is recent enough to support CUDA 9.2 on the K6000/M6000/P6000.

Thanks,
T

The driver bundled with the CUDA 9.2 toolkit installer should work with Quadro K6000/M6000/P6000.

https://developer.nvidia.com/compute/cuda/9.2/Prod2/local_installers2/cuda_9.2.148_windows

I tried installing the SDK you linked (only the driver), but it still doesn’t work. It installs 398.75, which should support CUDA 9.2. Running a simple CUDA 9.2 application on Windows Server 2012 R2 64-bit with a K6000 and 398.75 results in error 30 at the first CUDA call (cudaGetDeviceCount).

Poking around on the Download Drivers site, it seems like this is a Tesla-only driver, which might be a problem?

Do you know when an ODE driver that supports CUDA 9.2 will be released for all Kepler and later GPUs? I checked GRID K2 and it seems like 370.28 is the latest for, e.g., Win 10 64-bit.

No, it’s not a Tesla-only driver.

I’m not able to comment on future releases.

Ok, thanks. Do you have any recommendation on how to proceed to figure out what’s going wrong?

Running the deviceQuery sample application from the CUDA 9.2 SDK (built from deviceQuery_vs2015.sln in Release using VS2017, without retargeting, etc.) yields:

deviceQuery.exe Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL
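
In case it helps narrow this down: error 30 is cudaErrorUnknown in the CUDA 9.x runtime, so one thing worth checking is whether the driver API fails the same way when it is called directly, with no toolkit or runtime involved. Below is a minimal sketch in Python using ctypes. It assumes a Python interpreter is available on the test server, which is an assumption and not something established in this thread. load_cuda_driver and probe_driver are hypothetical helper names; cuDriverGetVersion, cuInit and cuDeviceGetCount are the driver API entry points exported by nvcuda.dll (libcuda.so.1 on Linux).

```python
import ctypes
import sys


def load_cuda_driver():
    """Return a handle to the CUDA driver library, or None if it cannot be loaded."""
    try:
        if sys.platform == "win32":
            return ctypes.WinDLL("nvcuda.dll")
        return ctypes.CDLL("libcuda.so.1")
    except OSError:
        return None


def probe_driver():
    """Call the driver API directly and return the raw CUresult codes.

    Hypothetical helper for illustration; a code of 0 is CUDA_SUCCESS.
    """
    lib = load_cuda_driver()
    if lib is None:
        return None  # no CUDA driver library found on this machine

    report = {}

    # cuDriverGetVersion does not require cuInit to have been called.
    version = ctypes.c_int(0)
    report["cuDriverGetVersion"] = lib.cuDriverGetVersion(ctypes.byref(version))
    report["driver_cuda_version"] = version.value  # 9020 would mean CUDA 9.2

    # cuInit must succeed before any device query can work.
    report["cuInit"] = lib.cuInit(0)

    count = ctypes.c_int(0)
    report["cuDeviceGetCount"] = lib.cuDeviceGetCount(ctypes.byref(count))
    report["device_count"] = count.value
    return report


if __name__ == "__main__":
    print(probe_driver())
```

If cuInit itself returns a nonzero code here, that would suggest the problem sits in the driver rather than in how deviceQuery was built or linked.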

I’m confused by a number of things you have said:

Why would you install only the driver? Why not the whole CUDA toolkit?

Why would you build with a vs2015 solution file on vs2017? Why not use the vs2017 solution file?

Did you actually build this code on the target machine?

Some diagnostic questions:

  1. Does the driver appear to be installed properly? For example, in device manager, are there any bangs on the display adapter or any indication that it is not working properly?

  2. What is the output of:

nvidia-smi

when run on the target machine?

  3. How are you accessing the target machine? Are you logged in directly via a console on the machine, or are you using some sort of remoting method to access the machine over the network?

Sorry for the confusion, I’ll try to explain.

The Win2012R2 machine I would like to get working running CUDA 9.2 apps is an automated testing server. It just needs a driver to run stuff built for CUDA 9.2, no need for the CUDA SDK (or VS etc).

I am building deviceQuery on my local dev machine where CUDA 9.2 works great (SDK+driver). The convoluted build of deviceQuery was an attempt to match our current toolchain. The same error occurs also when running the deviceQuery application built with deviceQuery_vs2017.sln in VS2017.

  1. The machine seems to be fine. No funny stuff in Device Manager.

  2. Output of nvidia-smi:

Tue Sep 04 16:55:07 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 398.75                 Driver Version: 398.75                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro K6000       WDDM  | 00000000:03:00.0 Off |                  Off |
| 26%   31C    P8    19W / 225W |    412MiB / 12288MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       160    C+G   Insufficient Permissions                   N/A      |
|    0      4148    C+G   C:\Windows\Explorer.EXE                    N/A      |
|    0      5476    C+G   Insufficient Permissions                   N/A      |
|    0      5564    C+G   Insufficient Permissions                   N/A      |
+-----------------------------------------------------------------------------+
  3. It’s a headless machine in a datacenter. We have reproduced the problem using Remote Desktop and also PowerShell remoting.
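
For an automated testing server, the driver version and driver model check could be scripted rather than read off the table by hand. This is a rough sketch, assuming Python is available and nvidia-smi is on the PATH. parse_gpu_rows, driver_at_least and query_gpus are hypothetical names, and the minimum driver version used below is a placeholder assumption that should be replaced with the value from the CUDA release notes.

```python
import subprocess


def parse_gpu_rows(csv_text):
    """Parse 'name, driver_version, driver_model' CSV rows into dictionaries."""
    rows = []
    for line in csv_text.strip().splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) != 3:
            continue  # skip anything that is not a three-column row
        name, driver, model = fields
        rows.append({"name": name, "driver": driver, "model": model})
    return rows


def driver_at_least(driver, minimum):
    """Compare dotted driver versions numerically, e.g. '398.75' vs '398.26'."""
    def as_tuple(version):
        return tuple(int(part) for part in version.split("."))
    return as_tuple(driver) >= as_tuple(minimum)


def query_gpus():
    """Run nvidia-smi and return parsed rows; return [] if it is missing or fails."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=name,driver_version,driver_model.current",
        "--format=csv,noheader",
    ]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_gpu_rows(result.stdout)


if __name__ == "__main__":
    # ASSUMPTION: placeholder minimum; look up the real value for your
    # CUDA version in the CUDA Toolkit release notes.
    minimum = "398.26"
    for gpu in query_gpus():
        ok = driver_at_least(gpu["driver"], minimum)
        print(gpu["name"], gpu["driver"], gpu["model"], "ok" if ok else "too old")
```

A passing version check only rules out the simplest cause; on this machine 398.75 is reported correctly and CUDA calls still fail.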

I’m having the exact same issue with my Tesla K40c on my Windows Server 2012 R2 machine, which has a display connected to a Quadro K600. Everything that queries device information via other methods works fine (Device Manager, NVIDIA Control Panel, GPU-Z), but running any CUDA API application (including the statically-built deviceQuery.exe that ships with the SDK) returns an Unknown Error for every CUDA API call.

Daniel: Thanks for reproducing the issue!

More data from the last couple of days (with pre-built deviceQuery from the SDK):

  • Local console access doesn’t solve the problem
  • CUDA 9.0 and 9.1 don’t work either
  • Other Win2012R2 machines work (one with P1000 and one with M6000-24GB)

It might be that the first Win2012R2 install is broken. Or maybe that CUDA 9.x doesn’t play well with Kepler cards on that OS.

I was able to reproduce the issue. I did a fresh install of CUDA 9.2.148 on Win2012 R2 Standard w/GUI with a Quadro K5000 (Kepler family).

Working from a PowerShell console on the server, with a display attached to the Quadro card, driver 398.75 appears to be installed correctly and nvidia-smi reports 398.75 and K5000 correctly.

However, the CUDA sample apps return an unknown error, as already indicated.

When I switch to a Maxwell family card, the errors go away and the setup behaves normally.

I’ve filed an internal bug but don’t have any more details/information than that.

If you need a short-term fix, I suggest either:

  1. Reverting to CUDA 9.0 or 9.1 and the 392.00 driver that is currently available on the NVIDIA Official Drivers page.
  2. Switching your GPU from Kepler to Maxwell, Pascal, etc.

(Yes, I acknowledge you’ve already stated that CUDA 9.0/9.1 also don’t work. I haven’t tested my suggestion above, but if your statement is based on testing 9.0/9.1 with 398.75, it seems reasonable it may not work. If you’ve already tested 9.0/9.1 with the current public driver for Quadro for this OS - 392.00, then disregard my suggestion 1 and I refer you to my suggestion 2.)

Thanks for taking the time to look into this!

When trying CUDA 9.0/9.1 above I did so with 391.89. 392.00 doesn’t work either. We can change GPU temporarily to get automated testing to work on this machine.

I realize you can’t say when/if there will be a working driver/SDK. However, our release is in 1-2 months and we have customers using these OS/GPU combinations (K6000 and K5000) - it will be quite bad for us if the fix is later than that.

Any update on this issue?

It’s still very much a problem for us and occurs also with the 411.63 driver (K6000 and 2012R2).

The bug that I filed hasn’t made any forward progress. My suggestion would be to file a bug yourself at developer.nvidia.com.

You’ll then at least be aware of the bug status.