Windows Server 2012 R2 - CUDA 6.5.19 and CUDA 7.0.28 - Error Code 38

So, I have racked my brain over the last 2 days trying to get CUDA to work on our new production server running Windows Server 2012 R2.

CUDA has installed and worked fine on the other 9 machines (Windows 7 and Windows 8).

This is a fresh Windows install as of 2 days ago.

I started with a GTX 980 and CUDA 7.
nvcc -V showed the correct version of CUDA when I ran it.

nvidia-smi.exe -a showed the correct NVIDIA card.

When I go to run the deviceQuery.exe sample, I get error code 38
(cudaErrorNoDevice).

OK, that didn’t work, so let's try something else…
Uninstalled, deleted the drivers, cleaned the system, rebooted many times…

On to CUDA 6.5.19 (the build of 6.5 that supports the 9xx series).
Same results as above (ran nvcc, ran nvidia-smi, ran deviceQuery.exe),
except for the difference in the CUDA version.

OK… let's try a different card.

Replaced the 980 with a 780: same issue.
The card is fine in Windows Device Manager (no errors).
I have tried several different driver versions, including the exact same driver that's working on a Windows 7 machine with another 780 card.

PLEASE HELP ME!
Months of work will be down the drain if I cannot get this to work on our production server…

Are you logged into this server remotely, perhaps via RDP?

I use LogMeIn (on every other tested machine as well). It has not caused a problem before.

RDP, on the other hand, has always caused an issue (part of the reason I use LogMeIn).
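
A quick way to confirm which kind of session a command prompt is running in is to look at the SESSIONNAME environment variable. The Python sketch below assumes the usual Windows convention that SESSIONNAME is "Console" for a local logon and starts with "RDP-" (for example "RDP-Tcp#0") for a Remote Desktop logon. The function name `session_kind` is made up for illustration, and the variable may be missing entirely in services or scheduled tasks, so treat "unknown" as "check by hand".

```python
import os


def session_kind(env=None):
    """Classify the current session from the SESSIONNAME variable.

    Assumption: Windows sets SESSIONNAME to "Console" for a local logon
    and to a name starting with "RDP-" for a Remote Desktop logon.
    """
    env = os.environ if env is None else env
    name = env.get("SESSIONNAME", "")
    if name.upper().startswith("RDP-"):
        return "rdp"
    if name.lower() == "console":
        return "console"
    return "unknown"  # variable absent or not a recognised value


if __name__ == "__main__":
    print("session kind:", session_kind())
```

If this prints "rdp", rerun the CUDA sample from the physical console before changing anything else.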

Can you attach an actual keyboard and display to the server and try that way?

In all of this, you haven’t mentioned the actual driver you were using.

It would have been reported when you ran nvidia-smi.

You’ll need to pick a starting point. What is on the machine now?

Yes, the same thing happens at the physical machine (with all remote connections closed). Only the current user is logged in.

OK, I will stop messing with stuff trying to figure this out and stick with one configuration. After the latest clean install I'm using CUDA 7 and the 980. Below are some information printouts.

Microsoft Windows [Version 6.3.9600]
(c) 2013 Microsoft Corporation. All rights reserved.

C:\Windows\system32>nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Mon_Feb_16_23:00:53_CST_2015
Cuda compilation tools, release 7.0, V7.0.27

C:\Windows\system32>

C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi.exe -a

==============NVSMI LOG==============

Timestamp : Wed Jun 03 15:54:23 2015
Driver Version : 347.62

Attached GPUs : 1
GPU 0000:01:00.0
Product Name : GeForce GTX 980
Product Brand : GeForce
Display Mode : N/A
Display Active : N/A
Persistence Mode : N/A
Accounting Mode : N/A
Accounting Mode Buffer Size : N/A
Driver Model
Current : WDDM
Pending : WDDM
Serial Number : N/A
GPU UUID : GPU-b22f9d3f-a07a-74ed-d2bf-fa668ca78643
Minor Number : N/A
VBIOS Version : 84.04.31.00.82
MultiGPU Board : N/A
Board ID : N/A
Inforom Version
Image Version : N/A
OEM Object : N/A
ECC Object : N/A
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
PCI
Bus : 0x01
Device : 0x00
Domain : 0x0000
Device Id : 0x13C010DE
Bus Id : 0000:01:00.0
Sub System Id : 0x29833842
GPU Link Info
PCIe Generation
Max : N/A
Current : N/A
Link Width
Max : N/A
Current : N/A
Bridge Chip
Type : N/A
Firmware : N/A
Replays since reset : 0
Tx Throughput : 0 KB/s
Rx Throughput : 0 KB/s
Fan Speed : 0 %
Performance State : P8
Clocks Throttle Reasons : N/A
FB Memory Usage
Total : 4095 MiB
Used : 4000 MiB
Free : 95 MiB
BAR1 Memory Usage
Total : N/A
Used : N/A
Free : N/A
Compute Mode : Default
Utilization
Gpu : N/A
Memory : N/A
Encoder : N/A
Decoder : N/A
Ecc Mode
Current : N/A
Pending : N/A
ECC Errors
Volatile
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Aggregate
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending : N/A
Temperature
GPU Current Temp : 39 C
GPU Shutdown Temp : N/A
GPU Slowdown Temp : N/A
Power Readings
Power Management : N/A
Power Draw : N/A
Power Limit : N/A
Default Power Limit : N/A
Enforced Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : N/A
SM : N/A
Memory : N/A
Applications Clocks
Graphics : N/A
Memory : N/A
Default Applications Clocks
Graphics : N/A
Memory : N/A
Max Clocks
Graphics : N/A
SM : N/A
Memory : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Processes : N/A

C:\Program Files\NVIDIA Corporation\NVSMI>
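
When comparing several machines, it can help to pull just the interesting fields out of a printout like the one above instead of reading the whole log. The Python sketch below assumes the "Key : Value" layout shown in the log and treats any line without " : " as a section header. `parse_smi` and `SAMPLE` are hypothetical names, and the sample text is copied from the printout above.

```python
SAMPLE = """\
==============NVSMI LOG==============

Timestamp : Wed Jun 03 15:54:23 2015
Driver Version : 347.62

Attached GPUs : 1
GPU 0000:01:00.0
Product Name : GeForce GTX 980
Driver Model
Current : WDDM
Pending : WDDM
FB Memory Usage
Total : 4095 MiB
Used : 4000 MiB
Free : 95 MiB
"""


def parse_smi(text):
    """Collect 'Key : Value' pairs from `nvidia-smi -a` text.

    Each value is stored twice: under the bare key (first occurrence
    wins) and under "Section/Key", where Section is the most recent
    header line. This is a heuristic, not an official format parser.
    """
    fields = {}
    section = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("="):
            continue  # skip blanks and the banner line
        key, sep, value = line.partition(" : ")
        if not sep:
            section = line  # header such as "Driver Model"
            continue
        key, value = key.strip(), value.strip()
        fields.setdefault(key, value)
        fields[section + "/" + key] = value
    return fields


if __name__ == "__main__":
    info = parse_smi(SAMPLE)
    for name in ("Driver Version", "Product Name",
                 "Driver Model/Current", "FB Memory Usage/Used"):
        print(name, "=", info.get(name))
```

To use it on a live system, feed it the captured text of nvidia-smi.exe -a in place of SAMPLE.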


Normally at this point I run a sample from the library to test whether it's working. But CUDA 7, unlike CUDA 6.5, doesn’t come with anything in the bin/Release directory of the samples, so there is nothing to run. The solution doesn’t open on my server.

So I compiled deviceQuery.exe from the CUDA 7 samples on another machine, and I get an error saying no CUDA-capable devices were found.

This is as far as I have gotten. I have tried various driver versions and GPUs (doing clean driver uninstalls and reboots before switching anything).

I really appreciate you taking the time to help me, txbob.

You may need to compile the samples on that machine.

You may also need to run any sample as an administrator, if you are not already doing that.
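
To rule out the elevation question from a script rather than by eye, a check along these lines can be run in the same console as the sample. It is only a sketch: on Windows it calls shell32's IsUserAnAdmin, on other platforms it falls back to checking for effective UID 0, and the helper name `is_elevated` is made up for illustration.

```python
import ctypes
import os
import sys


def is_elevated():
    """Return True if the current process has administrator/root rights."""
    if sys.platform == "win32":
        # IsUserAnAdmin returns nonzero when the process token is elevated.
        return bool(ctypes.windll.shell32.IsUserAnAdmin())
    # Non-Windows fallback so the sketch also runs elsewhere.
    return os.geteuid() == 0


if __name__ == "__main__":
    print("elevated:", is_elevated())
```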

I am running them as admin (from the console). I also saw a post suggesting this as a possible issue, so I set the program to Run As Admin. No change in the result.

I don’t see why I would need to compile on that machine. I have CUDA 7 on my dev machine, and I build for Release x64. It works fine on my machine, so it should work on the server.

I could really use some help here. My next option is to format and try a fresh install again, which is probably another 12 hours of wasted time.

Any help is appreciated. Anything to help point me in the direction of what the problem might be.

I’m probably not going to be able to help much until I can find the time to grab a copy of Win Server 2012 R2 and set up a system. I don’t know when I will get to that. In the past, I’ve set up systems using Win Server 2008 R2 and Win Server 2012 without issues like you’re describing.

I’m a little skeptical of moving exe’s from machine to machine. I don’t doubt that it can be done in theory, but when I’ve tried to casually build programs in Visual Studio and move them to a machine with an otherwise identical OS but no Visual Studio installed, I’ve run into trouble, for example with WinSxS, that took a while to sort out (i.e. getting the Visual Studio build settings correct). Moving from perhaps a Win7 dev machine to a Win Server 2012 R2 machine (i.e. a different OS) seems like it might be more trouble. Therefore I think it would be useful to install CUDA 7 and then Visual Studio 2013 Community Edition (which is free, basically) and build the samples on the machine in question. But I acknowledge your point that it should be possible to build on another machine and use the exe on your server.

If I were going to move a CUDA sample from one machine to another, the one I would try is the deviceQueryDrv sample, not deviceQuery. deviceQueryDrv depends only on the GPU driver installed on the machine, and not on any of the other “typical” CUDA libraries such as cudart. That should be “more portable”. However, the error code 38 you are receiving appears to be a legitimate error reported by the driver, not directly by cudart.
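
The same driver-only idea can be tried without building anything, by calling the driver library directly. The Python sketch below is not the deviceQueryDrv sample itself; it loads the driver library (nvcuda.dll on Windows, assumed to be libcuda.so.1 on Linux) through ctypes and calls the driver API functions cuInit and cuDeviceGetCount. The helper name `probe_cuda_driver` and the status strings are made up for illustration. Note that the driver API uses its own numbering, so a missing device shows up here as CUresult 100 (CUDA_ERROR_NO_DEVICE) rather than as the runtime's 38.

```python
import ctypes
import sys


def probe_cuda_driver():
    """Return (status, device_count) using only the CUDA driver library."""
    if sys.platform == "win32":
        loader, lib_name = ctypes.WinDLL, "nvcuda.dll"
    else:
        loader, lib_name = ctypes.CDLL, "libcuda.so.1"
    try:
        cuda = loader(lib_name)
    except OSError:
        # The driver library is not installed or not on the search path.
        return ("driver library not found", None)

    rc = cuda.cuInit(0)  # must be called before any other driver call
    if rc != 0:
        return ("cuInit failed with CUresult %d" % rc, None)

    count = ctypes.c_int(0)
    rc = cuda.cuDeviceGetCount(ctypes.byref(count))
    if rc != 0:
        return ("cuDeviceGetCount failed with CUresult %d" % rc, None)
    return ("ok", count.value)


if __name__ == "__main__":
    print(probe_cuda_driver())
```

If this reports a failure while nvidia-smi still lists the card, the problem sits at the driver level rather than in cudart or in how the sample was built.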

So I don’t hold out much hope for either of the two suggestions above. I suspect the problem is somewhere else. Normally nvidia-smi is a good indicator of a usable setup, and your nvidia-smi results look OK to me. That means CUDA should be usable.

Thank you very much for your help. As I said, I have used the same setup on other machines, just under a different OS (Win 7, Win 8, Win 8.1, and I think even an XP machine). My software runs fine on all of these machines (with various CUDA installs, 6.5 and 7.0), but only after CUDA is working with the basic samples.

This is the first machine I have had any issues with since I first started working with CUDA 6 months ago. Usually it works very well.

I do understand what you’re saying about things sometimes not working when they are built on a different machine: sometimes libraries are missing, etc. I highly doubt that is the case here. deviceQuery.exe runs without any executable errors. I would expect the exe to fail to start or to crash if there were an issue with using a sample built on another machine. (Also, I tested the same exe on the other machines that CUDA works on. They all work fine and show the device info.)

I grabbed an ISO image of Windows Server 2012 R2 from the Microsoft TechNet Evaluation Center. I did a fresh install and allowed the server to go through a default configuration process without doing any updates.

At that point I ran the Windows CUDA toolkit installer (cuda_7.0.28_windows.exe). It complained that no Visual Studio was found, but otherwise the install was complete and successful.

After that I went to a separate win7 x64 machine and built the (Debug) deviceQueryDrv sample code (this is different than the deviceQuery sample) using VS2013 community and CUDA 7, and transferred the deviceQueryDrv.exe that was built to my new win server 2012 R2 machine. I executed that file and it successfully ran and reported the GPU info correctly.

I then repeated the process with the regular deviceQuery sample code. I built it on the win7 machine, moved the deviceQuery.exe sample to the win server 2012 R2 machine, and ran it there. It ran successfully and reported the GPU info correctly.

I did all this using an attached keyboard and mouse, with a display attached directly to the NVIDIA GPU on the server machine. I did all of this using an administrative account/user.

So I’m afraid I don’t know what problem you are running into, exactly.

I am at a loss. Both executables fail on my server. I don’t have access to the physical machine to try a reinstall of Windows yet. Once I do, I will let you know if it resolves the issue. Sometimes it’s something silly that is just about impossible to figure out, and the only way to resolve it is to start over.

Well, in case anyone else runs into stupid issues like this: simply format and reinstall…

5 days down the drain because of something I cannot even explain. Now I get to spend another 2-3 days getting the server set up with all of the software and updates :(