cudaMemcpy returns cudaErrorNoDevice (Fortran wrapper and cudaMemcpy)

Hi All,

Having a strange issue with cudaMemcpy. I’m writing some C wrappers for F95. Before I go off and post a bunch of code, I’ll just explain for now. Essentially, cudaGetDeviceCount(…) returns 0 (i.e. cudaSuccess), so I have a CUDA-capable device (GeForce 9500 GT with 1 GB, so compute capability 1.1). cudaGetDevice(…) gives me 0. cudaSetDevice(…) returns cudaSuccess. So all looks well. But cudaMalloc(…) and cudaMemcpy(…) are both returning cudaErrorNoDevice. Any suggestions on what I’m missing? And yes, I compiled the example Fortran-to-CUDA code from CUDA u and I have the same issues. The compiled test cases (pure C, C++) work perfectly. BTW, I’m using gfortran 4.3. No segfaults and no crashes.
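
For reference, the call sequence described above can be reduced to a small standalone check that prints the status of every runtime call. This is only a minimal sketch, not the actual wrapper code: the helper name report and the four-element test buffer are made up for illustration. It keeps the status code a call returns separate from the value the call writes through its pointer argument, since those two are easy to mix up with cudaGetDeviceCount and cudaGetDevice.

```cuda
/* Minimal sketch: print the status of each CUDA runtime call in turn. */
#include <stdio.h>
#include <cuda_runtime.h>

/* Hypothetical helper (not from the wrapper code): print one call's status.
   Returns 1 when the call succeeded, 0 otherwise. */
static int report(const char *call, cudaError_t err)
{
    printf("%-20s -> %d (%s)\n", call, (int)err, cudaGetErrorString(err));
    return err == cudaSuccess;
}

int main(void)
{
    int count = -1;                       /* written by cudaGetDeviceCount */
    int dev = -1;                         /* written by cudaGetDevice */
    float host[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float *dptr = NULL;
    int ok = 1;

    /* The return value is a status code; the count comes back via the pointer. */
    ok &= report("cudaGetDeviceCount", cudaGetDeviceCount(&count));
    printf("device count = %d\n", count);

    ok &= report("cudaSetDevice", cudaSetDevice(0));
    ok &= report("cudaGetDevice", cudaGetDevice(&dev));
    printf("current device = %d\n", dev);

    /* These are the two calls that report cudaErrorNoDevice on the failing box. */
    ok &= report("cudaMalloc", cudaMalloc((void **)&dptr, sizeof host));
    if (dptr != NULL) {
        ok &= report("cudaMemcpy",
                     cudaMemcpy(dptr, host, sizeof host, cudaMemcpyHostToDevice));
        ok &= report("cudaFree", cudaFree(dptr));
    }

    return ok ? 0 : 1;
}
```

Built with nvcc and run on both machines, the first line that differs between the two outputs should show where the working and non-working setups diverge.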

OK. I just tried on my webserver, which has the same NVIDIA card, and it works. So what am I missing? The difference in architectures is that one is a dual-core CPU (the one that is not working) and the other just a single core. I’m running this on Linux. I remember reading somewhere that the GPU resources can be locked in certain configurations, but I can’t remember the details at the moment. Any pointers greatly appreciated.

-David

OK, here’s what’s up. The problem seems to be that the GPU eventually won’t context switch into compute mode if I do a few bad things (like passing invalid pointers and generally crashing the code). If I reboot, it works just fine. Any pointers?

-David