Bug in CUDA initialization? Simple code can't see the device after xxx runs

I am plugging CUDA-based computation into MATLAB via MEX.

Everything works fine, but after some number of repeated calls the CUDA API returns an error:

no CUDA-capable device is available

I removed everything but a trivial part of the code and the error is still present, so I believe it is a problem with CUDA.

I have CUDA 2.2:

CUDA compilation tools, release 2.2, V0.2.1221

Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86

The problem occurs reproducibly on two machines:

WinXP64, Quadro NVS 160M, CUDA Driver version 185.85

WinXP64, GeForce GTX 285, CUDA Driver version 185.85

Here is the test_cuda_mex.cpp code:

[codebox]
#include <cuda_runtime.h>
#include <mex.h>
#include <stdio.h>

// Report the last CUDA runtime error (if any) to MATLAB.
void chk_error(){
	cudaError_t err = cudaGetLastError();
	if(err!=cudaSuccess){
		const char * err_str = cudaGetErrorString(err);
		mexErrMsgTxt(err_str);
	}
}

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]){
	int deviceCount = 0;
	cudaGetDeviceCount(&deviceCount);
	chk_error();
	if(deviceCount==0){
		mexErrMsgTxt("No devices??\n");
	}

	// Print the name of every device found.
	for(int i=0;i<deviceCount;++i){
		cudaDeviceProp deviceProp;
		cudaGetDeviceProperties(&deviceProp, i);
		chk_error();
		mexPrintf("%s ",deviceProp.name);
	}
	mexPrintf("\n");

	int dev;
	cudaGetDevice(&dev);
	chk_error();

	// Trivial allocation and free.
	void * V;
	cudaMalloc((void**)&V, 100);
	cudaFree(V);
	chk_error();
}

// Standalone harness: call the MEX entry point repeatedly outside MATLAB.
int main(){
	for(int i=0;i<1000;++i){
		mexFunction(0,0,0,0);
	}
	printf("TEST PASSED!\n");
	return 0;
}
[/codebox]
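
The runtime calls in the test code return a cudaError_t that is not inspected at the call site; the code relies on cudaGetLastError() afterwards. A variant that checks each return value directly could look like the sketch below. It is only an illustration: the helper name `check` is hypothetical, and it takes a plain integer status (0 meaning success, as with cudaSuccess) so that it compiles without the CUDA toolkit.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: turn a non-zero status code into an exception that
// names the call that produced it. With CUDA, `code` would be the
// cudaError_t returned by the runtime call.
inline void check(int code, const char* what) {
    if (code != 0) {
        throw std::runtime_error(std::string(what) + " failed with code " +
                                 std::to_string(code));
    }
}
```

A call site would then read, for example, `check(cudaGetDevice(&dev), "cudaGetDevice");`, so the failing call is identified by name rather than by whichever error happens to be pending.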

This is MATLAB script test_cuda.m:

[codebox]
@echo off
rem MSVC80OPTS.BAT
rem
rem Compile and link options used for building MEX-files
rem using the Microsoft Visual C++ compiler version 8.0
rem
rem $Revision: 1.1.10.2 $ $Date: 2006/06/23 19:04:53 $
rem
rem ************************************************************
rem General parameters
rem ************************************************************

set MATLAB=%MATLAB%
set VS80COMNTOOLS=%VS80COMNTOOLS%
set VSINSTALLDIR=%VS80COMNTOOLS%\..\..
set VCINSTALLDIR=%VSINSTALLDIR%\VC
set PATH=%VCINSTALLDIR%\BIN;%VCINSTALLDIR%\PlatformSDK\bin;%VSINSTALLDIR%\Common7\IDE;%VSINSTALLDIR%\SDK\v2.0\bin;%VSINSTALLDIR%\Common7\Tools;%VSINSTALLDIR%\Common7\Tools\bin;%VCINSTALLDIR%\VCPackages;%MATLAB_BIN%;%PATH%
set INCLUDE=%VCINSTALLDIR%\ATLMFC\INCLUDE;%VCINSTALLDIR%\INCLUDE;%VCINSTALLDIR%\PlatformSDK\INCLUDE;%VSINSTALLDIR%\SDK\v2.0\include;%INCLUDE%
set LIB=%VCINSTALLDIR%\ATLMFC\LIB;%VCINSTALLDIR%\LIB;%VCINSTALLDIR%\PlatformSDK\lib;%VSINSTALLDIR%\SDK\v2.0\lib;%MATLAB%\extern\lib\win32;%LIB%
set MW_TARGET_ARCH=win32

rem ************************************************************
rem Compiler parameters
rem ************************************************************

set COMPILER=cl
set COMPFLAGS=/c /Zp8 /GR /W3 /EHsc- /Zc:wchar_t- /wd4996 /DMATLAB_MEX_FILE /nologo /O2
set OPTIMFLAGS=/MT /O2 /Oy- /DNDEBUG
set DEBUGFLAGS=/MT /Zi /Fd"%OUTDIR%%MEX_NAME%%MEX_EXT%.pdb"
set NAME_OBJECT=/Fo

rem ************************************************************
rem Linker parameters
rem ************************************************************

set LIBLOC=%MATLAB%\extern\lib\win32\microsoft
set LINKER=link
set LINKFLAGS=/dll /export:%ENTRYPOINT% /MAP /LIBPATH:"%LIBLOC%" libmx.lib libmex.lib libmat.lib /implib:%LIB_NAME%.x /MACHINE:X86 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib
set LINKOPTIMFLAGS=
set LINKDEBUGFLAGS=/DEBUG /PDB:"%OUTDIR%%MEX_NAME%%MEX_EXT%.pdb"
set LINK_FILE=
set LINK_LIB=
set NAME_OUTPUT=/out:"%OUTDIR%%MEX_NAME%%MEX_EXT%"
set RSP_FILE_INDICATOR=@

rem ************************************************************
rem Resource compiler parameters
rem ************************************************************

set RC_COMPILER=rc /fo "%OUTDIR%mexversion.res"
set RC_LINKER=
set POSTLINK_CMDS=del "%OUTDIR%%MEX_NAME%.map"
set POSTLINK_CMDS1=del %LIB_NAME%.x
set POSTLINK_CMDS2=mt -outputresource:"%OUTDIR%%MEX_NAME%%MEX_EXT%";2 -manifest "%OUTDIR%%MEX_NAME%%MEX_EXT%.manifest"
set POSTLINK_CMDS3=del "%OUTDIR%%MEX_NAME%%MEX_EXT%.manifest"
[/codebox]

The environment variables MATLAB, CUDA_INC_PATH and CUDA_LIB_PATH must be set correctly.

I read the post about reporting problems, but I’m not sure whether simply starting a new topic will be enough to be noticed by the developers.

I also use the 32-bit CUDA SDK on 64-bit Windows XP Professional.

During development, I regularly have to reboot my machine because after a certain number of launches of programs (even SDK samples) the CUDA device fails to initialize its context.

This has to be related to some kind of memory starvation. It just runs out of resources to use.

Christian

You might have a problem with the driver.
Mine usually runs for weeks and weeks without rebooting, until the next MS patches.

This problem was reported (with a proper problem report in the forums). cudaMalloc() fails after 14 or 15 application launches, no matter how much you allocate.

A fix was promised, but it looks like it never arrived. In retrospect, I made a prudent move when I went back to 32-bit XP.

I'm using a Quadro CX card with the CUDA 2.2 drivers, toolkit and SDK on a Windows XP 32-bit system with 1 GB of RAM.

It works fine, but after some iterations cudaMalloc() returns the error “cudaErrorInvalidDevicePointer”.

Simple code:

int* iPtr;
cudaError error = cudaMalloc( (void**)&iPtr, sizeof(int) );
if ( error != cudaSuccess ) error = cudaFree(iPtr);
else return error;

After some iterations cudaMalloc() returns the above error.

Is it a bug in CUDA?

When error != cudaSuccess, why would you free the pointer? It was never allocated in the first place.
Also, you return the “error” value only in the success case. In the failure case, you don't return anything.
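
The two points above (do not free a pointer whose allocation failed, and return a status on every path) can be sketched as below. This is only an illustration under assumptions: the function name `warm_up` is hypothetical, and the allocator and deallocator are passed in as callables that return 0 on success, so the sketch compiles without the CUDA toolkit. In real code they would wrap cudaMalloc() and cudaFree(), where cudaSuccess is 0.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: allocate a few bytes and release them again.
// alloc(void** out, size_t bytes) and release(void* p) are assumed to
// return 0 on success and a non-zero error code on failure.
template <class Alloc, class Release>
int warm_up(Alloc alloc, Release release) {
    void* p = nullptr;
    int err = alloc(&p, sizeof(int));
    if (err != 0) {
        return err;      // allocation failed: there is nothing to free
    }
    return release(p);   // the success path returns a status as well
}
```

With the CUDA runtime, the two callables could be lambdas such as `[](void** q, std::size_t n) { return (int)cudaMalloc(q, n); }` and `[](void* q) { return (int)cudaFree(q); }`.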

Sorry, I wrote the code down wrongly. The code is like this:

int* iPtr;
cudaError error = cudaMalloc( (void**)&iPtr, sizeof(int) );
if ( error == cudaSuccess ) error = cudaFree(iPtr);
else return error;

So, on success, do you return something?

By the way, how do you know that a failure is happening? Please post the entire code that causes the problem (I hope it is small). We can check that out.

Here is the rough code…

main(){
int result = EnlargeImage( inputFile );
return result;
}

int EnlargeImage( char* srcFile )
{
long srcW = 500;
long srcH = 300;
long dstW = 1500;
long dstH = 900;

int* iPtr;
cudaError error = cudaMalloc( (void**)&iPtr, sizeof(int)*2);
if ( error == cudaSuccess ) error = cudaFree(iPtr);
else return 21;

error = LoadInputImage(inputFile);
if ( error != cudaSuccess ) return 22 ;

error = ResizeImage(inputFile);
return 23;
}

The code is called repeatedly for n frames of a video file.
It returns 21 after some number X of iterations.
The numbers 21, 22 and 23 are used only to identify where the code returns from.

The memory allocation for “iPtr” in the above code is there to absorb the delay of the first cudaMalloc() call.

I recommend you write a small program (independent of images and resizing) that reproduces the problem and post it on the site with full source code.
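
One possible shape for such a stand-alone reproducer is a loop that repeats a single operation and reports the iteration at which it first fails. The sketch below is an assumption about how to structure it, not the poster's code: `Failure` and `first_failure` are hypothetical names, and the operation is any callable returning 0 on success (for CUDA, a lambda that calls cudaMalloc() followed by cudaFree()).

```cpp
#include <cassert>

// Result of a repeated-call experiment.
struct Failure {
    int iteration;  // zero-based index of the failing call, -1 if none failed
    int code;       // error code returned by that call, 0 if none failed
};

// Call op() up to max_iterations times and stop at the first non-zero result.
template <class Op>
Failure first_failure(Op op, int max_iterations) {
    for (int i = 0; i < max_iterations; ++i) {
        int err = op();
        if (err != 0) {
            return Failure{i, err};
        }
    }
    return Failure{-1, 0};
}
```

Printing both fields after the loop gives the iteration count and the error code in one run, which is the information asked for in this thread.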

Please note that the initial post is about a different problem. No cudaMalloc() failures were observed there, even after thousands of runs of the application. Instead, something goes wrong when loading and unloading a DLL that links to CUDA.