Hi all, I am having a problem that I haven’t gotten to the bottom of yet, but maybe someone has some pointers or tips.
I am using the matlab_cuda tool to offload some calculations to CUDA. My experience has been good so far, but now I have hit a glitch.
In my CUDA code I allocate two cudaArrays, copy some video data to them, and process it. The CUDA mexa64 file (the equivalent of a DLL on Windows) is called from a MATLAB loop. The first call works fine, but on the second call the allocation of the first cudaArray fails, and MATLAB crashes because my code currently tries to copy the data anyway.
My code does something along these lines:
// Copy video to textures
float *f_zd_real, *f_zd_imag;
f_zd_real = (float *) mxMalloc(elements * sizeof(float));
convert_double2float(zd_real, f_zd_real, elements);
f_zd_imag = (float *) mxMalloc(elements * sizeof(float));
convert_double2float(zd_imag, f_zd_imag, elements);

cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc(32, 0, 0, 0, cudaChannelFormatKindFloat);
cudaArray *re_array, *im_array;
CUDA_SAFE_CALL( cudaMallocArray( &re_array, &channelDesc, sizex, sizey ));
CUDA_SAFE_CALL( cudaMemcpyToArray( re_array, 0, 0, f_zd_real, elements * sizeof(float), cudaMemcpyHostToDevice));
CUDA_SAFE_CALL( cudaMallocArray( &im_array, &channelDesc, sizex, sizey ));
CUDA_SAFE_CALL( cudaMemcpyToArray( im_array, 0, 0, f_zd_imag, elements * sizeof(float), cudaMemcpyHostToDevice));

// set texture parameters (addressMode is an array with one entry per dimension)
zd_re.addressMode[0] = cudaAddressModeClamp;
zd_re.addressMode[1] = cudaAddressModeWrap;
zd_re.filterMode = cudaFilterModePoint; // cudaFilterModeLinear;
zd_re.normalized = false; // access without normalized texture coordinates
zd_im.addressMode[0] = cudaAddressModeClamp;
zd_im.addressMode[1] = cudaAddressModeWrap;
zd_im.filterMode = cudaFilterModePoint; // cudaFilterModeLinear;
zd_im.normalized = false; // access without normalized texture coordinates

// Bind the arrays to the textures
CUDA_SAFE_CALL( cudaBindTextureToArray( zd_re, re_array, channelDesc));
CUDA_SAFE_CALL( cudaBindTextureToArray( zd_im, im_array, channelDesc));

<here I call my kernel>

// Release the textures, the arrays, and the host buffers
CUDA_SAFE_CALL(cudaUnbindTexture(zd_re));
CUDA_SAFE_CALL(cudaUnbindTexture(zd_im));
CUDA_SAFE_CALL(cudaFreeArray(re_array));
CUDA_SAFE_CALL(cudaFreeArray(im_array));
mxFree(f_zd_real);
mxFree(f_zd_imag);
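To at least stop MATLAB from crashing when the allocation fails, the allocate-and-copy step could check the return codes itself instead of copying anyway. This is only a minimal sketch that I have not run: upload_to_array is a hypothetical helper name, not something in my current code, and it assumes the same sizex, sizey and channelDesc as above.

```cuda
#include <cuda_runtime.h>
#include "mex.h"

// Hypothetical helper: allocate a 2D float cudaArray and upload host data.
// Returns false (and leaves *arr set to NULL) if any CUDA call fails, so the
// caller never copies into an array that was not allocated.
static bool upload_to_array(cudaArray **arr,
                            const cudaChannelFormatDesc *desc,
                            const float *host,
                            size_t sizex, size_t sizey)
{
    *arr = NULL;

    cudaError_t err = cudaMallocArray(arr, desc, sizex, sizey);
    if (err != cudaSuccess) {
        mexPrintf("cudaMallocArray failed: %s\n", cudaGetErrorString(err));
        *arr = NULL;
        return false;
    }

    // Same copy call as in my code above (newer toolkits mark it deprecated).
    err = cudaMemcpyToArray(*arr, 0, 0, host,
                            sizex * sizey * sizeof(float),
                            cudaMemcpyHostToDevice);
    if (err != cudaSuccess) {
        mexPrintf("cudaMemcpyToArray failed: %s\n", cudaGetErrorString(err));
        cudaFreeArray(*arr);   // do not leak the array on a failed copy
        *arr = NULL;
        return false;
    }
    return true;
}
```

In mexFunction I would then call something like `if (!upload_to_array(&re_array, &channelDesc, f_zd_real, sizex, sizey)) mexErrMsgTxt("upload of real part failed");`, which returns control to MATLAB with an error message instead of continuing.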
So the memory is freed before I return to MATLAB, and I would expect the second call not to fail, since the first time there was enough memory for both arrays.
Does anybody have any idea what I might be doing wrong?
I am on an 8600M GT with 512 MB at the moment, which should be plenty.
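One thing I could do to narrow this down is print the CUDA error state and the free device memory right after the kernel and again after the cleanup. As far as I understand, an error from a kernel launch can surface on a later, unrelated CUDA call, so this would show whether the failing allocation is really a memory problem. Again this is only a sketch: report_cuda_state is a hypothetical helper, and it assumes a CUDA runtime recent enough to provide cudaMemGetInfo and cudaDeviceSynchronize.

```cuda
#include <cuda_runtime.h>
#include "mex.h"

// Hypothetical diagnostic helper: wait for outstanding GPU work, then print
// any pending error and the amount of free device memory.
static void report_cuda_state(const char *label)
{
    // Kernel launches are asynchronous, so synchronize before checking.
    cudaError_t err = cudaDeviceSynchronize();
    if (err == cudaSuccess) {
        err = cudaGetLastError();
    }
    if (err != cudaSuccess) {
        mexPrintf("[%s] pending CUDA error: %s\n",
                  label, cudaGetErrorString(err));
    }

    size_t free_bytes = 0, total_bytes = 0;
    err = cudaMemGetInfo(&free_bytes, &total_bytes);
    if (err == cudaSuccess) {
        mexPrintf("[%s] free device memory: %.1f MB of %.1f MB\n",
                  label,
                  free_bytes / (1024.0 * 1024.0),
                  total_bytes / (1024.0 * 1024.0));
    } else {
        mexPrintf("[%s] cudaMemGetInfo failed: %s\n",
                  label, cudaGetErrorString(err));
    }
}
```

Calling `report_cuda_state("after kernel")` and `report_cuda_state("after free")` in each MEX invocation should show whether the free memory returns to its starting value between calls, or whether an error is already pending before the second allocation.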