Problem with cudaMemcpy in CUDA 1.1

formuminator · April 16, 2008, 12:27pm

Hi all,

I have a DELL Latitude D630 with a Quadro NVS 135M. I waited a long time for DELL to post “DELL laptop compliant” NVIDIA drivers that support CUDA 1.1 (version > 169…).

They are finally here, in version 174.31! I’ve installed them, along with the new CUDA toolkit and SDK.

Unfortunately, the CUDA samples no longer work with that version. They freeze for 2-3 seconds on CUDA_DEVICE_INIT…
All the tests run in those samples fail, and the 3D samples don’t display anything.

I’ve written a simple piece of code that compares array values before and after a copy from host to device and back from device to host. This test fails every time.

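// Host buffers: source data and the copy read back from the device
// (mem_size, size_x and size_y are defined elsewhere; filling h_idata1 is not shown)
float* h_idata1 = (float*) malloc(mem_size);
float* h_idata2 = (float*) malloc(mem_size);

// Device buffer
float* d_idata;
CUDA_SAFE_CALL( cudaMalloc( (void**) &d_idata, mem_size));

// Round trip: host -> device -> host
CUDA_SAFE_CALL( cudaMemcpy( d_idata, h_idata1, mem_size,
                            cudaMemcpyHostToDevice) );
CUDA_SAFE_CALL( cudaMemcpy( h_idata2, d_idata, mem_size,
                            cudaMemcpyDeviceToHost) );

// Compare with a tolerance of 0.1
for( unsigned int i = 0; i < (size_x * size_y); ++i) {
    if ((h_idata2[i] > h_idata1[i] + 0.1f) || (h_idata2[i] < h_idata1[i] - 0.1f)) {
        printf("memcpy test : FAILED\n");
        break;
    }
}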
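For anyone who wants to reproduce this, here is a self-contained sketch of the same round trip. It is not the poster's full program: the 32x128 size is taken from later in the thread, the input is filled with known values so the comparison is meaningful, and `check` and `count_mismatches` are hypothetical helpers made up for this example. Each runtime call is checked explicitly because, as far as I know, the SDK's CUDA_SAFE_CALL macro only tests return codes in debug builds, so a failing cudaMalloc could go unnoticed in a release build.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cmath>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the SDK): report a failed runtime call and exit.
static void check(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Host-only helper: count elements that differ by more than tol.
static unsigned int count_mismatches(const float* a, const float* b,
                                     unsigned int n, float tol) {
    unsigned int bad = 0;
    for (unsigned int i = 0; i < n; ++i) {
        if (fabsf(a[i] - b[i]) > tol) ++bad;
    }
    return bad;
}

int main() {
    const unsigned int size_x = 32, size_y = 128;   // sizes mentioned in the thread
    const unsigned int n = size_x * size_y;
    const size_t mem_size = n * sizeof(float);      // 16384 bytes

    float* h_in  = (float*) malloc(mem_size);
    float* h_out = (float*) malloc(mem_size);
    if (h_in == NULL || h_out == NULL) {
        fprintf(stderr, "host malloc failed\n");
        return EXIT_FAILURE;
    }
    for (unsigned int i = 0; i < n; ++i) {
        h_in[i]  = (float) i;   // known pattern
        h_out[i] = -1.0f;       // sentinel: shows whether the copy back wrote anything
    }

    float* d_data = NULL;
    check(cudaMalloc((void**) &d_data, mem_size), "cudaMalloc");
    check(cudaMemcpy(d_data, h_in, mem_size, cudaMemcpyHostToDevice), "cudaMemcpy H2D");
    check(cudaMemcpy(h_out, d_data, mem_size, cudaMemcpyDeviceToHost), "cudaMemcpy D2H");

    unsigned int bad = count_mismatches(h_in, h_out, n, 0.1f);
    printf("memcpy test: %s (%u mismatches)\n", bad == 0 ? "PASSED" : "FAILED", bad);

    cudaFree(d_data);
    free(h_in);
    free(h_out);
    return bad == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

If one of the calls fails, the printed error string usually says more than a bare FAILED from the comparison loop, for example whether the allocation itself was rejected.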

Do you have any idea what causes this issue? Is there something wrong with CUDA support in the Quadro NVS drivers?
And how can I check my CUDA installation (which DLLs are used…)?

e.ping · April 16, 2008, 5:17pm

You likely only have 128MB of video memory. That isn’t really sufficient to use CUDA. What is reported when you run the deviceQuery SDK sample?

formuminator · April 17, 2008, 9:32am

This is the output of deviceQuery :

Major revision number: 1
Minor revision number: 1
Total amount of global memory: 133890048 bytes
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 800000 kilohertz

In my sample code, I copy a 32x128 float array. That’s only 32x128x4 = 16384 bytes.

formuminator · April 17, 2008, 9:53am

CUDA 1.0 works fine on my machine. Did memory consumption change in CUDA 1.1?

And one more question: how can I check that CUDA is correctly installed on my machine?
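One way to sanity-check an installation, besides running deviceQuery, is to ask the runtime which driver and runtime versions it sees and which devices it can enumerate. The sketch below assumes a toolkit recent enough to provide cudaDriverGetVersion and cudaRuntimeGetVersion (I believe these were added after CUDA 1.1, so they may not compile against the 1.1 headers); `split_version` is a hypothetical helper for this example. On Windows the two components involved are normally nvcuda.dll (installed by the display driver) and cudart.dll (shipped with the toolkit), so a mismatch between the two version numbers is worth looking at.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: CUDA encodes versions as 1000*major + 10*minor.
static void split_version(int v, int* major, int* minor) {
    *major = v / 1000;
    *minor = (v % 1000) / 10;
}

int main() {
    int driver = 0, runtime = 0, maj = 0, min = 0;

    // Driver version stays 0 if no CUDA-capable driver is installed.
    cudaDriverGetVersion(&driver);
    cudaRuntimeGetVersion(&runtime);
    split_version(driver, &maj, &min);
    printf("Driver supports CUDA %d.%d\n", maj, min);
    split_version(runtime, &maj, &min);
    printf("Runtime is CUDA %d.%d\n", maj, min);

    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int d = 0; d < count; ++d) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, d);
        if (err != cudaSuccess) {
            fprintf(stderr, "device %d: %s\n", d, cudaGetErrorString(err));
            continue;
        }
        printf("Device %d: %s, compute capability %d.%d, %llu bytes global memory\n",
               d, prop.name, prop.major, prop.minor,
               (unsigned long long) prop.totalGlobalMem);
    }
    return 0;
}
```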

formuminator · May 16, 2008, 5:35am

But the Quadro NVS 135M is on the list of GPUs that support CUDA…

Where can I find information about CUDA memory limitations?
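On the memory question, the total reported by deviceQuery above (133890048 bytes, about 127.7 MiB) is not what an application can actually allocate, since the display also uses part of it. A rough sketch for checking what is free at run time follows. It assumes a current toolkit where the runtime API offers cudaMemGetInfo; I believe CUDA 1.1 only exposed the equivalent through the driver API (cuMemGetInfo), so treat this as an illustration rather than something that builds against 1.1. `to_mib` is a hypothetical helper.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: convert a byte count to MiB for printing.
static double to_mib(size_t bytes) {
    return (double) bytes / (1024.0 * 1024.0);
}

int main() {
    size_t free_bytes = 0, total_bytes = 0;

    // Queries the current device; this also creates a context if none exists yet.
    cudaError_t err = cudaMemGetInfo(&free_bytes, &total_bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemGetInfo failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("Global memory: %.1f MiB free of %.1f MiB total\n",
           to_mib(free_bytes), to_mib(total_bytes));
    return 0;
}
```

If the free figure is close to zero before the test allocates anything, that would support the suggestion earlier in the thread that the card simply has too little memory left for CUDA.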