I am wondering why I can’t use more than 120MB of VRAM on a GT330 with 256MB of VRAM.
I also ran the deviceQuery program from the SDK examples, which likewise reports
256MB of global memory.
I used the following program to test how much memory I can upload to the GPU:
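[codebox]#include <cstdio>
#include <cstdlib>
#include <cutil_inline.h>
#include <cutil_math.h>
// Empty kernel: only used to check that a launch works once the buffer is allocated.
__global__ void testKernel(float *data)
{
}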
#define DATA_MB 130u
#define DATA_SIZE ((DATA_MB * 1024u * 1024u) / sizeof(float))
int main()
{
printf("Transferring %.2fMB data to VRAM\n", (DATA_SIZE * sizeof(float)) / (1024.0 * 1024.0));
float *h_data = new float[DATA_SIZE];
for (unsigned int i=0; i < DATA_SIZE; ++i)
h_data[i] = drand48();
float *d_data;
cutilSafeCall(cudaMalloc((void **)&d_data, DATA_SIZE * sizeof(float)));
cutilSafeCall(cudaMemcpy(d_data, h_data, DATA_SIZE * sizeof(float), cudaMemcpyHostToDevice));
printf("Invoking kernel\n");
dim3 blckDim = dim3(16, 16, 1);
dim3 gridDim = dim3(4, 4, 1);
testKernel<<<gridDim, blckDim>>>(d_data);
printf("Cleanup\n");
cutilSafeCall(cudaFree(d_data));
delete[] h_data;
printf("Checking error\n");
cudaError_t err = cudaGetLastError();
if(cudaSuccess != err)
{
fprintf(stderr, "Cuda error: %s\n", cudaGetErrorString(err));
return -1;
}
return 0;
}[/codebox]
Whenever I try to allocate more than 120MB, the program aborts with an ‘out of memory’ error.
As far as I can tell, there is nothing wrong with the code snippet above. Can anyone explain what is going on?
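To narrow this down, here is a minimal sketch that asks the runtime how much memory it reports as free and then probes for the largest single cudaMalloc that succeeds. It assumes the installed runtime provides cudaMemGetInfo. The helper toMB is just an illustrative name, not part of the SDK. If the card also drives the display, part of the 256MB may already be in use, so the free figure could be well below the total.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: convert a byte count to MB for printing.
static double toMB(size_t bytes)
{
    return bytes / (1024.0 * 1024.0);
}

int main()
{
    size_t freeBytes = 0, totalBytes = 0;

    // Ask the runtime how much device memory is currently free.
    cudaError_t err = cudaMemGetInfo(&freeBytes, &totalBytes);
    if (err != cudaSuccess)
    {
        fprintf(stderr, "cudaMemGetInfo failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Free: %.2fMB of %.2fMB total\n", toMB(freeBytes), toMB(totalBytes));

    // Probe for the largest single allocation, stepping down 1MB at a time.
    // Fragmentation can make this smaller than the reported free amount.
    const size_t MB = 1024u * 1024u;
    for (size_t size = freeBytes; size >= MB; size -= MB)
    {
        void *p = NULL;
        if (cudaMalloc(&p, size) == cudaSuccess)
        {
            printf("Largest single allocation: %.2fMB\n", toMB(size));
            cudaFree(p);
            break;
        }
        cudaGetLastError(); // clear the error left by the failed allocation
    }
    return 0;
}
```

Comparing the two printed numbers with the 130MB requested above should show whether the limit comes from memory that is already in use or from the size of a single allocation.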
Thanks in advance.