Hi,
are there differences between the GeForce 310M and Tesla devices (Tesla C1060 or Tesla C2050) when using cudaMalloc?
I want to allocate the whole global memory of the device.
Example for GeForce 310M:
My code
cudaMalloc(mem,1054437333*sizeof(char));
allocates the memory (btw. 1054437333 != 1073479680, but with one byte more the allocation failed). Then I query the memory status.
Output:
GPU memory usage: used = 1023.750000, free = 0.000000 MB, total = 1023.750000 MB <<< OK :) that is what I want.
But with a Tesla device (Tesla C1060 or Tesla C2050) I can't allocate the memory.
Example for Tesla C1060:
My code
cudaMalloc(mem,3294770688*sizeof(char));
allocates the memory. Then I query the memory status.
Output:
GPU memory usage: used = 40.746338, free = 4055.066162 MB, total = 4095.812500 MB FAIL :( used should be 3.x GB
Greets
Sven
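A minimal sketch of this "grab everything" experiment (my own illustration, not the poster's code; it assumes the runtime API's cudaMemGetInfo and backs off in 1 MiB steps, since the reported free size is not always allocatable as one contiguous block):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    size_t free_b = 0, total_b = 0;
    cudaMemGetInfo(&free_b, &total_b);  // current free/total device memory in bytes

    // Try to allocate all reported free memory; on failure, back off by
    // 1 MiB and retry, since fragmentation can make the full amount
    // unavailable in a single piece.
    void*  dev_mem = nullptr;
    size_t want    = free_b;
    cudaError_t err = cudaErrorMemoryAllocation;
    while (want > 0 && (err = cudaMalloc(&dev_mem, want)) != cudaSuccess) {
        want = (want > (1u << 20)) ? want - (1u << 20) : 0;
    }
    if (err == cudaSuccess) {
        printf("allocated %zu of %zu free bytes (total %zu)\n", want, free_b, total_b);
        cudaFree(dev_mem);
    }
    return 0;
}
```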
cudaMalloc(mem,3294770688*sizeof(char));
A “naked” constant like that will be treated as a signed integer, which has a maximum value of 2147483647. So you are probably the victim of integer overflow. Try explicitly casting the constant to a size_t, or specifying the constant as an unsigned long, so either:
cudaMalloc(mem,(size_t)3294770688*sizeof(char));
or
cudaMalloc(mem,3294770688ul*sizeof(char));
and see what happens.
Neither solution worked. :( The output is always used = 40.746338, free = 4055.066162 MB, total = 4095.812500 MB…
What operating system is this on? And what status are the cudaMalloc calls returning?
//edit Sorry, it is not cudaSuccess!!! I get a segmentation fault.
cudaError_t cuda_status2 = cudaMalloc(mem,3004437333ul*sizeof(char)); <<< seg fault follows
cudaMalloc(mem,3004437333ul*sizeof(char)); <<< no seg fault
[b]//edit2
cudaError_t cuda_status2 << a second variable of type cudaError_t caused the error (seg fault).
Status is cudaSuccess!!!
printf(cudaGetErrorString(cuda_status)) → no error!!!
[/b]
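For completeness, a common way to check the status of every call (a sketch; the CUDA_CHECK macro name is my own, only cudaError_t and cudaGetErrorString come from the runtime API):

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Wrap every runtime call; on failure, print the API's own error string
// with the file/line of the failing call and abort.
#define CUDA_CHECK(call)                                          \
    do {                                                          \
        cudaError_t err_ = (call);                                \
        if (err_ != cudaSuccess) {                                \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,    \
                    cudaGetErrorString(err_));                    \
            exit(EXIT_FAILURE);                                   \
        }                                                         \
    } while (0)

int main() {
    void* dev_mem = nullptr;
    CUDA_CHECK(cudaMalloc(&dev_mem, (size_t)3004437333ul));
    CUDA_CHECK(cudaFree(dev_mem));
    return 0;
}
```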
The system is Ubuntu Server.
Linux cuda 2.6.35-22-server #33-Ubuntu SMP Sun Sep 19 20:48:58 UTC 2010 x86_64 GNU/Linux
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2010 NVIDIA Corporation
Built on Wed_Nov__3_16:16:57_PDT_2010
Cuda compilation tools, release 3.2, V0.2.1221
CUDA Driver Version: 3.20
CUDA Runtime Version: 3.20
CUDA Capability Major/Minor version number: 1.3
With the following code I get
int a = 0;
while (a < 140) {
    double inputData[999999];
    double *dev_inputData;
    cudaMalloc((void**)&dev_inputData, 999999*sizeof(double));
    cudaMemcpy(dev_inputData, inputData, 999999*sizeof(double), cudaMemcpyHostToDevice);
    a++;
}
GPU memory usage: used = 1116.938232, free = 2978.874268 MB, total = 4095.812500 MB
but with the code
void** mem;
void* dev_mem;
cudaMalloc(mem,(999999*140)*sizeof(double));
cudaMemcpy(dev_mem,mem,(999999*140)*sizeof(double),cudaMemcpyHostToDevice);
GPU memory usage: used = 40.746338, free = 4055.066162 MB, total = 4095.812500 MB
mhm ok.
Solution for Tesla
void** mem;
cudaMalloc(&mem,1054437333*sizeof(char));
This worked on the Tesla device, but not on my GeForce 310M.
Solution for GeForce 310M (without &)
void** mem;
cudaMalloc(mem,1054437333*sizeof(char));
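For reference, the runtime declares cudaError_t cudaMalloc(void** devPtr, size_t size), so the same pattern applies on both GPUs: declare a single pointer and pass its address. Calling cudaMalloc(mem, ...) with an uninitialized void** mem writes the device address through a garbage pointer, which is undefined behavior, so crashing on one card and "working" on another is coincidence rather than a device difference. A sketch of the documented usage:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    void* mem = nullptr;  // one pointer, same declaration on both GPUs
    cudaError_t status =
        cudaMalloc(&mem, 1054437333 * sizeof(char));  // pass the pointer's address
    printf("%s\n", cudaGetErrorString(status));
    if (status == cudaSuccess) {
        cudaFree(mem);
    }
    return 0;
}
```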