I have run into allocation problems with my program. I have a GeForce GT 8800. The NVIDIA X Server Settings window tells me I have 512 MB of onboard memory, which is correct. I installed the 260.19.26 developer drivers and I use the CUDA Toolkit 3.2. My application requires me to perform least squares on huge matrices of floats, for which I use the CULA library.
The current example I’m working with produces a matrix with 44 columns. As soon as I pass roughly 1,000,000 rows, the program fails. I inserted asserts, and they produce the following:
mem::Linear::Linear(const T*, size_t) [with T = float, size_t = long unsigned int]: Assertion `cudaMalloc(&_dev,_size*sizeof(T)) == cudaSuccess' failed.
Linear is a template class I wrote as a wrapper for linear memory on the GPU. 44 × 1,000,000 × 4 = 176,000,000 bytes, which is only about 176 MB (assuming a float is 4 bytes)?
What am I doing wrong? I have the feeling that not all of my memory is actually available…
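For reference, this is a minimal standalone check I can run before the failing allocation to see how much device memory the runtime actually reports as free (a sketch using the CUDA runtime API's cudaMemGetInfo; the variable names are mine, not from my wrapper):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    size_t free_bytes = 0, total_bytes = 0;
    // Ask the runtime how much device memory is free vs. total.
    cudaError_t err = cudaMemGetInfo(&free_bytes, &total_bytes);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaMemGetInfo failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }
    std::printf("free: %zu MB, total: %zu MB\n",
                free_bytes / 1000000, total_bytes / 1000000);

    // Try the same allocation size that fails inside my wrapper.
    const size_t bytes = 44UL * 1000000UL * sizeof(float);  // 176,000,000 bytes
    float* dev = nullptr;
    err = cudaMalloc(&dev, bytes);
    std::printf("cudaMalloc(%zu bytes): %s\n",
                bytes, cudaGetErrorString(err));
    if (err == cudaSuccess) cudaFree(dev);
    return 0;
}
```

If the free figure reported here is much lower than 512 MB (the X server and desktop also consume VRAM), that might explain the failure even though 176 MB sounds small.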