Help: device memory allocation failed

I am running CUDA on a GeForce 8800 GT card and recently ran the simpleCUBLAS sample project with different matrix sizes. When the matrix size is 6500*6500, the program halts, saying device memory allocation failed. Is this a problem in the program, or is the device memory not enough to hold the matrices?



6500 * 6500 * 4 bytes per float * 3 matrices = 507 MB. Assuming your 8800 GT is the common 512 MB version, that leaves almost nothing for the overhead of the CUDA context (and potential memory fragmentation), so yeah, you’re out of memory.
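
If you want to check other sizes before running the sample, the arithmetic above can be sketched as a couple of small C helpers. This is only a rough estimate: the names matrix_bytes and headroom_bytes are made up for illustration, and the 512 MiB total used below is an assumption about the card, not something read from the device.

```c
/* Bytes needed for `count` square n x n matrices whose elements
 * are `elem_size` bytes each (hypothetical helper, not part of CUBLAS). */
unsigned long long matrix_bytes(unsigned long long n,
                                unsigned long long count,
                                unsigned long long elem_size)
{
    return n * n * count * elem_size;
}

/* Bytes left after the matrices are allocated, assuming `total` bytes
 * of device memory. A negative result means the matrices alone do not fit.
 * Context overhead and fragmentation are NOT accounted for here. */
long long headroom_bytes(unsigned long long total, unsigned long long needed)
{
    return (long long)total - (long long)needed;
}
```

For example, matrix_bytes(6500, 3, sizeof(float)) gives 507000000, and headroom_bytes(512ULL * 1024 * 1024, 507000000ULL) gives 29870912, so under 30 MB would be left for everything else on the card. On recent toolkits, calling cudaMemGetInfo at run time should give the actual free and total figures to plug in instead of the assumed 512 MiB.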