CUDA memory size and types: allocating large chunks of memory

Hi all,

I’m fairly new to CUDA, so please bear with me. Here is my current dilemma:

I am trying to do a large number of calculations on a large amount of data. I have a 512x512 array of ints, and would like to return a 512x512x512 matrix of data.

Now, I can’t allocate an array that large on the GPU. I run out of memory.

I call cudaMalloc() with 512*512*512*sizeof(float) and that gives me an out-of-memory error.
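
For reference, here is a minimal sketch of that allocation with the error path made explicit, using the standard CUDA runtime calls (cudaMemGetInfo reports how much global memory is actually free before you try):

```c
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // How much global memory is free right now vs. the card's total.
    size_t freeB = 0, totalB = 0;
    cudaMemGetInfo(&freeB, &totalB);
    printf("free: %zu MB, total: %zu MB\n", freeB >> 20, totalB >> 20);

    // 512 * 512 * 512 floats = 2^27 elements * 4 bytes = 512 MB.
    size_t bytes = 512ULL * 512 * 512 * sizeof(float);
    float *d_out = nullptr;
    cudaError_t err = cudaMalloc(&d_out, bytes);
    if (err != cudaSuccess) {
        printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    // ... launch kernels, copy results back ...
    cudaFree(d_out);
    return 0;
}
```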

My current understanding is that this is global memory, which I also believe is the largest pool of memory on the device.

Is there a way I can allocate this amount of memory to store data in? Or do I have to change my method?

Thanks for any help.

You could buy a GPU with more RAM… 512^3 floats are only about half a gig :)
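
(To spell out the arithmetic: 512 × 512 × 512 floats × 4 bytes each = 2^29 bytes = exactly 512 MB.)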

Heheh, alright. If that’s the deal and my understanding of memory usage was correct, then that’s fine.

Thanks for the reply.

How about 10 GB? In that case, what’s the solution?

I hope two or more devices with SLI could solve this. (Or is that called a GPU cluster?)

SLI does nothing for CUDA. You have to control each CUDA device individually. If you have more data than can fit in GPU memory, you have to partition the data and send a piece to each GPU, or manually swap pieces in and out of GPU memory.
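
To illustrate the swapping approach: if the full 512^3 result won’t fit on the device, you can keep only one z-slab of the output resident, launch the kernel once per slab, and copy each slab back to the host before reusing the buffer. A rough sketch, assuming a hypothetical compute_slab kernel standing in for the real per-element calculation (error checks omitted for brevity):

```c
#include <cstdlib>
#include <cuda_runtime.h>

#define N    512
#define SLAB 64   // process 64 z-planes at a time -> 64 MB resident per slab

// Hypothetical kernel: fills one z-slab of the output from the 512x512 input.
__global__ void compute_slab(const int *in, float *out, int z0) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int z = blockIdx.z;   // local z index within the slab
    if (x < N && y < N) {
        // Placeholder computation; the real math goes here.
        out[(size_t)z * N * N + (size_t)y * N + x] = (float)in[y * N + x] + (z0 + z);
    }
}

int main() {
    static int h_in[N * N];   // 512x512 input, filled in elsewhere
    float *h_out = (float *)malloc((size_t)N * N * N * sizeof(float));

    int   *d_in;
    float *d_slab;
    cudaMalloc(&d_in,   sizeof(h_in));
    cudaMalloc(&d_slab, (size_t)SLAB * N * N * sizeof(float));   // only one slab on the GPU
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((N + 15) / 16, (N + 15) / 16, SLAB);

    // Stream the 512^3 output through the device one slab at a time.
    for (int z0 = 0; z0 < N; z0 += SLAB) {
        compute_slab<<<grid, block>>>(d_in, d_slab, z0);
        cudaMemcpy(h_out + (size_t)z0 * N * N, d_slab,
                   (size_t)SLAB * N * N * sizeof(float), cudaMemcpyDeviceToHost);
    }

    cudaFree(d_in);
    cudaFree(d_slab);
    free(h_out);
    return 0;
}
```

The same loop structure works for multiple GPUs: select a device with cudaSetDevice and hand each one its own range of slabs.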

Thanks for the reply.
