Why is GPU memory size so small?

Hi, I'm wondering why the available GPU memory is so much smaller than main memory.

Since GPUs have been proposed for scientific computing in recent years, several GB is clearly not enough for many scientific applications. Most desktop GPUs only have around 1 GB of memory, and even compute-specific products such as the Tesla S2050 only have 3 GB per processor. I think this limits the use of GPUs in real-world applications.

Does anybody know the reasons? Is this due to a technical bottleneck, or does NVIDIA simply not want to make GPUs with large memory (maybe several GB is enough for image processing)?

Thanks,
Mian

It will be a combination of price and what most people do with the cards. You can get Tesla cards with 4 GB of memory in them; if that isn't enough, buy a second one and you have 8 GB of RAM at your disposal.

Several GB clearly IS enough at the moment per card, and CUDA continues to be used successfully in many scientific environments.

There are servers available with multiple cards in them totalling 16 GB of memory if you really need it, but they cost a very large amount of money.

It’s because of the ultra-high bandwidth the memory has to provide. You cannot just add more memory chips, because the increased electrical capacitance (and decreased resistance) would spoil the timing on the memory bus.

I can see there are still a large number of scientific applications requiring very large memory. Yes, that is also my point. I would guess that price limits the use of GPUs with large memory (but I do not know why GPU memory is more expensive, or where the technical bottleneck is).

BTW, can you give me more information about the 16 GB configuration, please? Thanks a lot!

GPUs also tend to use more expensive memory technologies than CPUs, because the GPU demands very high memory bandwidth. As an example, the GTX 480 has a 384-bit wide bus, yet achieves a peak transfer rate of 177.4 GB/sec. In comparison, an Intel Core i7 CPU with a 192-bit wide bus (in the triple-channel configuration) achieves a peak transfer rate of something like 25-40 GB/sec, depending on the speed of the DDR3 you install. Correcting for the differing bus widths, that means the GPU memory technology (GDDR5 in this case) needs to move data across each bus line at more than twice the rate of DDR3 memory.

This makes sense to me :) The bandwidth is much higher than that of main memory.

thank you very much.

As a side note, most of the large supercomputers listed on TeraGrid weigh in at about 2 GiB of RAM per (CPU) core. Put in these terms, the GPU memory ceilings aren’t so low.

Even if there is not enough memory for large data sets for an application, I would imagine that most of the time you can tile the input or output so that the current working set does fit in the 1-6GB that the card has. You might even be able to use double buffering, streams, and asynchronous I/O in order to overlap data transfer on one buffer with execution on the other buffer…
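A minimal sketch of that double-buffering idea, assuming a hypothetical `process_tile` kernel and equally sized tiles (the buffer count, tile size, and kernel are placeholders, not a real API). The CUDA stream and async-copy calls are the standard runtime API:

```cuda
#include <cuda_runtime.h>

// Placeholder kernel: processes one tile of the input in place.
__global__ void process_tile(float *d_buf, size_t n) { /* ... */ }

// Tile a large host array through two device buffers on two streams,
// so the copy of tile t can overlap with the kernel running on tile t-1.
void run_tiled(const float *h_input, size_t n_tiles, size_t tile_elems)
{
    float *d_buf[2];
    cudaStream_t stream[2];
    size_t tile_bytes = tile_elems * sizeof(float);

    for (int i = 0; i < 2; ++i) {
        cudaMalloc(&d_buf[i], tile_bytes);
        cudaStreamCreate(&stream[i]);
    }

    for (size_t t = 0; t < n_tiles; ++t) {
        int b = t % 2;  // alternate between the two buffers/streams
        // Note: h_input should be pinned (cudaHostAlloc) for the copy
        // to actually overlap with kernel execution.
        cudaMemcpyAsync(d_buf[b], h_input + t * tile_elems, tile_bytes,
                        cudaMemcpyHostToDevice, stream[b]);
        process_tile<<<(tile_elems + 255) / 256, 256, 0, stream[b]>>>(
            d_buf[b], tile_elems);
    }
    cudaDeviceSynchronize();

    for (int i = 0; i < 2; ++i) {
        cudaFree(d_buf[i]);
        cudaStreamDestroy(stream[i]);
    }
}
```

Because work issued to different streams can execute concurrently, the host-to-device copy on one stream hides behind the kernel on the other, so the effective working set is only limited by how finely the problem can be tiled, not by the card's total memory.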

Sure, that is true; you can always use such methods :)

You already have cards with 6 GB and 9 GB supposedly coming near the end of the year. The problems are probably price, availability, and the difficulty of putting a lot of RAM on the card.

More than 4 GB was only technically possible with Fermi anyway, as it requires 64-bit addressing.