Small memory on Nvidia Tesla workstations and GPUs

Why is the memory of an Nvidia Tesla only 6 GB in the best case?
CPUs can have hundreds of gigabytes of memory, even 1 TB of RAM.
Is it impossible to increase the memory to at least 100 GB?

If you put a bunch of GPUs in the same box, your effective GPU memory available goes up. ArrayFire Pro makes working with these disparate memories really easy.

But, you’re right, I’m not aware of a way to mod the memory amount yourself.

The memory of a GPU is fixed. If you need more, you can overlap the computation with copying and use CPU memory as a staging buffer, or add more GPUs, in which case your data will be distributed among the cards and you need MPI to handle the communication, or use streams to switch between contexts. For 100 GB you would need at least 17 Teslas (at 6 GB each). CUDA and GPGPU are not a universal cure; some problems simply cannot be put on a GPU. For that much memory you might consider a Cray. By the way, a Tesla card has 448 cores, but that does not mean it runs 448 programs; it means the job can be divided into at least 448 independent parts.
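
The overlap-copy-with-compute idea above can be sketched with two CUDA streams and a pinned host buffer. This is a minimal illustration, not a drop-in solution: the `scale` kernel is a hypothetical placeholder for real work, error checking is omitted, and it of course needs an Nvidia GPU and the CUDA toolkit to build and run.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the real computation.
__global__ void scale(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= 2.0f;
}

int main(void) {
    const int chunk   = 1 << 20;   // elements per chunk
    const int nchunks = 64;        // total data may exceed GPU memory

    float *h;                      // pinned host memory: required for async copies
    cudaMallocHost((void **)&h, (size_t)chunk * nchunks * sizeof(float));

    float *d[2];
    cudaStream_t s[2];
    for (int i = 0; i < 2; ++i) {  // two device buffers/streams, used ping-pong style
        cudaMalloc((void **)&d[i], chunk * sizeof(float));
        cudaStreamCreate(&s[i]);
    }

    for (int c = 0; c < nchunks; ++c) {
        int b = c & 1;
        // Copy-in, compute, copy-out of chunk c are queued in order on stream b;
        // they overlap with the neighbouring chunk queued on the other stream.
        cudaMemcpyAsync(d[b], h + (size_t)c * chunk, chunk * sizeof(float),
                        cudaMemcpyHostToDevice, s[b]);
        scale<<<(chunk + 255) / 256, 256, 0, s[b]>>>(d[b], chunk);
        cudaMemcpyAsync(h + (size_t)c * chunk, d[b], chunk * sizeof(float),
                        cudaMemcpyDeviceToHost, s[b]);
    }
    cudaDeviceSynchronize();

    for (int i = 0; i < 2; ++i) { cudaFree(d[i]); cudaStreamDestroy(s[i]); }
    cudaFreeHost(h);
    return 0;
}
```

Only 2 × 4 MB of the 256 MB dataset is resident on the device at any moment; the host buffer plays the role of the "CPU memory as a buffer" mentioned above.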

Well, I wonder why manufacturers didn't make a version with a larger memory, just for people like me who want one.
I'm also looking for a cheap solution; I don't know whether a Cray server gives enough power and memory for the money I can spend on it. A Tesla is only about $2,000. To replace that computational power I think I would need a system costing about $20,000.

I think it’s an architecture limitation. One GF110 chip is mated by the architecture to 12 GDDR5 memory modules in 32-bit mode, or to 24 modules in 16-bit (clamshell) mode, for a total bus width of 384 bits. NVIDIA does not make these modules; it buys them from third-party suppliers. GDDR5 modules go only as high as 2 Gbit (= 256 MB) per module, and no one makes larger ones, possibly because there’s little to no demand for them. That gives you up to 6 GB of memory per GPU.

Accommodating more memory would require some serious messing with the memory controller design.

Also keep in mind that GPU memory bandwidth is about 10× higher than CPU memory bandwidth. This requires wide buses with short signal paths and doesn’t tolerate any connectors in between.

Your best option for a large-memory system is probably a PCIe 3.0 GPU and mainboard, using mapped (zero-copy) CPU memory. This gets you a memory bandwidth roughly 15× slower than GPU memory, but not that much slower than CPU memory bandwidth.
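
Mapped CPU memory looks like this in CUDA. Again a minimal sketch under assumptions: the `scale` kernel is a hypothetical placeholder, error checking is omitted, and it requires a GPU that supports host-memory mapping plus the CUDA toolkit.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel; a grid-stride loop keeps the launch configuration small.
__global__ void scale(float *d, size_t n) {
    for (size_t i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += (size_t)gridDim.x * blockDim.x)
        d[i] *= 2.0f;
}

int main(void) {
    cudaSetDeviceFlags(cudaDeviceMapHost);  // enable mapping; call before first CUDA use

    size_t n = (size_t)1 << 28;             // 1 GB of floats, resident in host RAM
    float *h, *dptr;
    cudaHostAlloc((void **)&h, n * sizeof(float), cudaHostAllocMapped);
    cudaHostGetDevicePointer((void **)&dptr, h, 0);  // device-visible alias of h

    scale<<<1024, 256>>>(dptr, n);          // accesses stream over PCIe, no cudaMemcpy
    cudaDeviceSynchronize();

    cudaFreeHost(h);
    return 0;
}
```

The kernel never touches device DRAM for the data; every access crosses PCIe, which is why this only pays off when the working set simply does not fit on the card.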