This is the Maxwell facts thread. Let’s end the speculation and rumours. It’s just one hour before the press embargo is lifted. Let’s start collecting deviceQuery dumps, performance numbers, benchmarks, instruction throughput figures etc. all in one place.
Also: what hardware features are new? How can we make use of them?
I’d love to have some information about how to make use of the ARM processor that is supposed to be inside this Maxwell chip. Is it only used internally by the driver to offload some things (like dynamic parallelism) or is it also accessible to the programmer?
A friend of mine will be getting 10. His first mining farm.
Because of some product expectations I have, I will hold off on buying single-GPU cards. I bought an Asus MARS recently and I want this kind of device for mining, but definitely Maxwell-based. We need more hash power density, up to the power limit that a single PCI Express card can provide (would that be 250 watts?).
Not directly related to Maxwell, but I’m pleased to see improved code generation in CUDA 6.0. After recompiling my image processing code, the instruction count dropped by 12% and kernel time by 22%!
One thing that has always bothered me is the very inefficient array indexing code. Unlike x86, which can compute index * scale + offset + constOffset within a single load/store instruction, CUDA actually uses separate multiply and add instructions to do it (you can turn the array index into an induction variable, but that increases register use). 64-bit addressing makes it worse by doubling the number of instructions.
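For anyone who wants to look at this on their own card, here is a minimal sketch of the two styles side by side. It is not the code from my project: the kernel names, the row-major layout and the `pitch` parameter (row stride in elements) are all assumptions made up for illustration. The first kernel recomputes the index in every iteration; the second carries a pointer as the induction variable, which trades the multiply for extra live registers. The compiler may well perform this transformation on its own, so treat the comments as what the source says, not what the hardware necessarily executes.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a runtime API call fails.
#define CHECK(call)                                                        \
    do {                                                                   \
        cudaError_t e_ = (call);                                           \
        if (e_ != cudaSuccess) {                                           \
            fprintf(stderr, "%s failed: %s\n", #call,                      \
                    cudaGetErrorString(e_));                               \
            exit(1);                                                       \
        }                                                                  \
    } while (0)

// Indexed form: the element index row * pitch + col is rebuilt each iteration.
__global__ void scaleIndexed(const float* in, float* out,
                             int width, int height, int pitch)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (col >= width) return;
    for (int row = 0; row < height; ++row) {
        int idx = row * pitch + col;      // multiply + add in the source
        out[idx] = 2.0f * in[idx];
    }
}

// Induction-variable form: two pointers are bumped by pitch each iteration.
__global__ void scaleInduction(const float* in, float* out,
                               int width, int height, int pitch)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (col >= width) return;
    const float* src = in + col;          // extra live pointer registers
    float* dst = out + col;
    for (int row = 0; row < height; ++row) {
        *dst = 2.0f * *src;
        src += pitch;                     // add only, no multiply
        dst += pitch;
    }
}

// Runs both kernels on the same input and returns the number of
// elements on which their outputs differ.
int countMismatches(int width, int height, int pitch)
{
    size_t n = (size_t)height * pitch;
    size_t bytes = n * sizeof(float);
    std::vector<float> hIn(n), hA(n), hB(n);
    for (size_t i = 0; i < n; ++i) hIn[i] = 0.5f * (float)(i % 1024);

    float *dIn, *dA, *dB;
    CHECK(cudaMalloc((void**)&dIn, bytes));
    CHECK(cudaMalloc((void**)&dA, bytes));
    CHECK(cudaMalloc((void**)&dB, bytes));
    CHECK(cudaMemcpy(dIn, hIn.data(), bytes, cudaMemcpyHostToDevice));
    CHECK(cudaMemset(dA, 0, bytes));      // padding columns stay zero
    CHECK(cudaMemset(dB, 0, bytes));

    int threads = 128;
    int blocks = (width + threads - 1) / threads;
    scaleIndexed<<<blocks, threads>>>(dIn, dA, width, height, pitch);
    scaleInduction<<<blocks, threads>>>(dIn, dB, width, height, pitch);
    CHECK(cudaGetLastError());

    // cudaMemcpy on the default stream waits for both kernels to finish.
    CHECK(cudaMemcpy(hA.data(), dA, bytes, cudaMemcpyDeviceToHost));
    CHECK(cudaMemcpy(hB.data(), dB, bytes, cudaMemcpyDeviceToHost));

    int bad = 0;
    for (size_t i = 0; i < n; ++i)
        if (hA[i] != hB[i]) ++bad;

    CHECK(cudaFree(dIn));
    CHECK(cudaFree(dA));
    CHECK(cudaFree(dB));
    return bad;
}

int main()
{
    int bad = countMismatches(1000, 64, 1024);
    printf("mismatching elements: %d\n", bad);
    return bad != 0;
}
```

Both kernels should produce identical output, so the interesting part is the generated code: dumping the SASS with `cuobjdump -sass` and comparing the loop bodies shows whether the multiply is really there and how many registers each version needs.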
It took me a while to realize why my simple code had 2 multiplies for each memory load:
We should know as soon as someone gets one and prints the device caps. It’s likely sm_35 or the (new) sm_32. It is not the sm_37 buried in the CUDA 6.0 headers, which provides more shared memory than the 64 KB that GM108 is known to have.
Even GK208 is sm_35.
One (small) clue is from the GM107 white paper, which says “our first-generation Maxwell GPUs offer the same API functionality as Kepler GPUs”. That doesn’t tell us anything really except it’s sm_3x.
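If you get a card before the SDK samples are handy, the relevant caps can be printed with a few runtime API calls. This is only a minimal sketch, not a replacement for the full deviceQuery sample; the function name `printDeviceCaps` is made up here, and the selection of fields is just what seems relevant to the sm_3x question and the shared memory size.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Prints a few capability fields for every visible CUDA device.
// Returns the device count, or -1 if the runtime reports an error.
int printDeviceCaps()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                cudaGetErrorString(err));
        return -1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            fprintf(stderr, "device %d: %s\n", dev, cudaGetErrorString(err));
            continue;
        }
        printf("Device %d: %s\n", dev, prop.name);
        printf("  compute capability: sm_%d%d\n", prop.major, prop.minor);
        printf("  multiprocessors:    %d\n", prop.multiProcessorCount);
        printf("  shared mem / block: %zu bytes\n", prop.sharedMemPerBlock);
        printf("  registers / block:  %d\n", prop.regsPerBlock);
        printf("  L2 cache:           %d bytes\n", prop.l2CacheSize);
        printf("  global memory:      %zu MiB\n", prop.totalGlobalMem >> 20);
    }
    return count;
}

int main()
{
    return printDeviceCaps() < 0;
}
```

The `major` and `minor` fields are what settle the compute capability question; please paste the full output into the thread.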