Can I use 3600 * sizeof(double) bytes of shared memory inside a kernel that is launched with only one block and one thread?
The code is…
__global__ void testKernel()
{
    __shared__ real shCoordsY[1800];
    __shared__ unsigned short int shNumPtnY[1800];
}
I tested the above code on an 8400GS card and a Quadro CX card using the CUDA 2.2 drivers and SDK.
The shared memory used in my code is
sizeof(unsigned short int) * 1800 = 2 * 1800 = 3600 bytes and
sizeof(double) * 1800 = 8 * 1800 = 14400 bytes,
so the total shared memory is 3600 + 14400 = 18000 bytes (about 18k).
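The footprint can be double-checked on the host with a small sketch like the one below. It is plain host-side C++ that should also compile inside a .cu file. It assumes `real` is a typedef for double and that both arrays hold 1800 elements. The helper names `sharedBytesNeeded` and `fitsInSharedMemory` are made up for illustration and are not part of the CUDA API.

```cpp
#include <cstddef>

// Assumption: `real` is a typedef for double, as the sizeof(double) arithmetic suggests.
typedef double real;

// Element count of each shared array, taken from the arithmetic in the post.
const std::size_t kNumElems = 1800;

// Bytes of statically declared shared memory the two arrays would need.
inline std::size_t sharedBytesNeeded() {
    return kNumElems * sizeof(real) + kNumElems * sizeof(unsigned short int);
}

// True when the footprint fits in a per-block shared memory limit given in bytes.
// (Hypothetical helper; the limit itself has to come from the card's specification.)
inline bool fitsInSharedMemory(std::size_t limitBytes) {
    return sharedBytesNeeded() <= limitBytes;
}
```

With a 2-byte unsigned short and an 8-byte double this comes to 18000 bytes, so `fitsInSharedMemory(16 * 1024)` would be false.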
But the shared memory available on the 8400GS card is 8k, and on the Quadro CX it is 16k.
My code exceeds the shared memory available on either card, so I expected a compilation error.
However, when I compile the code, it gives no error, and it runs fine without crashing.
So is this code acceptable?