I’m trying to build a very specific kernel: it takes no parameters (they are passed via constant memory) and should return
only one int value, but it MUST use all available shared memory (16K).
In C, I would write it like this:
int A (void)
In CUDA, I can write it like this:
__global__ void A(int *value)
{
    *value = …;
}
But this way I can’t use ALL of the shared memory, because the kernel parameter also occupies part of it.
So, is there any way to return just one int value from the kernel to the host without using kernel
parameters? Maybe pinned memory can help? Or is there a simpler solution?
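
One idea I have not tested on my hardware: have the kernel write the result into a __device__ global variable and read it back on the host with cudaMemcpyFromSymbol, so the kernel itself takes no parameters. Below is a minimal sketch of what I mean. The name d_result and the value 42 are placeholders I made up, and I am assuming a toolkit recent enough to provide cudaDeviceSynchronize. I also have not verified that this really leaves the full 16K of shared memory free, since the hardware may reserve some of it for other purposes.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical name: a device-global variable that holds the single result.
__device__ int d_result;

// The kernel takes no parameters; it writes its result to d_result instead.
__global__ void A(void)
{
    // Let a single thread store the value to avoid redundant writes.
    if (threadIdx.x == 0 && blockIdx.x == 0) {
        d_result = 42;  // placeholder standing in for the real computation
    }
}

int main(void)
{
    int h_result = 0;

    A<<<1, 1>>>();

    // Wait for the kernel to finish and check for launch/runtime errors.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Copy the int from the device symbol back to host memory.
    err = cudaMemcpyFromSymbol(&h_result, d_result, sizeof(int));
    if (err != cudaSuccess) {
        fprintf(stderr, "copy failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("result = %d\n", h_result);
    return 0;
}
```

If something like this works, the host never passes a pointer to the kernel, which is the part I am trying to avoid.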