Kernel with no parameters: is it possible?

Hello,

I’m trying to build a very specific kernel: it takes no parameters (they are passed via constant memory) and should return
only one int value, but it MUST use all available shared memory (16 KB).

In C, I would write it like

int A (void)
{…
return (…);
}

In CUDA I can write it like

__global__ void A (int * value)
{…
*value = …;
}

but this way I can’t use ALL of the shared memory, because the kernel parameter is stored there too.

So, is there any way to return just one int value from the kernel to the host without using
kernel parameters? Maybe pinned memory can help? Or is there a simpler solution?

Thanks.

Well, you can simply have a global device variable:

__device__ int retParam;

__global__ void A() {
    ...
    retParam = ...;
}

If you return several values from multiple threads you might want to use an array.

Check the Reference Manual on how to copy values from global device variables to the host. The function you need is cudaMemcpyFromSymbol.
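For illustration, here is a minimal sketch of the round trip, assuming a single block and a placeholder result. The value 42, the launch configuration, and the helper name fetchKernelResult are assumptions made for this example, not part of the original kernel.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Result slot in global device memory: no kernel argument is needed for it.
__device__ int retParam;

// Parameterless kernel. The value 42 is a placeholder for the real computation.
__global__ void A()
{
    // Let a single thread write the result to avoid redundant stores.
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        retParam = 42;
    }
}

// Hypothetical helper: launches A and copies retParam back to the host.
// Returns -1 if the launch or the copy reports an error.
int fetchKernelResult()
{
    int hostValue = -1;

    A<<<1, 32>>>();
    if (cudaGetLastError() != cudaSuccess) {
        return -1;
    }

    // The copy is issued on the default stream, so it runs after the kernel
    // has finished and blocks the host until the value has arrived.
    if (cudaMemcpyFromSymbol(&hostValue, retParam, sizeof(int), 0,
                             cudaMemcpyDeviceToHost) != cudaSuccess) {
        return -1;
    }
    return hostValue;
}

int main()
{
    printf("kernel returned %d\n", fetchKernelResult());
    return 0;
}
```

If several threads each need to return a value, the same pattern should work with retParam declared as an array and a larger byte count passed to cudaMemcpyFromSymbol.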

Nevertheless, you cannot use all of the shared memory. There are always 16 bytes reserved for internal data such as the block and grid dimensions. You could try to get around it, and some people have (search the forum), but it is an ugly undocumented hack which may or may not work…
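To see how much shared memory is actually available and how much a kernel has been assigned, the runtime can be queried directly. The sketch below assumes a hypothetical 4 KB static buffer (TILE_INTS) and a helper named sharedBytesUsedByA, both invented for the example. The numbers reported depend on the device and toolkit: on devices of compute capability 2.0 and later, kernel arguments are passed via constant memory rather than shared memory, so the reserved bytes described above may not show up there.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define TILE_INTS 1024  // hypothetical buffer size: 1024 ints = 4 KB

__device__ int retParam;

// Parameterless kernel with a statically sized shared buffer.
__global__ void A()
{
    __shared__ int tile[TILE_INTS];

    // Touch the buffer so it is really allocated and used.
    for (int i = threadIdx.x; i < TILE_INTS; i += blockDim.x) {
        tile[i] = i * blockDim.x;
    }
    __syncthreads();

    if (threadIdx.x == 0) {
        retParam = tile[TILE_INTS - 1];
    }
}

// Hypothetical helper: static shared memory per block assigned to A,
// in bytes, or -1 if the query fails.
long sharedBytesUsedByA()
{
    cudaFuncAttributes attr;
    if (cudaFuncGetAttributes(&attr, A) != cudaSuccess) {
        return -1;
    }
    return (long)attr.sharedSizeBytes;
}

int main()
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        return 1;
    }
    printf("shared memory per block: %zu bytes\n", prop.sharedMemPerBlock);
    printf("static shared memory used by A: %ld bytes\n", sharedBytesUsedByA());
    return 0;
}
```

Comparing the two numbers shows how much headroom is left before the kernel fails to launch for lack of shared memory.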

Thanks a lot. I’ve found it:
http://forums.nvidia.com/index.php?showtopic=97534