The first two parameters are the dimensions of the grid and the block, respectively. But what are the other parameters? Is there any documentation containing this information?
I downloaded some CUDA code and found that the author passes three parameters when calling a kernel function. The third parameter seems to be related to memory allocation, and I would like to know the details. Within the kernel function, there is a variable declared as “extern __shared__ unsigned int”, but it is never allocated and I cannot find any other place where it is declared. Yet the code compiles and runs correctly. Can anyone explain this to me? Thanks a lot!
You can check out sections B.12 and B.2.3 of the Programming Guide. Basically, when a __shared__ variable is declared with extern, it is allocated dynamically at launch time: its size, in bytes, is given by the 3rd parameter of the <<< … >>>.
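As a minimal sketch of the dynamic allocation described above (the kernel name reverseBlock, the helper runReverse, and the 256-element limit are made up for illustration and are not from the code in the question): the kernel declares an unsized extern __shared__ array, and the host supplies its size in bytes as the third launch parameter. The optional fourth launch parameter, not used here, selects the stream.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: reverses n elements in place within a single block.
__global__ void reverseBlock(unsigned int *data, int n)
{
    // No size is given here; it comes from the third launch parameter.
    extern __shared__ unsigned int tile[];

    int i = threadIdx.x;
    if (i < n) tile[i] = data[i];
    __syncthreads();                       // all loads finish before any store
    if (i < n) data[i] = tile[n - 1 - i];
}

// Hypothetical host helper: returns true if the reversal came out right.
bool runReverse(int n)
{
    unsigned int host[256];
    if (n <= 0 || n > 256) return false;
    for (int i = 0; i < n; ++i) host[i] = (unsigned int)i;

    unsigned int *dev = nullptr;
    size_t bytes = n * sizeof(unsigned int);
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return false;
    cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);

    // <<<grid, block, dynamic shared memory per block in bytes>>>
    reverseBlock<<<1, n, bytes>>>(dev, n);
    cudaError_t launchErr = cudaGetLastError();

    cudaError_t copyErr = cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (launchErr != cudaSuccess || copyErr != cudaSuccess) return false;

    for (int i = 0; i < n; ++i)
        if (host[i] != (unsigned int)(n - 1 - i)) return false;
    return true;
}

int main()
{
    printf("%s\n", runReverse(64) ? "ok" : "failed");
    return 0;
}
```

If the third parameter is left out, it defaults to 0 and the extern array has no storage behind it, so a kernel like this one would read and write out of bounds.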
Not quite. There can be multiple extern variables, but they all start at the same address, so one has to use offsets to divide the memory between them. For example, if two parts of a kernel, A and B, each want to use one unsigned int, they can both declare “extern __shared__ unsigned int U[];”, with A using U[0] and B using U[1]. I think this is also documented in B.2.3.
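A rough sketch of the offset approach, assuming a single block and two regions of unsigned int (the names twoRegions, partA, partB, and runTwoRegions are hypothetical): one extern array is declared, the second region starts nA elements in, and the launch requests enough bytes for both.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: carves one dynamic shared array into two regions.
__global__ void twoRegions(unsigned int *outA, unsigned int *outB, int nA, int nB)
{
    extern __shared__ unsigned int U[];    // single dynamic allocation
    unsigned int *partA = U;               // elements [0, nA)
    unsigned int *partB = U + nA;          // elements [nA, nA + nB)

    int t = threadIdx.x;
    if (t < nA) partA[t] = t;
    if (t < nB) partB[t] = 100 + t;
    __syncthreads();

    // Copy both regions out so the host can see they did not overlap.
    if (t < nA) outA[t] = partA[t];
    if (t < nB) outB[t] = partB[t];
}

// Hypothetical host helper: returns true if both regions hold what was written.
bool runTwoRegions()
{
    const int nA = 8, nB = 4;
    unsigned int hostA[nA], hostB[nB];
    unsigned int *devA = nullptr, *devB = nullptr;
    if (cudaMalloc(&devA, sizeof(hostA)) != cudaSuccess) return false;
    if (cudaMalloc(&devB, sizeof(hostB)) != cudaSuccess) { cudaFree(devA); return false; }

    // The third launch parameter must cover both regions.
    size_t sharedBytes = (nA + nB) * sizeof(unsigned int);
    twoRegions<<<1, nA, sharedBytes>>>(devA, devB, nA, nB);
    cudaError_t launchErr = cudaGetLastError();

    cudaError_t e1 = cudaMemcpy(hostA, devA, sizeof(hostA), cudaMemcpyDeviceToHost);
    cudaError_t e2 = cudaMemcpy(hostB, devB, sizeof(hostB), cudaMemcpyDeviceToHost);
    cudaFree(devA);
    cudaFree(devB);
    if (launchErr != cudaSuccess || e1 != cudaSuccess || e2 != cudaSuccess) return false;

    for (int i = 0; i < nA; ++i) if (hostA[i] != (unsigned int)i) return false;
    for (int i = 0; i < nB; ++i) if (hostB[i] != 100u + (unsigned int)i) return false;
    return true;
}

int main()
{
    printf("%s\n", runTwoRegions() ? "ok" : "failed");
    return 0;
}
```

When the regions have different element types, the same idea applies with a cast on the offset pointer, but each region then needs to start at an address that is suitably aligned for its type.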