Kernel construction with four inputs


While reading someone else's code, I came across a kernel launch with four launch parameters:
kernel1<<<1, 256, 2048*32, 0>>>

Normally the first two parameters are the number of blocks and the number of threads per block. I have sometimes seen the third and fourth used as well, for concurrent memory copy and execution, such as
kernel2<<<1, 256, 0, stream2>>>

What do the third and fourth parameters mean in the kernel1 launch?

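The third parameter is the dynamic shared memory size, in bytes per block. In kernel1 that is 2048*32 = 65536 bytes. The fourth parameter is the stream, and 0 selects the default stream. See Programming Guide :: CUDA Toolkit Documentation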
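
To make this concrete, here is a minimal sketch of a launch that uses all four parameters. It is not the code from the question: the kernel name reverseKernel, the array size, and the stream variable are made up for illustration. The point is that the byte count passed as the third parameter is what sizes the extern __shared__ array inside the kernel.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (not from the original post): reverses n ints
// within a single block, staging them in dynamic shared memory.
__global__ void reverseKernel(int *data, int n)
{
    // No size given here: it is set by the third launch parameter.
    extern __shared__ int tile[];

    int t = threadIdx.x;
    if (t < n) tile[t] = data[t];
    __syncthreads();                    // all loads finish before any read
    if (t < n) data[t] = tile[n - 1 - t];
}

int main()
{
    const int n = 256;
    const size_t bytes = n * sizeof(int);

    int host[n];
    for (int i = 0; i < n; ++i) host[i] = i;

    int *dev = nullptr;
    cudaMalloc(&dev, bytes);

    cudaStream_t stream;                // fourth launch parameter
    cudaStreamCreate(&stream);

    cudaMemcpyAsync(dev, host, bytes, cudaMemcpyHostToDevice, stream);

    // <<<blocks, threads per block, dynamic shared memory bytes, stream>>>
    reverseKernel<<<1, n, bytes, stream>>>(dev, n);

    cudaError_t err = cudaGetLastError();   // catches invalid launch configs
    if (err != cudaSuccess) {
        printf("launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpyAsync(host, dev, bytes, cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    printf("host[0] = %d, host[%d] = %d\n", host[0], n - 1, host[n - 1]);

    cudaStreamDestroy(stream);
    cudaFree(dev);
    return 0;
}
```

One thing to watch for with the kernel1 launch in the question: 65536 bytes is above the default 48 KB per-block limit for dynamic shared memory. On GPUs that support larger allocations, the code would normally have to opt in first by calling cudaFuncSetAttribute with cudaFuncAttributeMaxDynamicSharedMemorySize, otherwise the launch is expected to fail. It may be worth checking whether the code you are reading does that.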
