“the cuda kernel does not allocate new memory or copy memory from host.”
What does the kernel do, then? Where does it get its input data, and what does it do with the result?
What are the grid and block dimensions of the kernel?