Hi, I’m facing an issue with shared memory and multiple kernels and I’d like to have more information about how it works.
Is it possible for a kernel to read (by mistake) outside of its own “reserved” shared memory region and pick up data that another kernel wrote to its shared memory?
Of course, both kernels would need to be resident on the same SM at the “same time”.
I was hoping that reading out of bounds of shared memory would raise a CUDA error, but does it?
What about writing out of bounds of shared memory? Does that raise an error?
Thank you!
p.s. If I decrease the number of CUDA blocks of my two kernels, they work correctly, probably because the blocks end up mapped to different SMs.
It is illegal behavior (in C or C++) to read or write outside of a proper allocation.
You should not depend on any behavior related to that. The data you retrieve might be anything, and the machine may or may not generate a fault.
The GPU provides isolation between non-MPS separate processes. It does not provide address space isolation between kernels launched from the same process.
What I have observed is that if I read outside of an allocation by a small amount, say a few bytes beyond the end of an allocated array, then there is no machine fault. I’ve never tried to discover what data is being read, if any. If I read at some large distance from any allocation, then I generally do observe a machine fault. But you should consider this as anecdotal, not something to expect or rely on.
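To make that concrete, here is a minimal sketch (my own illustrative kernel, not anyone's real code) of an out-of-bounds shared memory read. It compiles fine, but the access is undefined behavior:

```cuda
__global__ void oob_read(float *out)
{
    __shared__ float buf[32];          // this kernel's shared memory allocation

    buf[threadIdx.x % 32] = 1.0f;
    __syncthreads();

    // Undefined behavior: index 64 is past the end of buf. On real hardware
    // this may silently return stale data left behind by another resident
    // block or kernel, or it may trap -- neither outcome is guaranteed.
    out[threadIdx.x] = buf[64];
}
```

Rather than hoping for a runtime fault, run the application under `compute-sanitizer --tool memcheck` (the successor to `cuda-memcheck`); it flags invalid `__shared__` reads and writes deterministically, including small overruns that the hardware never faults on.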
Thank you!
In the end, we found that my problem was caused by a missing zero-initialization of a shared memory variable. Occasionally another kernel was using and writing that exact shared memory location, so without a proper set to zero the final result was wrong.
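For anyone who hits the same symptom, here is a hypothetical sketch of this class of bug (kernel and variable names are mine, not from my actual code). Shared memory is not zeroed between kernels, so a shared accumulator must be initialized explicitly:

```cuda
__global__ void accumulate(const float *in, float *out, int n)
{
    __shared__ float acc;

    // The fix: without this explicit zero-initialization, acc starts with
    // whatever value a previously resident block or kernel left at that
    // shared memory location, so the sum is intermittently wrong.
    if (threadIdx.x == 0) acc = 0.0f;
    __syncthreads();

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicAdd(&acc, in[i]);
    __syncthreads();

    // One thread publishes the per-block result.
    if (threadIdx.x == 0) out[blockIdx.x] = acc;
}
```

The bug only shows up when some other work happens to leave a nonzero value in that location, which is why it looked random and went away when fewer blocks were launched.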