I am new to CUDA 2.3. Could someone give me a quick code example that uses shared memory?
Also, I know shared memory can only be accessed by threads within a single block. Is there a way to give all threads in all blocks read/write access to the same data without going through the host? Maybe a global memory access example?
Thank you in advance.
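
For reference, here is a minimal sketch of both ideas, not a definitive implementation. It assumes a 1D launch with a power-of-two block size; the names `blockSum`, `tile`, `partial`, and `THREADS_PER_BLOCK` are made up for illustration. Each block reduces its slice of the input in `__shared__` memory (visible only inside that block), then writes one partial sum to global memory, which every thread in every block can read and write. A second launch of the same kernel combines the partial sums directly from global memory, so the intermediate data never goes back to the host.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative block size; the reduction loop below assumes a power of two
// and that the kernel is launched with exactly this many threads per block.
#define THREADS_PER_BLOCK 256

// Each block sums its slice of `in` in shared memory, then thread 0 of the
// block writes that block's partial sum to global memory.
__global__ void blockSum(const float *in, float *partial, int n)
{
    __shared__ float tile[THREADS_PER_BLOCK];  // one copy per block

    int tid = threadIdx.x;
    int gid = blockIdx.x * blockDim.x + threadIdx.x;

    // Global -> shared; out-of-range threads contribute zero.
    tile[tid] = (gid < n) ? in[gid] : 0.0f;
    __syncthreads();  // synchronizes threads of this block only

    // Tree reduction within the block.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride)
            tile[tid] += tile[tid + stride];
        __syncthreads();
    }

    // Shared -> global; global memory is visible to all blocks.
    if (tid == 0)
        partial[blockIdx.x] = tile[0];
}

int main()
{
    const int n = 1024;
    const int blocks = (n + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;

    float h_in[n];
    for (int i = 0; i < n; ++i)
        h_in[i] = 1.0f;

    float *d_in, *d_partial, *d_total;
    cudaMalloc((void **)&d_in, n * sizeof(float));
    cudaMalloc((void **)&d_partial, blocks * sizeof(float));
    cudaMalloc((void **)&d_total, sizeof(float));
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    // Pass 1: one partial sum per block, left in global memory.
    blockSum<<<blocks, THREADS_PER_BLOCK>>>(d_in, d_partial, n);

    // Pass 2: a single block reads the partial sums straight from global
    // memory (this sketch assumes blocks <= THREADS_PER_BLOCK).
    blockSum<<<1, THREADS_PER_BLOCK>>>(d_partial, d_total, blocks);

    float h_total = 0.0f;
    cudaMemcpy(&h_total, d_total, sizeof(float), cudaMemcpyDeviceToHost);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess)
        printf("CUDA error: %s\n", cudaGetErrorString(err));
    else
        printf("sum = %f\n", h_total);

    cudaFree(d_in);
    cudaFree(d_partial);
    cudaFree(d_total);
    return 0;
}
```

The sketch splits the work into two launches because `__syncthreads()` only synchronizes threads within one block. Kernels launched into the same stream run in order, so the second launch sees everything the first one wrote to global memory.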