Does anyone know of an article online (or in a journal) that discusses CUDA memory handling extensively? I am looking for an analytical approach. The secret to realizing all of the speed improvement seems to lie in how memory is handled.
I saw one article (which I did not keep) that said the secret to seeing the speed increases is to be intimately familiar with your program’s memory use and how CUDA handles it.
newport_j