I would like to work on N-dimensional data with CUDA, implementing a simple Laplacian mask.
As textures are limited to 3 dimensions, I store the data as a flattened 1-dimensional array and use shared memory to hold the neighbors' values.
However, quoting this thread: http://forums.nvidia.com/index.php?showtopic=192103
It's obvious that N-dimensional data cannot be arranged so that the neighbors of a thread are in nearby regions in memory.
So what's the best option for doing that?
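
To make the layout concrete, here is a minimal host-side sketch in plain C++ (not an actual kernel) of the flattened indexing described above. It assumes a row-major layout and a 2N-point stencil, and leaves boundary elements at 0. The names `row_major_strides` and `laplacian_nd` are hypothetical, chosen only for illustration. Each iteration of the outer loop corresponds roughly to what one CUDA thread would do.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major strides: the last axis is contiguous (stride 1),
// every earlier axis is separated by the product of the later extents.
std::vector<std::size_t> row_major_strides(const std::vector<std::size_t>& dims) {
    std::vector<std::size_t> strides(dims.size(), 1);
    for (std::size_t d = dims.size(); d-- > 1; ) {
        strides[d - 1] = strides[d] * dims[d];
    }
    return strides;
}

// Discrete Laplacian (2N-point stencil) on a flattened N-dimensional array.
// Assumption: boundary elements are simply left at 0.
std::vector<float> laplacian_nd(const std::vector<float>& in,
                                const std::vector<std::size_t>& dims) {
    const std::vector<std::size_t> strides = row_major_strides(dims);
    std::size_t total = 1;
    for (std::size_t n : dims) total *= n;
    assert(in.size() == total);

    std::vector<float> out(total, 0.0f);
    for (std::size_t i = 0; i < total; ++i) {  // roughly one CUDA thread per i
        float sum = 0.0f;
        bool interior = true;
        for (std::size_t d = 0; d < dims.size(); ++d) {
            // Recover the coordinate along axis d from the flat index.
            const std::size_t coord = (i / strides[d]) % dims[d];
            if (coord == 0 || coord + 1 == dims[d]) { interior = false; break; }
            // The two neighbors along axis d sit strides[d] elements away.
            sum += in[i - strides[d]] + in[i + strides[d]];
        }
        if (interior) {
            out[i] = sum - 2.0f * static_cast<float>(dims.size()) * in[i];
        }
    }
    return out;
}
```

The sketch shows the locality problem directly: the neighbors along axis `d` are `strides[d]` elements away from the center, so only the neighbors along the last axis are adjacent in memory.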