Zero-copy memory access patterns

Romant · May 9, 2009, 12:13am

Tim, are there any strict rules on how mapped host memory should be accessed from the device?
It is not clear from the SDK example provided …

tmurray · May 9, 2009, 1:36am

What kind of rules do you expect? Don’t go out of bounds, and don’t expect it to work cleanly if one processor is reading a location while another is writing it (CPU and GPU, GPU and another GPU, whatever)…

As far as performance goes, I think optimal access patterns will look a lot like GT200 coalescing.
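As a rough illustration of what a coalescing-friendly pattern over mapped host memory might look like, the sketch below allocates mapped buffers with `cudaHostAlloc`, gets their device aliases with `cudaHostGetDevicePointer`, and has consecutive threads touch consecutive floats. The names `scaleMapped` and `runScaleMapped` are made up for this example, and it is not taken from the SDK sample. It assumes a device that reports `canMapHostMemory` and a runtime that provides `cudaDeviceSynchronize` (older toolkits used `cudaThreadSynchronize`).

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: thread k of a block touches element blockStart + k,
// so neighbouring threads read and write neighbouring addresses.
__global__ void scaleMapped(const float* in, float* out, int n, float factor)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n)
        out[idx] = in[idx] * factor;
}

// Hypothetical helper: returns the number of wrong elements,
// or -1 if mapped memory is unavailable or allocation fails.
int runScaleMapped(int n)
{
    // Request host-memory mapping; on older runtimes this has to happen
    // before the context is created, so it is the first runtime call here.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess || !prop.canMapHostMemory)
        return -1;

    float *hIn = NULL, *hOut = NULL, *dIn = NULL, *dOut = NULL;
    if (cudaHostAlloc((void**)&hIn, n * sizeof(float), cudaHostAllocMapped) != cudaSuccess)
        return -1;
    if (cudaHostAlloc((void**)&hOut, n * sizeof(float), cudaHostAllocMapped) != cudaSuccess) {
        cudaFreeHost(hIn);
        return -1;
    }
    for (int i = 0; i < n; ++i) {
        hIn[i] = (float)i;
        hOut[i] = 0.0f;
    }

    // Device-side aliases of the same host allocations (no cudaMemcpy involved).
    cudaHostGetDevicePointer((void**)&dIn, hIn, 0);
    cudaHostGetDevicePointer((void**)&dOut, hOut, 0);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    scaleMapped<<<blocks, threads>>>(dIn, dOut, n, 2.0f);

    // The host must not read hOut until the kernel has finished.
    cudaError_t err = cudaDeviceSynchronize();

    int bad = (err == cudaSuccess) ? 0 : n;
    for (int i = 0; i < n && err == cudaSuccess; ++i)
        if (hOut[i] != 2.0f * hIn[i])
            ++bad;

    cudaFreeHost(hIn);
    cudaFreeHost(hOut);
    return bad;
}
```

Calling `runScaleMapped(1 << 20)` from `main` should return 0 on a device that supports mapped memory. The synchronize before the host reads `hOut` reflects the point above about not reading from one side while the other is writing.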

Romant · May 9, 2009, 6:28am

Can one thread access several data items without a performance penalty?

Something like this:

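__global__ void Kernel(const float* d_pMappedMemory)
{
	// N and base are assumed to be defined elsewhere.
	// fVar is a per-thread accumulator; its result would be stored after the loop.
	float fVar = 0.0f;
	for (int i = 0; i < N; i++)
		fVar += d_pMappedMemory[base + threadIdx.x + i];
}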

By ‘performance penalty’ I mean effects similar to ‘uncoalesced access’ issues.
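To make the question concrete, here is a sketch of three ways a thread can loop over several items in a mapped buffer. In the first, which is the loop above, neighbouring threads read neighbouring floats at every step, but the window slides by one float per iteration, so it is not always aligned. In the second, each thread walks its own contiguous chunk, so neighbouring threads are `n` floats apart, which is the classic uncoalesced layout for device memory. In the third, the data is interleaved so that neighbouring threads read neighbouring floats at every step. The kernel and helper names are hypothetical, and the sketch only checks that all three produce the same sums. Whether mapped memory penalizes them exactly as device memory does is an assumption that would need to be measured.

```cuda
#include <cuda_runtime.h>

// Pattern from the question: at step i, thread t reads element base + t + i.
__global__ void windowSum(const float* in, float* out, int base, int n)
{
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    float sum = 0.0f;
    for (int i = 0; i < n; ++i)
        sum += in[base + t + i];
    out[t] = sum;
}

// Per-thread chunks: at step i, thread t reads element t * n + i (stride of n floats).
__global__ void stridedSum(const float* in, float* out, int n)
{
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    float sum = 0.0f;
    for (int i = 0; i < n; ++i)
        sum += in[t * n + i];
    out[t] = sum;
}

// Interleaved: at step i, thread t reads element i * total + t.
__global__ void interleavedSum(const float* in, float* out, int n)
{
    int total = gridDim.x * blockDim.x;
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    float sum = 0.0f;
    for (int i = 0; i < n; ++i)
        sum += in[i * total + t];
    out[t] = sum;
}

// Waits for the kernel, counts wrong entries, then clears the buffer for the next run.
static int syncAndCount(float* out, int total, float expected)
{
    if (cudaDeviceSynchronize() != cudaSuccess)
        return total;
    int wrong = 0;
    for (int t = 0; t < total; ++t) {
        if (out[t] != expected) ++wrong;
        out[t] = 0.0f;
    }
    return wrong;
}

// Hypothetical helper: returns the number of wrong sums, or -1 if allocation fails.
int checkAccessPatterns(int blocks, int threads, int n)
{
    cudaSetDeviceFlags(cudaDeviceMapHost);
    int total = blocks * threads;
    float *hIn = NULL, *hOut = NULL, *dIn = NULL, *dOut = NULL;
    if (cudaHostAlloc((void**)&hIn, (size_t)total * n * sizeof(float), cudaHostAllocMapped) != cudaSuccess)
        return -1;
    if (cudaHostAlloc((void**)&hOut, total * sizeof(float), cudaHostAllocMapped) != cudaSuccess) {
        cudaFreeHost(hIn);
        return -1;
    }
    for (int k = 0; k < total * n; ++k) hIn[k] = 1.0f;  // every sum should equal n
    for (int t = 0; t < total; ++t) hOut[t] = 0.0f;
    cudaHostGetDevicePointer((void**)&dIn, hIn, 0);
    cudaHostGetDevicePointer((void**)&dOut, hOut, 0);

    int wrong = 0;
    windowSum<<<blocks, threads>>>(dIn, dOut, 0, n);
    wrong += syncAndCount(hOut, total, (float)n);
    stridedSum<<<blocks, threads>>>(dIn, dOut, n);
    wrong += syncAndCount(hOut, total, (float)n);
    interleavedSum<<<blocks, threads>>>(dIn, dOut, n);
    wrong += syncAndCount(hOut, total, (float)n);

    cudaFreeHost(hIn);
    cudaFreeHost(hOut);
    return wrong;
}
```

Timing each kernel separately, for example with `cudaEvent_t` markers, would be one way to see how much the layouts actually differ on a given device.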