What are the requirements for using mapped memory?


In my case, some data needs to be transferred to the CPU only once.
Can I use mapped memory to improve performance instead of doing a memory copy from GPU to CPU?
If yes, what hardware, SDK and driver do I need? It seems that CUDA 3.0 and the GTX 285 do not support this feature.


No, you’ll get better transfer performance just using cudaMemcpy. Zero-copy isn’t any faster, and in fact it will be slower, since it needs two-way communication: the device has to send a request to the host, and then the data goes back to the device.

There may be exceptions, though, depending on how the work is balanced. Zero-copy lets you delay a memory copy, and that can be a net win in total time if you can hide the transfer latency and bandwidth cost by having threads mix computation with memory access. So it’s problem dependent.
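To make that concrete, here is a minimal sketch of the zero-copy pattern, assuming a device that can map host memory. The kernel name fill and the buffer size are made up for illustration, and error checking is left out for brevity. The kernel writes its results straight into mapped host memory, so there is no separate cudaMemcpy afterwards.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread computes one value and stores it
// directly into the mapped host buffer through the device-side pointer.
__global__ void fill(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = 2.0f * i;
}

int main()
{
    const int n = 1 << 20;  // illustrative size only

    // Must be called before the CUDA context is created.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    // Page-locked host buffer that is also mapped into the device address space.
    float *h_out = 0;
    cudaHostAlloc((void **)&h_out, n * sizeof(float), cudaHostAllocMapped);

    // Device-side alias of the same buffer.
    float *d_out = 0;
    cudaHostGetDevicePointer((void **)&d_out, h_out, 0);

    fill<<<(n + 255) / 256, 256>>>(d_out, n);

    // Wait for the kernel before reading on the host.
    // On toolkits older than 4.0 use cudaThreadSynchronize() instead.
    cudaDeviceSynchronize();

    printf("h_out[10] = %f\n", h_out[10]);

    cudaFreeHost(h_out);
    return 0;
}
```

Whether this beats a plain kernel followed by cudaMemcpy depends on how much computation the kernel has available to overlap with the writes, so it is worth timing both versions on your own data.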

And in answer to your question, you need toolkit 2.2 or later and a GPU of compute capability 1.2 or later. You’re fine with the GTX 285 and toolkit 3.0; that combination does support zero-copy.
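If you want to confirm this on your own machine, a short query along these lines should do it. It is only a sketch and assumes the card you care about is device 0. It prints the compute capability, the canMapHostMemory flag from the device properties, and the runtime version.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int dev = 0;  // assumption: the GPU of interest is device 0

    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) {
        printf("Could not query device %d\n", dev);
        return 1;
    }

    // canMapHostMemory is nonzero when the device supports mapped (zero-copy) memory.
    printf("%s: compute capability %d.%d, canMapHostMemory = %d\n",
           prop.name, prop.major, prop.minor, prop.canMapHostMemory);

    // The runtime version is encoded as 1000 * major + 10 * minor.
    int runtimeVersion = 0;
    cudaRuntimeGetVersion(&runtimeVersion);
    printf("CUDA runtime version: %d\n", runtimeVersion);

    return 0;
}
```

If canMapHostMemory comes back as 0 on a card that should support it, check that the installed driver is recent enough for the toolkit.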