Dear all and nVidia experts,
I’ve come across a weird issue lately with pinned memory. The CUDA manual and the bandwidthTest project in the CUDA 2.1 SDK both say that pinned memory gives higher bandwidth between host and device. That proved true on most of the systems I tested (about 2.5 GB/s each way between host and device). However, something odd happened when I deployed my application to the server.
The server is equipped with a GeForce 8600 GT (yes, an 8600 GT for a server, but you know…), and the driver is the latest 182.50. I tested with the CUDA SDK bandwidthTest, and host-to-device and device-to-host transfers both come in at around 1.3 GB/s, which is the same as (or even lower than) pageable memory. So I wonder whether my system has limited the amount of page-locked memory (which can be done under Linux using ulimit, or under previous Windows versions through the registry). If that’s the case, how can I raise the pinned-memory limit so I can use it for faster I/O between host and device? If that’s not the case, what might I have done wrong? Thanks!
BTW, on most workstations the bandwidth is 2.5 GB/s; it is only on my servers (Windows XP) that pinned memory doesn’t help. Thanks a lot!
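In case it helps others reproduce this, here is a minimal sketch of the kind of comparison I’m describing (not the SDK bandwidthTest itself): it times host-to-device copies from a pageable `malloc` buffer versus a page-locked `cudaMallocHost` buffer. The buffer size and iteration count are just illustrative assumptions; error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Time 'iters' host-to-device copies of 'bytes' bytes, returning milliseconds.
static float timeCopy(void *dst, const void *src, size_t bytes, int iters) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start, 0);
    for (int i = 0; i < iters; ++i)
        cudaMemcpy(dst, src, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main() {
    const size_t bytes = 32 << 20;  // 32 MB per copy (illustrative choice)
    const int iters = 10;

    void *dev, *pageable, *pinned;
    cudaMalloc(&dev, bytes);
    pageable = malloc(bytes);            // ordinary pageable host memory
    cudaMallocHost(&pinned, bytes);      // page-locked (pinned) host memory

    float msPageable = timeCopy(dev, pageable, bytes, iters);
    float msPinned   = timeCopy(dev, pinned,   bytes, iters);

    // Report effective bandwidth in GB/s for each allocation type.
    double gb = (double)bytes * iters / (1 << 30);
    printf("pageable: %.2f GB/s\n", gb / (msPageable / 1000.0));
    printf("pinned:   %.2f GB/s\n", gb / (msPinned   / 1000.0));

    cudaFreeHost(pinned);
    free(pageable);
    cudaFree(dev);
    return 0;
}
```

On the workstations, the pinned figure comes out clearly higher; on the Windows XP server, both numbers sit at roughly 1.3 GB/s.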