Can I limit page size for unified memory?


I am transferring data from GPU1 to GPU2 using a unified-memory int array T[nrblocks * 10240]. Each block in the grid writes its data starting at &T[blockIdx.x * 10240]. For certain reasons I need to keep the length of each block's region at 10240 elements.

Some facts:

  1. Not all blocks write to T at the same time.
  2. Most of the time, GPU1 writes only a few hundred values to T.
  3. GPU2 clears a slot once it has used the data stored there.
  4. The point of using unified memory is that I don’t need to transfer the whole array from GPU1 to GPU2.

Page faults happen when GPU2 reads the data. The problem is that the page size is often 64KB or more. Each block has 10240 slots, which is 10240 * 4 / 1024 = 40KB, so when the page size is larger than 40KB, empty slots that are supposed to stay on GPU1 are moved to GPU2 as well. Then, when the corresponding block on GPU1 tries to write data, another page fault occurs. And if the page size is again larger than 40KB, there could be yet another page fault… And nvprof tells me the max page fault size is 2MB…
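To make the overlap concrete, here is a small host-side sketch of the arithmetic. It assumes T starts on a page boundary and that an int is 4 bytes; the helper names (`first_page`, `last_page`, `shares_page`) are made up for illustration and are not part of any CUDA API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Values from the question: 10240 ints per block, 4 bytes per int.
const std::size_t kSlotInts  = 10240;
const std::size_t kSlotBytes = kSlotInts * sizeof(std::int32_t);  // 40960 bytes = 40KB

// Index of the page holding the first byte of a block's region.
// Assumption: T itself begins exactly on a page boundary.
inline std::size_t first_page(std::size_t block, std::size_t page_bytes) {
    assert(page_bytes > 0);
    return (block * kSlotBytes) / page_bytes;
}

// Index of the page holding the last byte of a block's region.
inline std::size_t last_page(std::size_t block, std::size_t page_bytes) {
    assert(page_bytes > 0);
    return ((block + 1) * kSlotBytes - 1) / page_bytes;
}

// True if the regions of blocks a and b touch at least one common page,
// i.e. migrating a page for one block can also move data of the other.
inline bool shares_page(std::size_t a, std::size_t b, std::size_t page_bytes) {
    return first_page(a, page_bytes) <= last_page(b, page_bytes) &&
           first_page(b, page_bytes) <= last_page(a, page_bytes);
}
```

Under these assumptions, with 64KB pages blocks 0 and 1 share page 0, while blocks 0 and 2 do not share any page. With 2MB pages, blocks 0 through 51 all touch the first page, which is why one fault can drag so many unrelated slots along.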

So I need to limit the page size to 40KB to reduce the number of page faults.

Is there a way to limit the page size to a given value? Or are there any other solutions?

Thanks in advance!

Page sizes need hardware and operating system support, so you usually can’t set them to an arbitrary value. 4KB, 64KB and 4MB are sizes that I know to be supported by, for example, Linux.
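Since 40KB is not one of those sizes, one thing that might be worth trying is to go the other way and pad each block's stride up to a multiple of a supported page size, so that no two blocks share a page. The sketch below only shows the arithmetic. The helper names are hypothetical, it assumes 4-byte ints and a page-aligned T, and it does not guarantee that the driver actually migrates in units of that size (the question reports transfers of up to 2MB).

```cpp
#include <cassert>
#include <cstddef>

// Assumption: 4-byte ints, as in the question's 10240 * 4 calculation.
const std::size_t kIntBytes = 4;

// Round `bytes` up to the next multiple of `page_bytes`.
inline std::size_t round_up(std::size_t bytes, std::size_t page_bytes) {
    assert(page_bytes > 0);
    return (bytes + page_bytes - 1) / page_bytes * page_bytes;
}

// Stride, in ints, between the starts of consecutive blocks' regions
// so that every region begins on a page boundary.
inline std::size_t padded_stride_ints(std::size_t slot_ints, std::size_t page_bytes) {
    assert(page_bytes % kIntBytes == 0);
    return round_up(slot_ints * kIntBytes, page_bytes) / kIntBytes;
}
```

For 64KB pages, `padded_stride_ints(10240, 64 * 1024)` gives 16384, so each block would write at &T[blockIdx.x * 16384] while still using only its first 10240 elements. The cost is 1.6 times the memory; for 4KB pages no padding is needed because 40KB is already a multiple of 4KB.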