Information about Cuda Memory Consumption on TK1, problem with cudaMemGetInfo()

I have noticed that cudaMemGetInfo() does not provide helpful information when trying to analyze memory consumption on a TK1:
the reported free memory does not decrease when memory is allocated with cudaMalloc, cudaMallocHost, or cudaMallocManaged. It does decrease when memory is allocated with plain malloc.

I noticed that I can track CUDA memory allocations by parsing /proc/self/maps and counting the mappings that are backed by /dev/nvmap. Using that method I discovered that all CUDA allocations are at least 1 MiB in size. I could not find this behavior documented; I expected allocations to be aligned to the 4 KiB page size, not to 256 pages.
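For reference, the tracking method described above can be sketched as follows. This is a minimal, hedged example: the maps excerpt in `sample` is made up for illustration, and the exact field layout assumes the usual `start-end perms offset dev inode path` format of /proc/self/maps.

```python
def nvmap_usage(maps_text):
    """Count the /dev/nvmap mappings in a /proc/self/maps dump
    and sum their sizes. Each line has the form
    'start-end perms offset dev inode path'."""
    count = 0
    total = 0
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[5] == "/dev/nvmap":
            start, end = (int(x, 16) for x in fields[0].split("-"))
            total += end - start
            count += 1
    return count, total

# Hypothetical excerpt: two CUDA allocations, each rounded up to 1 MiB
# (0x100000 bytes), plus one unrelated mapping that must be ignored.
sample = """\
b0000000-b0100000 rw-s 00000000 00:06 1234 /dev/nvmap
b0100000-b0200000 rw-s 00000000 00:06 1234 /dev/nvmap
b0200000-b0221000 r-xp 00000000 b3:01 5678 /usr/lib/libfoo.so"""

count, total = nvmap_usage(sample)
print(count, total)  # prints: 2 2097152
```

On an actual TK1 one would pass `open("/proc/self/maps").read()` instead of the canned sample, calling the function before and after each CUDA allocation to see the delta.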

  • Is this behavior configurable?
  • Is there a technical reason for the specific value of 1 MiB?

I also checked that two consecutive allocations of 256 KiB result in two separate mappings of 1 MiB each; the unused remainder of the first allocation is not reused by the second call to (e.g.) cudaMalloc.

I’ve run my code on a Jetson TX2 as well. There, the free memory reported by cudaMemGetInfo() does decrease, allocations show up as mappings of anon_inode:dmabuf rather than /dev/nvmap, and their unused remainders can be reused by successive calls. Most importantly, the TX2 has far more total memory, so I don’t need to optimize memory consumption as aggressively there.


No, this behavior is not configurable. The 1 MiB allocation granularity (the GPU page size) cannot be customized.