According to the NVIDIA developer website, the Pascal architecture (GTX 1070/1080) supports up to 2TB of unified memory with CUDA 8.
I am wondering whether this support is Linux-only or available on any OS.
If it is available on all OSes, including Windows, I would like to know which specific CUDA API needs to be called for memory allocation.
Do you mean I can use UM with a GTX 1070/1080, but can't make a 2TB UM allocation?
If I need a UM allocation of, say, 2GB, can I use cudaMallocManaged(), or do I need to use some other API?
Sorry, I think my previous comment was incorrect; I have edited it.
I think you should be able to "oversubscribe" your GPU using cudaMallocManaged(). I don't have a Windows setup handy to test it.
I don't know exactly how much oversubscription is possible. I recently ran a test on Linux with a Titan X (Pascal) that has 8GB, where I oversubscribed it to 60GB in a box that had 64GB of system RAM.
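The test I ran was along these lines. A minimal sketch, assuming a Pascal-or-newer GPU on Linux with CUDA 8+ and more system RAM than GPU memory; the kernel name `touch` and the 1.5x oversubscription factor are just illustrative choices, not anything from the documentation:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Write one byte per element so every page of the allocation is
// actually demanded by the GPU, forcing migration.
__global__ void touch(char *p, size_t n)
{
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) p[i] = 1;
}

int main()
{
    size_t free_b = 0, total_b = 0;
    cudaMemGetInfo(&free_b, &total_b);
    printf("GPU memory: %zu MB\n", total_b >> 20);

    // Request more managed memory than the GPU physically has.
    // On Pascal+/Linux the driver pages data in and out on demand.
    size_t n = total_b + total_b / 2;  // 1.5x the GPU's physical memory
    char *p = nullptr;
    cudaError_t err = cudaMallocManaged(&p, n);
    if (err != cudaSuccess) {
        printf("cudaMallocManaged failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Touch the whole allocation from the device in 256MB chunks,
    // so migration happens a piece at a time.
    const size_t chunk = 256u << 20;
    for (size_t off = 0; off < n; off += chunk) {
        size_t len = (n - off < chunk) ? (n - off) : chunk;
        touch<<<(unsigned)((len + 255) / 256), 256>>>(p + off, len);
    }
    cudaDeviceSynchronize();
    printf("touched %zu MB of managed memory\n", n >> 20);
    cudaFree(p);
    return 0;
}
```

If the allocation or the kernel launches fail, the error string should tell you whether you hit an allocation limit or ran out of system memory.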
I don't know about 2TB or what the actual upper limit is. Pascal GPUs have a 49-bit virtual address space (2^49 bytes = 512 TiB), so in theory they should be able to map far more than 2TB, but I don't know whether there is a published upper limit or what it may be.
Perhaps you may just want to try it. Again, you may wish to read the UM section in the CUDA programming guide.