CUDA unified memory oversubscription in Windows systems

I am exploring using Unified Memory (UM) for my CUDA application. The main reason is to allow running applications that oversubscribe device memory.

Initially I thought that this feature was restricted to the most recent GPUs of the Pascal family, with older cards using the standard physical device memory allocation method. However, I have since realized that memory oversubscription is apparently also restricted to Linux systems. I work on Windows, so this is a fundamental limitation that really reduces the appeal of using UM.

Is there any fundamental reason why UM with memory oversubscription is not supported on Windows systems? Is it something that is going to be supported at some point in the future? If this feature is not going to be supported on Windows systems, what would be the best way of dealing with device memory oversubscription?

As far as I know, it is due to WDDM. If you use TCC mode, you can use memory oversubscription under Windows too. In that case you need another solution (VGA) for the display(s). I have not tested it.

https://docs.nvidia.com/gameworks/content/developertools/desktop/nsight/tesla_compute_cluster.htm

As OP indicates, it’s actually not supported on Windows in current CUDA versions (9.0, 9.1). TCC doesn’t change this. This is discussed in the programming guide.

Under CUDA 9/9.1, the Windows managed memory (UM) subsystem behaves similarly to the pre-Pascal UM regime, so managed allocations cannot oversubscribe the physical GPU memory.
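If you want to check at runtime which regime a given machine is in, a minimal sketch along these lines may help. It queries the `cudaDevAttrManagedMemory` and `cudaDevAttrConcurrentManagedAccess` device attributes and the `tccDriver` field of `cudaDeviceProp`. Treating `concurrentManagedAccess != 0` as a proxy for the Pascal-style regime (the one that allows oversubscription) is my assumption, not something stated in this thread, and the helper name `likelySupportsOversubscription` is made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the CUDA API). Assumption: a device that
// reports concurrentManagedAccess == 0 is in the pre-Pascal UM regime and
// cannot oversubscribe device memory.
static bool likelySupportsOversubscription(int managedMemory,
                                           int concurrentManagedAccess) {
    return managedMemory != 0 && concurrentManagedAccess != 0;
}

int main() {
    int device = 0;
    if (cudaGetDevice(&device) != cudaSuccess) {
        std::printf("No usable CUDA device found\n");
        return 1;
    }

    int managed = 0;
    int concurrent = 0;
    cudaDeviceGetAttribute(&managed, cudaDevAttrManagedMemory, device);
    cudaDeviceGetAttribute(&concurrent, cudaDevAttrConcurrentManagedAccess, device);

    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        std::printf("Could not query device properties\n");
        return 1;
    }

    std::printf("Device %d: %s (compute capability %d.%d)\n",
                device, prop.name, prop.major, prop.minor);
    std::printf("  TCC driver:                %d\n", prop.tccDriver);
    std::printf("  managed memory:            %d\n", managed);
    std::printf("  concurrent managed access: %d\n", concurrent);
    std::printf("  oversubscription likely:   %s\n",
                likelySupportsOversubscription(managed, concurrent) ? "yes" : "no");
    return 0;
}
```

Compile with `nvcc` and run it on the target machine. If the last line prints "no", it is probably safer to size managed allocations to fit in physical device memory.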