Issues with Pageable Memory under Win7 and Win10

Hello,

Because of external hardware that has no Linux driver, I need CUDA 8.0 on at least a GTX 1060, with the new “Pageable Memory” support of Pascal and CUDA 8.0. On Linux systems “Pageable Memory” is supported and my software runs correctly on synthetic data. On Windows my software becomes slow and crashes due to lack of memory, even on a GTX 1080. A GTX 1080 Ti or Titan is not available and would not have enough memory either, so I need the pageable memory feature of the Pascal architecture.

I have the following systems, all showing the same problem under Windows but not under Linux:
Win7 64bit i7-920 VS2013/VS2015 GTX 1080 (Qt5.7 for user interface)
Win7 64bit i7-2700K VS2013/VS2015 GTX 1080 (Qt5.7 for user interface)
Win10 64bit i5-6600K VS2013/VS2015 GTX 1060 (Qt5.7 for user interface)

So programs running on Windows fall back to “zero-copy memory” behavior, which is much slower because all of the unified memory data is held in host memory and copied to device memory on each kernel call.

In my case it is extremely slow because my data is cascaded, so an access to one piece of data can, depending on its contents, lead to further accesses. (Think of it like a list of pointers to datasets.)
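To make the “list of pointers to datasets” layout concrete, here is a minimal sketch of such a cascaded structure in unified memory. The `Dataset` struct, the `sumDatasets` kernel, and the sizes are made up for illustration and are not the actual application code; only the CUDA runtime calls are real.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical cascaded layout: a managed array of descriptors,
// each pointing to its own managed buffer.
struct Dataset {
    float* values;
    int    count;
};

// Each thread follows one descriptor and then reads the buffer behind it,
// so a single access to `sets` triggers further dependent accesses.
__global__ void sumDatasets(const Dataset* sets, int numSets, float* out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numSets) return;
    float s = 0.0f;
    for (int k = 0; k < sets[i].count; ++k)
        s += sets[i].values[k];
    out[i] = s;
}

#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t e = (call);                                         \
        if (e != cudaSuccess) {                                         \
            fprintf(stderr, "%s failed: %s\n", #call,                   \
                    cudaGetErrorString(e));                             \
            return 1;                                                   \
        }                                                               \
    } while (0)

int main()
{
    const int numSets = 4;     // illustrative sizes only
    const int count   = 1024;

    Dataset* sets = nullptr;
    float*   out  = nullptr;
    CHECK(cudaMallocManaged(&sets, numSets * sizeof(Dataset)));
    CHECK(cudaMallocManaged(&out,  numSets * sizeof(float)));

    // Fill the descriptors and buffers on the host before any kernel runs.
    for (int i = 0; i < numSets; ++i) {
        CHECK(cudaMallocManaged(&sets[i].values, count * sizeof(float)));
        sets[i].count = count;
        for (int k = 0; k < count; ++k)
            sets[i].values[k] = 1.0f;
    }

    sumDatasets<<<1, numSets>>>(sets, numSets, out);
    CHECK(cudaGetLastError());
    // Synchronize before the host touches managed memory again.
    CHECK(cudaDeviceSynchronize());

    for (int i = 0; i < numSets; ++i)
        printf("dataset %d sum = %f\n", i, out[i]);

    for (int i = 0; i < numSets; ++i)
        CHECK(cudaFree(sets[i].values));
    CHECK(cudaFree(sets));
    CHECK(cudaFree(out));
    return 0;
}
```

With on-demand page migration only the pages that are actually touched need to move to the GPU. Without it, every managed allocation has to be made available to the device before the kernel starts, which is where a layout like this gets expensive.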

I checked the device parameters with deviceQueryDrv.exe and deviceQuery.exe. The parameters are the same as under Linux, where the “Pageable Memory” feature works correctly.
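Besides the deviceQuery output, the unified memory capabilities can also be queried directly from the runtime. The following is a minimal sketch, assuming device 0 is the GPU in question. The three attribute names exist in the CUDA 8.0 runtime API; how to read the values (in particular that `cudaDevAttrConcurrentManagedAccess == 0` means the driver is using the older, pre-Pascal style of unified memory) is my understanding of the documentation, not something verified on these machines.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    const int dev = 0;  // assumption: the GPU of interest is device 0

    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, dev);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    int managed = 0, concurrent = 0, pageable = 0;
    // Device can allocate managed (unified) memory at all.
    cudaDeviceGetAttribute(&managed, cudaDevAttrManagedMemory, dev);
    // Device can access managed memory concurrently with the CPU.
    cudaDeviceGetAttribute(&concurrent, cudaDevAttrConcurrentManagedAccess, dev);
    // Device can access ordinary pageable host memory without registering it.
    cudaDeviceGetAttribute(&pageable, cudaDevAttrPageableMemoryAccess, dev);

    printf("%s (compute capability %d.%d)\n", prop.name, prop.major, prop.minor);
    printf("  managedMemory           = %d\n", managed);
    printf("  concurrentManagedAccess = %d\n", concurrent);
    printf("  pageableMemoryAccess    = %d\n", pageable);
    return 0;
}
```

Running the same binary on the Windows and Linux installations and comparing the three values may show a difference that the general device parameters do not.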

What I can rule out:
Zero-copy memory behavior due to multiple GPUs => I don’t have any multi-GPU setups.
IOMMU (Intel VT-d) is deactivated on the systems because of known issues, see the CUDA documentation. (Unified memory is not currently supported with IOMMU. The workaround is to disable IOMMU in the BIOS. Please refer to the vendor documentation for the steps to disable it in the BIOS.)

Does anyone have an idea what could be wrong?
Thanks for your help.

Please excuse my English, I’m not a native speaker.

Hi,

It is a bug on NVIDIA’s side on Windows systems, which occurs with the Pascal architecture.

I have known this for a few days, but could not write it here because I was on vacation without an internet connection.

For details see the comments on: https://devblogs.nvidia.com/parallelforall/unified-memory-cuda-beginners/
where Mark Harris from NVIDIA confirms the bug. It should be fixed in CUDA 9. He also says that it should be communicat