Usage of Unified Memory on R28.1 vs R24.2.1

Hi Guys,

I have built a CUDA-based application that uses Unified Memory to hold the output of a kernel. Once the kernel has written its output, I try to read that output on the CPU.

When I do this with cudaMallocManaged, it works well on R24.2.1 (TX1 platform) but gives a “Bus Error” on R28.1 (TX2 platform), and the pointer is reported as “Cannot be accessed” right after the kernel call.

Kindly let me know where I could be making a mistake. Please help me out.



Usually, a Bus Error occurs when the CPU and GPU access unified memory concurrently.
Please remember to call cudaDeviceSynchronize() after the kernel launch and before accessing the memory from the CPU.
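
For reference, here is a minimal sketch of the pattern, assuming a simple one-dimensional output buffer. The kernel `fill`, the helper `run_managed_example`, and the buffer size are hypothetical names and values chosen for illustration, not taken from your application. The kernel launch returns immediately, so the CPU only reads the managed buffer after cudaDeviceSynchronize() has returned successfully.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: writes 2 * i into out[i].
__global__ void fill(int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2 * i;
}

// Hypothetical helper: returns 0 on success, 1 on a CUDA error or wrong data.
int run_managed_example(int n) {
    int* out = nullptr;
    if (cudaMallocManaged(&out, n * sizeof(int)) != cudaSuccess) return 1;

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    fill<<<blocks, threads>>>(out, n);  // asynchronous launch

    // Catch launch errors, then wait for the kernel to finish.
    // The CPU must not touch `out` before this point.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(out);
        return 1;
    }

    // Safe to read on the CPU now that the GPU is idle.
    int bad = 0;
    for (int i = 0; i < n; ++i) {
        if (out[i] != 2 * i) ++bad;
    }

    cudaFree(out);
    return bad == 0 ? 0 : 1;
}

int main() {
    int rc = run_managed_example(1 << 16);
    puts(rc == 0 ? "ok" : "failed");
    return rc;
}
```

If your code already follows this order and still fails, the cause is probably elsewhere, for example another stream or thread still using the buffer while the CPU reads it.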

If the issue persists, could you share an example with us so we can reproduce it?