Is cudaMallocManaged with the cudaMemAttachHost flag faster than malloc?

Hi,
I am developing an application on a Jetson TK1. I created an array using cudaMallocManaged with the cudaMemAttachHost flag. When I use that array in a computation, it is faster than using an array allocated with plain malloc. I am wondering what could be the reason for this behavior. Could someone please help me understand this?

Thanks
sivaramakrishna

Hi sivaramakrishna,

I'm not sure what your use case is, but memory allocated with cudaMallocManaged is page-locked, as opposed to the regular pageable host memory allocated by malloc(). Using page-locked host memory has several benefits; see http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#page-locked-host-memory
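
For reference, here is a minimal sketch of the two allocation styles being compared. It assumes the array holds floats and is only touched by the CPU; the helper names host_sum and compare_allocations are made up for illustration and are not from your application. It only shows how each buffer is created and used, not why one is faster on your board.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: sums n floats on the CPU.
// The same routine is used for both buffers so the work is identical.
static double host_sum(const float *data, size_t n) {
    double total = 0.0;
    for (size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// Hypothetical helper: allocates one managed buffer (attached to the host)
// and one ordinary malloc buffer, fills both with the same values, and
// returns true if both allocations succeed and the CPU sums match.
bool compare_allocations(size_t n) {
    float *managed = nullptr;
    cudaError_t err = cudaMallocManaged((void **)&managed,
                                        n * sizeof(float),
                                        cudaMemAttachHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMallocManaged failed: %s\n",
                cudaGetErrorString(err));
        return false;
    }

    float *pageable = (float *)malloc(n * sizeof(float));
    if (pageable == nullptr) {
        cudaFree(managed);
        return false;
    }

    // No kernel has been launched, so the host can write the managed
    // buffer directly, exactly like the malloc buffer.
    for (size_t i = 0; i < n; ++i) {
        managed[i] = 1.0f;
        pageable[i] = 1.0f;
    }

    bool same = (host_sum(managed, n) == host_sum(pageable, n));

    free(pageable);
    cudaFree(managed);  // managed memory is released with cudaFree
    return same;
}

int main() {
    bool ok = compare_allocations(1 << 20);
    printf("allocations %s\n", ok ? "matched" : "failed");
    return ok ? 0 : 1;
}
```

If you want to measure the difference yourself, you could time the host_sum call on each buffer separately, for example with std::chrono.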

Thanks