I want page-locked memory that is both portable and mapped in my multithreaded program.
So, can I allocate it like this: cudaHostAlloc((void **)&address, size, cudaHostAllocPortable | cudaHostAllocMapped)?
But when I call cudaHostAlloc() in the main thread and then call cudaHostGetDevicePointer() in the child threads, the call fails.
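A minimal sketch of the pattern described above, assuming pthreads and one child thread per GPU. It is not the exact code from my program: the names WorkerArg, worker, share_mapped_buffer and MAX_GPUS are made up for illustration. The cudaSetDeviceFlags(cudaDeviceMapHost) calls are an assumption taken from the runtime documentation's requirement for mapped memory; whether they are the missing piece here is not confirmed.

```cuda
#include <cstddef>
#include <cstdio>
#include <pthread.h>
#include <cuda_runtime.h>

#define MAX_GPUS 8  // hypothetical upper bound on the number of worker threads

// Hypothetical per-thread argument: the GPU to use and the shared host buffer.
struct WorkerArg {
    int device;          // GPU index this thread binds to
    void *host_ptr;      // pinned buffer allocated by the main thread
    void *dev_ptr;       // device-side alias, filled in by the worker
    cudaError_t status;  // result of the worker's CUDA calls
};

static void *worker(void *p) {
    WorkerArg *arg = static_cast<WorkerArg *>(p);
    arg->dev_ptr = NULL;

    // Bind this thread to its own GPU.
    arg->status = cudaSetDevice(arg->device);
    if (arg->status != cudaSuccess) return NULL;

    // Assumption: request host-memory mapping for this device. The call can
    // return an error if the device is already initialized, so the result is
    // ignored and the error state is cleared.
    cudaSetDeviceFlags(cudaDeviceMapHost);
    cudaGetLastError();

    // Ask for the device pointer that aliases the portable, mapped buffer.
    arg->status = cudaHostGetDevicePointer(&arg->dev_ptr, arg->host_ptr, 0);
    if (arg->status != cudaSuccess)
        fprintf(stderr, "GPU %d: %s\n", arg->device,
                cudaGetErrorString(arg->status));
    return NULL;
}

// Returns how many threads obtained a device pointer, or -1 on setup failure.
int share_mapped_buffer(size_t bytes) {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) return -1;
    if (count > MAX_GPUS) count = MAX_GPUS;

    // Assumption: enable mapping in the main thread before allocating.
    cudaSetDeviceFlags(cudaDeviceMapHost);
    cudaGetLastError();

    void *host = NULL;
    if (cudaHostAlloc(&host, bytes,
                      cudaHostAllocPortable | cudaHostAllocMapped) != cudaSuccess)
        return -1;

    pthread_t threads[MAX_GPUS];
    WorkerArg args[MAX_GPUS];
    bool started[MAX_GPUS];
    for (int i = 0; i < count; ++i) {
        args[i].device = i;
        args[i].host_ptr = host;
        args[i].dev_ptr = NULL;
        args[i].status = cudaErrorUnknown;
        started[i] = (pthread_create(&threads[i], NULL, worker, &args[i]) == 0);
    }

    int ok = 0;
    for (int i = 0; i < count; ++i) {
        if (!started[i]) continue;
        pthread_join(threads[i], NULL);
        if (args[i].status == cudaSuccess) ++ok;
    }

    cudaFreeHost(host);
    return ok;
}
```

Calling share_mapped_buffer(1 << 20) from main() and comparing the return value with the number of GPUs shows which threads, if any, fail to get a device pointer.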
By the way, I am using a GTX 295, which has 2 GPUs.
Does anyone know how to do this?