How to pass two flags to cudaHostAlloc()?

I want the page-locked memory in my multithreaded program to be both portable and mapped.
So, can I do this: cudaHostAlloc((void **)&address, size, cudaHostAllocPortable | cudaHostAllocMapped)?
But when I call cudaHostAlloc in the main thread and then cudaHostGetDevicePointer() in a child thread, it fails.

By the way, I am using a GTX 295, which has 2 GPUs.
Does anyone know how to do this?
Thanks.
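
For what it's worth, the two flags are bitmask values, so combining them with a bitwise OR is the intended usage. A minimal sketch (size and variable names are made up for illustration; error checking added):

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    float *host_ptr = NULL;
    size_t size = 1024 * sizeof(float);  // hypothetical size

    // Portable: pinned for all CUDA contexts. Mapped: visible to the device.
    cudaError_t err = cudaHostAlloc((void **)&host_ptr, size,
                                    cudaHostAllocPortable | cudaHostAllocMapped);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaHostAlloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Retrieve the device pointer that aliases the mapped host allocation.
    float *dev_ptr = NULL;
    err = cudaHostGetDevicePointer((void **)&dev_ptr, host_ptr, 0);
    if (err != cudaSuccess)
        fprintf(stderr, "cudaHostGetDevicePointer failed: %s\n",
                cudaGetErrorString(err));

    cudaFreeHost(host_ptr);
    return 0;
}
```

Note that cudaHostGetDevicePointer only succeeds if the device supports mapped memory and the flag cudaDeviceMapHost was set for the calling thread's context, which is discussed below.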

CUDA contexts are attached to a given thread, so you probably cannot do that across two different threads.

But section 3.2.5.1 of the manual says that portable page-locked memory can be shared and used by all threads.

Because I will be processing many big data sets, larger than 512 MB or 1 GB, it is impossible to copy them to the device or to create a copy of them for each thread.

But the whole point of the cudaHostAllocPortable flag is to make the memory pinned in all contexts :)

Interestingly, this note about error codes in the CUDA reference manual seems to indicate that you should be able to map memory allocated in a different thread:

(emphasis added)

I unfortunately don’t have access to a system with both a G200 board and CUDA 2.2, so I can’t test this out for myself.

Section 3.2.5.3 of the Programming guide states

So do not forget to call the

cudaSetDeviceFlags(cudaDeviceMapHost)

function!
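
A sketch of the pattern, assuming a pre-CUDA-4.0 runtime where each host thread gets its own context, so the flag must be set in every thread that will map host memory, before that thread makes any other CUDA call (error handling trimmed; pthreads used purely for illustration):

```cuda
#include <pthread.h>
#include <stdio.h>
#include <cuda_runtime.h>

static void *worker(void *arg)
{
    // Must come before any other CUDA call in this thread,
    // since it configures the context created for this thread.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    float *host_ptr = (float *)arg;
    float *dev_ptr = NULL;
    cudaError_t err = cudaHostGetDevicePointer((void **)&dev_ptr, host_ptr, 0);
    if (err != cudaSuccess)
        fprintf(stderr, "worker: %s\n", cudaGetErrorString(err));
    return NULL;
}

int main(void)
{
    // The same call is needed in the main thread as well.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    float *host_ptr = NULL;
    cudaHostAlloc((void **)&host_ptr, 1024 * sizeof(float),
                  cudaHostAllocPortable | cudaHostAllocMapped);

    pthread_t tid;
    pthread_create(&tid, NULL, worker, host_ptr);
    pthread_join(tid, NULL);

    cudaFreeHost(host_ptr);
    return 0;
}
```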

Regards

Navier

You are right.

I was only calling cudaSetDeviceFlags in the child threads, not in the main thread.

After I added it in the main thread, it works well.

Thank you.

Another question: the manual mentions that using page-locked memory can affect the performance of the whole system.

I know little about how page-locked memory works; my feeling is that as long as there is enough memory, using page-locked memory shouldn't matter.

Am I right?

If I have 8 GB or 16 GB of RAM, is there any impact if I allocate about 4 GB of page-locked memory?