I have several questions about memory allocation on the device and CUDA contexts.
My application uses two CPU threads. One thread allocates several arrays on the GPU before the other thread needs to access any of them; when the second thread uses those pointers, I get an invalid device pointer error. My guess is that this is related to the one-to-one correspondence between CPU threads and CUDA contexts.
- Is this correct?
- Is there a way to avoid this problem without changing the architecture of my solution?
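To make the setup concrete, here is roughly what the two threads do. This is a simplified sketch of my code, not the real thing: the names, sizes, and use of pthreads are mine, and error handling is trimmed.

```cuda
// Simplified sketch of my two-thread setup (names and sizes are mine).
#include <cuda_runtime.h>
#include <pthread.h>
#include <stdio.h>

static float *d_array = NULL;   // device pointer shared between CPU threads

// Thread 1: allocates the array on the GPU.
static void *alloc_thread(void *arg)
{
    cudaMalloc((void **)&d_array, 1024 * sizeof(float));
    return NULL;
}

// Thread 2: later tries to use the pointer allocated by thread 1.
static void *use_thread(void *arg)
{
    float h_array[1024];
    // This is where I get "invalid device pointer".
    cudaError_t err = cudaMemcpy(h_array, d_array, sizeof(h_array),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess)
        printf("cudaMemcpy: %s\n", cudaGetErrorString(err));
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, alloc_thread, NULL);
    pthread_join(t1, NULL);   // the allocation definitely finishes first
    pthread_create(&t2, NULL, use_thread, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```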
If my application has several CPU threads executing the same CUDA code:
- Will the same number of CUDA contexts as CPU threads be created automatically?
- If that doesn’t happen automatically, how can it be done? Should I use cuCtxCreate()? Is there a Context Management example included in the CUDA SDK?
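In case it clarifies the cuCtxCreate() question: this is the kind of driver API sequence I imagine, where one context is created, detached, and then attached by another thread so both see the same device pointers. It is an untested sketch collapsed into a single function, with comments marking where the thread boundary would be; am I on the right track?

```cuda
// Untested sketch: migrating one driver-API context between CPU threads
// with cuCtxPopCurrent()/cuCtxPushCurrent() (collapsed into one function).
#include <cuda.h>

int main(void)
{
    CUcontext ctx;
    CUdevice dev;
    CUdeviceptr d_ptr;

    cuInit(0);
    cuDeviceGet(&dev, 0);

    /* --- would run in the allocating thread --- */
    cuCtxCreate(&ctx, 0, dev);        // context becomes current to this thread
    cuMemAlloc(&d_ptr, 1024 * sizeof(float));
    cuCtxPopCurrent(NULL);            // detach the context from this thread

    /* --- would run in the consuming thread --- */
    cuCtxPushCurrent(ctx);            // attach the same context here
    cuMemFree(d_ptr);                 // d_ptr should be valid in this thread
    cuCtxPopCurrent(NULL);

    cuCtxDestroy(ctx);
    return 0;
}
```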
Thanks a lot,