I have a problem with memory allocation on the card. My main program allocates device memory for my input data with cudaMalloc and copies the input data into it. The pointer returned by cudaMalloc (in the main program) is then passed as a parameter to a host thread, which launches the kernel on the device. The problem is that the kernel reports an “invalid device pointer” error. If I move the cudaMalloc and copy calls from the main program into the thread that launches the kernel, everything works fine.
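To make the setup concrete, here is a simplified sketch of the pattern described above. It is not my real code: the kernel `scale`, the function `worker`, the buffer size, and the use of `std::thread` for the host thread are all placeholders chosen for illustration, and `cudaDeviceSynchronize` assumes a toolkit version that provides it.

```cuda
#include <cstdio>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

// Placeholder kernel (hypothetical): doubles each element in place.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Host thread that receives the device pointer allocated in main()
// and launches the kernel with it.
void worker(float *d_data, int n) {
    scale<<<(n + 255) / 256, 256>>>(d_data, n);
    cudaError_t err = cudaGetLastError();                    // launch errors
    if (err == cudaSuccess) err = cudaDeviceSynchronize();   // execution errors
    printf("worker: %s\n", cudaGetErrorString(err));
}

int main() {
    const int n = 1024;  // placeholder size
    std::vector<float> h_data(n, 1.0f);
    float *d_data = nullptr;

    // Allocation and upload happen in the main thread.
    cudaMalloc((void **)&d_data, n * sizeof(float));
    cudaMemcpy(d_data, h_data.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // The device pointer is handed to a second host thread.
    std::thread t(worker, d_data, n);
    t.join();

    cudaFree(d_data);
    return 0;
}
```

In terms of this sketch, the variant that works for me is the one where the cudaMalloc and cudaMemcpy calls are moved from main() into worker().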
Are there any restrictions on using memory allocated with cudaMalloc in one host thread from a different host thread?
Thanks in advance and best regards,