Hi,
I’m trying to use cudaHostAlloc to allocate page-locked host memory, to speed up transfers to device memory and, later on, to use streams.
cutilSafeCall(cudaHostAlloc( (void**)&db, MAX_DB_SIZE * sizeof(char), cudaHostAllocDefault ) );
The weird thing is that at first it ran fine: with the page-locked allocation, the time to transfer the data (40 MB) dropped from 150 ms to 0.25 ms (an amazingly huge improvement). But after several tries, I got this error:
“cutilCheckMsg() CUTIL CUDA error : Kernel execution failed : setting the device when a process is active is not allowed.”
It is reported by the check that runs after the kernel launch.
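For anyone trying to reproduce this, here is a minimal sketch of the same pattern using only the CUDA runtime API instead of cutil. It is not the original program. The CHECK macro, the touch kernel, the 40 MB value for MAX_DB_SIZE and the use of device 0 are assumptions made for illustration. The error text matches cudaErrorSetOnActiveProcess, which the runtime documents for device-management calls (such as cudaSetDeviceFlags) made after the runtime has already been initialized by other work such as a memory allocation. So one possible cause, not confirmed by anything above, is a device-setup call that runs after cudaHostAlloc. The sketch selects the device first and checks every call, so a failure is reported where it happens rather than at a later check after the kernel.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical stand-in for cutilSafeCall: abort with file/line on any runtime error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Assumed size: 40 MB, matching the transfer size mentioned in the post.
static const size_t MAX_DB_SIZE = 40u * 1024u * 1024u;

// Placeholder kernel (not from the post); grid-stride loop over the buffer.
__global__ void touch(char *data, size_t n)
{
    size_t stride = (size_t)gridDim.x * blockDim.x;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        data[i] += 1;
}

int main()
{
    // Do device selection before any allocation or other runtime work.
    CHECK(cudaSetDevice(0));  // device 0 is an assumption

    char *db = NULL;    // page-locked host buffer
    char *d_db = NULL;  // device buffer
    CHECK(cudaHostAlloc((void **)&db, MAX_DB_SIZE * sizeof(char), cudaHostAllocDefault));
    CHECK(cudaMalloc((void **)&d_db, MAX_DB_SIZE * sizeof(char)));
    memset(db, 0, MAX_DB_SIZE);

    // Time the host-to-device copy with events; the timing is only read
    // after synchronizing on the stop event.
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));
    CHECK(cudaEventRecord(start, 0));
    CHECK(cudaMemcpy(d_db, db, MAX_DB_SIZE, cudaMemcpyHostToDevice));
    CHECK(cudaEventRecord(stop, 0));
    CHECK(cudaEventSynchronize(stop));
    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    printf("H2D copy of %zu bytes took %.3f ms\n", MAX_DB_SIZE, ms);

    touch<<<1024, 256>>>(d_db, MAX_DB_SIZE);
    CHECK(cudaGetLastError());       // launch errors
    CHECK(cudaDeviceSynchronize());  // execution errors

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(d_db));
    CHECK(cudaFreeHost(db));  // page-locked memory is released with cudaFreeHost, not free
    return 0;
}
```

If this version runs cleanly on the same machine, the order of the device-setup and allocation calls in the real program would be the first thing to compare.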
Does anyone have experience with this? Please help me out… Thanks