My code is structured like this:
int *arg1, *var1, *arg2;
cudaStream_t cstream1;
cudaStreamCreate(&cstream1);
cudaKernel <<< gridsize, blocksize, 0, cstream1 >>>(arg1, arg2);
cudaMemcpyAsync( var1, arg2, size, cstream1);
But this gives a runtime CUDA error: invalid argument.
If I call cudaStreamSynchronize(cstream1); after the kernel launch, it still gives the same error.
But if I call cudaStreamSynchronize(cstream1); and then copy the memory using cudaMemcpy(), it runs.
Why?
I’m using a GeForce 9400M.
Please help.
DSCH
You are calling the memcpy function incorrectly: the copy direction (the cudaMemcpyKind argument) is missing. Try
cudaMemcpyAsync( var1, arg2, size, direction, cstream1);
with direction = cudaMemcpyHostToDevice or cudaMemcpyDeviceToHost.
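For reference, here is a minimal sketch of how the whole sequence could look with the direction argument in place. It is not your actual code: the kernel addOne, the CHECK macro, and the buffer size are made up for illustration, and I am assuming that arg1 and arg2 are device buffers and var1 is a host buffer, so the direction is cudaMemcpyDeviceToHost. The host buffer is allocated with cudaMallocHost because cudaMemcpyAsync needs page-locked host memory to actually overlap with other work.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): abort with a message if a runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical stand-in for cudaKernel from the question: out[i] = in[i] + 1.
__global__ void addOne(const int *in, int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] + 1;
}

int main()
{
    const int n = 1024;                  // assumed element count
    const size_t size = n * sizeof(int);

    int *arg1, *arg2;                    // assumed to be device buffers
    int *var1;                           // assumed to be a host buffer
    CHECK(cudaMalloc((void **)&arg1, size));
    CHECK(cudaMalloc((void **)&arg2, size));
    CHECK(cudaMallocHost((void **)&var1, size));   // page-locked host memory

    cudaStream_t cstream1;
    CHECK(cudaStreamCreate(&cstream1));

    // Zero the input, run the kernel, and copy the result back, all in the same stream.
    CHECK(cudaMemsetAsync(arg1, 0, size, cstream1));
    addOne<<<(n + 255) / 256, 256, 0, cstream1>>>(arg1, arg2, n);
    CHECK(cudaGetLastError());           // catches launch configuration errors

    // The fourth argument is the copy direction; the stream comes last.
    CHECK(cudaMemcpyAsync(var1, arg2, size, cudaMemcpyDeviceToHost, cstream1));

    // var1 must not be read on the host until the stream has finished.
    CHECK(cudaStreamSynchronize(cstream1));
    printf("var1[0] = %d\n", var1[0]);

    CHECK(cudaStreamDestroy(cstream1));
    CHECK(cudaFreeHost(var1));
    CHECK(cudaFree(arg1));
    CHECK(cudaFree(arg2));
    return 0;
}
```

Because the kernel and the copy are queued in the same stream, the copy starts only after the kernel has finished, so no extra synchronization is needed between the two. The cudaStreamSynchronize call is only there so the host does not read var1 too early.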
Regards!