Say I have a pointer to device memory and I want to use cuMemcpyDtoD. Specifically, the source region covers the last two-thirds of the buffer, and I want to copy it to the beginning of the buffer. In other words, the source and destination regions overlap. Can I rely on cuMemcpyDtoD handling this correctly?
No.
I didn’t find a formal statement for cuMemcpyDtoD but a formal statement is given here:
“The memory areas may not overlap.”
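You could instead perform several copies of at most (srcptr - destptr) elements each, one after another, working from front to back. Within each individual copy the source and destination then do not overlap, and each copy only overwrites data that an earlier copy has already moved.
Something like: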
int n = 30;
int* array;
...
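// want to move array[10..29] to the front (array is a device pointer)
// the gap between source and destination is 10 elements, so copy 10 at a time
cudaMemcpy(array, array + 10, sizeof(int) * 10, cudaMemcpyDeviceToDevice);
cudaMemcpy(array + 10, array + 20, sizeof(int) * 10, cudaMemcpyDeviceToDevice);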
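The two-call example generalizes to any size. Below is a minimal sketch of that loop in plain C, assuming the data is moved toward the front (src > dst). The names copy_toward_front, copy_fn and host_copy are made up for illustration and are not part of CUDA. host_copy is only a host-memory stand-in so the loop can be tried without a GPU.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: copies `bytes` bytes from src to dst and
 * returns 0 on success. It is only ever called with non-overlapping regions. */
typedef int (*copy_fn)(void *dst, const void *src, size_t bytes);

/* Moves `bytes` bytes from src to dst, where src lies after dst in the same
 * buffer. Each chunk is at most (src - dst) bytes, so no single copy overlaps.
 * Returns 0 on success, -1 if src lies before dst, or the callback's error. */
int copy_toward_front(char *dst, const char *src, size_t bytes, copy_fn copy)
{
    if (src == dst) return 0;   /* nothing to move */
    if (src < dst)  return -1;  /* moving toward the back needs back-to-front order */

    size_t gap  = (size_t)(src - dst);
    size_t done = 0;
    while (done < bytes) {
        size_t left  = bytes - done;
        size_t chunk = left < gap ? left : gap;
        int err = copy(dst + done, src + done, chunk);
        if (err != 0) return err;
        done += chunk;
    }
    return 0;
}

/* Host-only stand-in for the device copy, used to try the loop on the CPU. */
int host_copy(void *dst, const void *src, size_t bytes)
{
    memcpy(dst, src, bytes);
    return 0;
}
```

On the device, the callback would instead wrap cudaMemcpy with cudaMemcpyDeviceToDevice (or cuMemcpyDtoD in the driver API) and return its status. Note that a small gap means many small copies, so if the gap is tiny compared to the region, a temporary buffer or a custom kernel may be the better choice.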