Say I have a pointer to device memory and I want to use cuMemcpyDtoD. In particular, the source region covers the last two-thirds of the memory region and I want to copy that to the beginning of the entire region. In other words, the source and target memory overlap. Can I rely on cuMemcpyDtoD doing this right?
I didn’t find a formal statement for cuMemcpyDtoD but a formal statement is given here:
“The memory areas may not overlap.”
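You could perform multiple copies of (srcptr - destptr) elements each, one after another, working from the front of the region to the back. Each chunk is then no longer than the distance between source and destination, so no single copy has overlapping source and target.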
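    int n = 30;
    int* array;
    ...
    // want to copy array[10..29] to the front
    // first chunk: array[10..19] -> array[0..9]
    cudaMemcpy(array, array + 10, sizeof(int) * 10, cudaMemcpyDeviceToDevice);
    // second chunk: array[20..29] -> array[10..19]
    cudaMemcpy(array + 10, array + 20, sizeof(int) * 10, cudaMemcpyDeviceToDevice);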
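The same idea can be written once for any size and offset. What follows is only a sketch: copy_down_in_chunks is a made-up helper name, not something CUDA provides, and the actual copy is passed in as a callable so the chunking logic stays separate from the CUDA call. It assumes the destination lies below the source in the same buffer, and that the copies execute in the order they are issued.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of CUDA): moves `count` elements from `src`
// down to `dst` within one buffer, where dst < src and the ranges may overlap.
// `copy(d, s, n)` must copy n elements between NON-overlapping ranges,
// e.g. a lambda wrapping
//   cudaMemcpy(d, s, n * sizeof(T), cudaMemcpyDeviceToDevice);
template <typename T, typename CopyFn>
void copy_down_in_chunks(T* dst, const T* src, std::size_t count, CopyFn copy) {
    assert(dst < src);  // only the "shift towards the front" case is handled
    const std::size_t gap = static_cast<std::size_t>(src - dst);
    // Each chunk is at most `gap` elements long, so its source and target
    // never overlap. Going front to back, a chunk only overwrites elements
    // that an earlier chunk has already read.
    for (std::size_t done = 0; done < count; done += gap) {
        const std::size_t n = std::min(gap, count - done);
        copy(dst + done, src + done, n);
    }
}
```

With the numbers from the example above (gap of 10, 20 elements to move) this issues exactly the two copies shown. If the gap is small compared to the region, the number of calls grows accordingly, and copying through a temporary buffer may be the better choice.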