The explanation of cudaMemcpyToArray in the CUDA reference manual fails to answer a couple of important questions. I’ve copied and pasted it here:
My questions are:
What are the units of dstX and dstY? I think that dstX is in bytes and dstY is just the row number, but I would appreciate it if someone could confirm that.
I think that the copy happens in row major order. Is that correct?
In other words, a call to cudaMemcpyToArray will copy count bytes into row dstY, starting dstX bytes from the beginning of the row. Is this correct?
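To make that interpretation concrete, here is a minimal host-side sketch in plain C++ of what I think the call does. It is not the CUDA runtime: ModelArray and modelMemcpyToArray are made-up names for illustration, a real CUDA array has an opaque layout, and the spill into the following rows is my assumption about what row-major order would imply.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host-side stand-in for a 2D CUDA array: `height` rows of
// `widthBytes` bytes each, stored contiguously in row-major order.
// (A real CUDA array has an opaque layout; this only models the indexing.)
struct ModelArray {
    std::size_t widthBytes;
    std::size_t height;
    std::vector<unsigned char> data;

    ModelArray(std::size_t w, std::size_t h)
        : widthBytes(w), height(h), data(w * h, 0) {}
};

// Models the interpretation in the question (an assumption, not confirmed):
// dstX is a byte offset within the row, dstY is a row index, and `count`
// bytes are written in row-major order, continuing into the following rows
// if they do not fit in the remainder of row dstY.
void modelMemcpyToArray(ModelArray& dst, std::size_t dstX, std::size_t dstY,
                        const void* src, std::size_t count) {
    // The starting position must lie inside the array.
    assert(dstX < dst.widthBytes && dstY < dst.height);
    std::size_t start = dstY * dst.widthBytes + dstX;
    // The copy must not run past the last row.
    assert(start + count <= dst.data.size());
    std::memcpy(dst.data.data() + start, src, count);
}
```

Under this model, with rows 8 bytes wide, copying 6 bytes with dstX = 4 and dstY = 1 would fill bytes 4 to 7 of row 1 and then bytes 0 and 1 of row 2. Whether the real function behaves this way is exactly what I am asking.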