I was wondering what would be the most efficient way to perform memory swapping for two arrays stored on the device.
For example, I have two large vectors “p” and “pz” and I wish to swap their contents, so that p ends up holding what pz held and vice versa.
Of course, this could be accomplished with cudaMemcpy and an intermediate temporary vector. However, I am at the edge of my device’s memory capacity, so the temporary would have to live on the host, and copying data back and forth is detrimental to the performance of my algorithm.
Is there a command to explicitly perform memory swapping on the device?
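CUBLAS provides vector swap functions called cublas{S|D|C|Z}swap() (single precision, double precision, single-precision complex, and double-precision complex, respectively), so if your vectors are made up of floats, doubles, or complex numbers you could simply invoke those.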