Can I reduce all my memory copy routines to cuMemcpy3D?


I want to get rid of the many different memory copy calls in my CUDA wrappers to make them easier to understand.
I would therefore like to use cuMemcpy3D for every memory copy, because it seems to be the most generic of the
copy functions. Is that true, and are there any performance drawbacks compared to the other copy routines?
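
To make the idea concrete, here is a minimal host-only sketch in plain C of what I mean by "generic". It is not CUDA code and it says nothing about performance. The struct Copy3D and the functions copy3d, copy1d and copy2d are hypothetical names I made up for illustration; only the field names (WidthInBytes, Height, Depth, srcPitch, srcHeight, dstPitch, dstHeight) are borrowed from CUDA_MEMCPY3D. It shows that a 1D copy and a 2D pitched copy are just a 3D copy with some extents set to 1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host-only model of a 3D copy descriptor.
   Field names mirror CUDA_MEMCPY3D, but this is NOT the driver API. */
typedef struct {
    const unsigned char *src;
    size_t srcPitch;      /* bytes between consecutive source rows */
    size_t srcHeight;     /* rows per source slice */
    unsigned char *dst;
    size_t dstPitch;      /* bytes between consecutive destination rows */
    size_t dstHeight;     /* rows per destination slice */
    size_t WidthInBytes;  /* bytes copied per row */
    size_t Height;        /* rows copied per slice */
    size_t Depth;         /* slices copied */
} Copy3D;

/* Copy Depth slices of Height rows of WidthInBytes bytes each. */
static void copy3d(const Copy3D *p)
{
    /* A row must fit inside the pitch on both sides. */
    assert(p->WidthInBytes <= p->srcPitch && p->WidthInBytes <= p->dstPitch);
    for (size_t z = 0; z < p->Depth; ++z) {
        for (size_t y = 0; y < p->Height; ++y) {
            const unsigned char *s = p->src + (z * p->srcHeight + y) * p->srcPitch;
            unsigned char *d = p->dst + (z * p->dstHeight + y) * p->dstPitch;
            memcpy(d, s, p->WidthInBytes);
        }
    }
}

/* A linear copy is a 3D copy with one row and one slice. */
static void copy1d(void *dst, const void *src, size_t bytes)
{
    Copy3D p = { src, bytes, 1, dst, bytes, 1, bytes, 1, 1 };
    copy3d(&p);
}

/* A pitched 2D copy is a 3D copy with a single slice. */
static void copy2d(void *dst, size_t dstPitch,
                   const void *src, size_t srcPitch,
                   size_t widthInBytes, size_t height)
{
    Copy3D p = { src, srcPitch, height, dst, dstPitch, height,
                 widthInBytes, height, 1 };
    copy3d(&p);
}
```

In a real wrapper I assume the same pattern would apply: zero-initialize a CUDA_MEMCPY3D, set srcMemoryType and dstMemoryType together with the matching srcHost/srcDevice/srcArray and dstHost/dstDevice/dstArray fields, set Height and Depth to 1 for a linear copy, and pass the struct to cuMemcpy3D. Whether that costs anything compared to the dedicated calls is exactly what I am asking.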


I’m still interested in this question. Any help or suggestions on this?

Still no comments on this?