In-Kernel memcpy with different memory types & shared memory question

Hello,

Does the in-kernel memcpy function work when it is passed pointers to different types of memory (global, local, or shared)?

I am also curious how the pointers are differentiated, for example when I pass them to device functions.

I have never tried this, but is it possible to allocate shared memory in a device function that is called by a global function?

Yes, the in-kernel memcpy works with pointers to any of those memory spaces.

Pointers can either be identified at compile time as belonging to a specific memory space, or be "generic".
When the compiler can identify the space a pointer belongs to, it may use instructions specific to that space. Otherwise, code using the generic pointer is compiled to instructions that can handle the different spaces.

If you want to learn more about this, the PTX guide is a good resource:

https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#state-spaces-types-and-variables
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#generic-addressing
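
To make the above concrete, here is a minimal sketch in which one device function takes plain pointers and is called with global, shared, and local buffers in turn. The names `copy_ints`, `copy_kernel`, and `run_copy_demo` are made up for this example, and most error checking is left out. Whether the compiler uses space-specific or generic instructions here depends on inlining and optimization, so treat it as an illustration rather than a statement about the generated code.

```cuda
#include <cstring>
#include <cuda_runtime.h>

constexpr int N = 8;

// Hypothetical helper: it receives plain int pointers, so by itself it does
// not say which memory space each pointer refers to.
__device__ void copy_ints(int *dst, const int *src, int count)
{
    memcpy(dst, src, count * sizeof(int));  // in-kernel memcpy
}

__global__ void copy_kernel(const int *g_in, int *g_out)
{
    __shared__ int s_buf[N];  // shared memory
    int l_buf[N];             // per-thread local array

    if (threadIdx.x == 0) {
        copy_ints(s_buf, g_in, N);   // global -> shared
        copy_ints(l_buf, s_buf, N);  // shared -> local
        copy_ints(g_out, l_buf, N);  // local  -> global
    }
}

// Host-side driver: returns true if the data survived the round trip.
bool run_copy_demo()
{
    int h_in[N], h_out[N] = {0};
    for (int i = 0; i < N; ++i) h_in[i] = 10 * i;

    int *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, sizeof(h_in));
    cudaMalloc(&d_out, sizeof(h_out));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    copy_kernel<<<1, 1>>>(d_in, d_out);
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out),
                                 cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_out);
    return err == cudaSuccess && std::memcmp(h_in, h_out, sizeof(h_in)) == 0;
}
```

Calling `run_copy_demo()` from `main` and building with `nvcc` is enough to try it. Inspecting the PTX (`nvcc -ptx`) should show which loads and stores the compiler chose for each copy.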

Yes, it is possible to (statically) allocate shared memory in a device function.
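
A minimal sketch of that, assuming a block size fixed at compile time: the `__shared__` array is declared inside a `__device__` function, and a `__global__` function calls it. The names `block_sum`, `sum_kernel`, `run_sum_demo`, and `BLOCK` are hypothetical, and the kernel must be launched with exactly `BLOCK` threads per block for the indexing to be valid.

```cuda
#include <cuda_runtime.h>

constexpr int BLOCK = 64;  // assumed block size; must match the launch below

// Statically sized shared array declared inside a device function.
// There is one copy per block, shared by all threads of that block.
__device__ int block_sum(int value)
{
    __shared__ int scratch[BLOCK];

    scratch[threadIdx.x] = value;
    __syncthreads();

    // Tree reduction over the shared array.
    for (int stride = BLOCK / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride)
            scratch[threadIdx.x] += scratch[threadIdx.x + stride];
        __syncthreads();
    }
    return scratch[0];
}

__global__ void sum_kernel(int *out)
{
    int total = block_sum(threadIdx.x + 1);  // sums 1 + 2 + ... + BLOCK
    if (threadIdx.x == 0) *out = total;
}

// Host-side driver: returns the block sum, or -1 if the copy back fails.
int run_sum_demo()
{
    int *d_out = nullptr;
    int h_out = -1;
    cudaMalloc(&d_out, sizeof(int));

    sum_kernel<<<1, BLOCK>>>(d_out);
    if (cudaMemcpy(&h_out, d_out, sizeof(int),
                   cudaMemcpyDeviceToHost) != cudaSuccess)
        h_out = -1;

    cudaFree(d_out);
    return h_out;
}
```

With `BLOCK` set to 64, `run_sum_demo()` should return 2080 (the sum of 1 through 64) on a machine with a working CUDA device. Every thread in the block has to call `block_sum`, because it contains `__syncthreads()`.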

Thanks a lot, txbob.