Data transfer between GPUs

How can GPUs exchange data with each other in specific regions (the blue points) of an array, as in the following two two-dimensional arrays:

Does CUDA provide functions for this?

I’m missing context, but maybe look at using NVSHMEM (NVSHMEM | NVIDIA)?

Or use NCCL for direct GPU-to-GPU transfers? NVIDIA Collective Communications Library (NCCL) | NVIDIA Developer
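If the goal is simply to copy a rectangular sub-region of a 2D array from one GPU to another inside a single process, the plain CUDA runtime may already be enough. Below is a minimal sketch, not a definitive answer: it assumes two GPUs in one process, unified virtual addressing, and row-major arrays of the same size on both devices. The array size (`W`, `H`) and the region (`x0`, `y0`, `w`, `h`) are made-up values for illustration, since the original picture is not available here.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,          \
                    cudaGetErrorString(err_));                          \
            exit(EXIT_FAILURE);                                         \
        }                                                               \
    } while (0)

int main() {
    int ndev = 0;
    CHECK(cudaGetDeviceCount(&ndev));
    if (ndev < 2) {
        printf("This sketch needs at least two GPUs.\n");
        return 0;
    }

    // Hypothetical sizes: an 8x8 array and a 4x2 region starting at (x0, y0).
    const int W = 8, H = 8;
    const int x0 = 2, y0 = 3, w = 4, h = 2;
    const size_t bytes = size_t(W) * H * sizeof(float);
    const size_t pitch = size_t(W) * sizeof(float);  // row stride in bytes

    // Allocate one array per GPU.
    float *d0 = nullptr, *d1 = nullptr;
    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&d0, bytes));
    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&d1, bytes));
    CHECK(cudaMemset(d1, 0, bytes));  // destination starts as all zeros

    // Fill the source array on GPU 0 with 0, 1, 2, ...
    std::vector<float> host(size_t(W) * H);
    for (size_t i = 0; i < host.size(); ++i) host[i] = float(i);
    CHECK(cudaSetDevice(0));
    CHECK(cudaMemcpy(d0, host.data(), bytes, cudaMemcpyHostToDevice));

    // Optional: enable direct peer access when the hardware supports it.
    // Without it the copy below should still work, staged through the host.
    int can01 = 0;
    CHECK(cudaDeviceCanAccessPeer(&can01, 0, 1));
    if (can01) CHECK(cudaDeviceEnablePeerAccess(1, 0));

    // Copy only the w x h region from GPU 0 to the same position on GPU 1.
    // cudaMemcpyDefault lets the runtime infer the direction (requires UVA).
    const size_t off = size_t(y0) * W + x0;
    CHECK(cudaMemcpy2D(d1 + off, pitch, d0 + off, pitch,
                       w * sizeof(float), h, cudaMemcpyDefault));

    // Read GPU 1's array back and check that only the region changed.
    std::vector<float> out(size_t(W) * H);
    CHECK(cudaSetDevice(1));
    CHECK(cudaMemcpy(out.data(), d1, bytes, cudaMemcpyDeviceToHost));
    bool ok = true;
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x) {
            bool inside = x >= x0 && x < x0 + w && y >= y0 && y < y0 + h;
            float expect = inside ? host[size_t(y) * W + x] : 0.0f;
            if (out[size_t(y) * W + x] != expect) ok = false;
        }
    printf("region copy %s\n", ok ? "OK" : "FAILED");

    CHECK(cudaFree(d1));
    CHECK(cudaSetDevice(0));
    CHECK(cudaFree(d0));
    return ok ? 0 : 1;
}
```

For a contiguous block, `cudaMemcpyPeer` is the simpler call. If the GPUs are driven by different processes or sit on different nodes, that is where NCCL or NVSHMEM come in instead.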