CUDA + UCX: buffer size, buffer persistence

I am writing some CUDA-aware routines with Open MPI + UCX + CUDA. After reading the documentation and the UCX FAQ online, I could not find a few details, and maybe someone here knows the answers. If this is not the right place to ask, let me know and I will move to the UCX developer forums (which seem to be mostly for issue reports):

  • When copying data from one host’s memory to another host’s pinned memory, is the data written directly into the destination memory, or does it first go through an intermediate buffer? If it goes through a buffer, what is the size of that buffer?
  • If a buffer is used, how long does it persist in memory? That is, is it deallocated as soon as the transfer finishes?
  • When running ucx_info -d on a Tegra, should the output show any CUDA information, or does that only appear on a dGPU?