Pinned arrays cause performance degradation in host routines

Hi, I’m porting an application that uses asynchronous memory copies and kernels to offload calculations. I’ve noticed performance degradation in host routines related to I/O and MPI wrappers (CUDA Fortran) that use the pinned arrays. Is there any report or documentation about this?