PCIe slots local to CPU for each GPU

On the HPE DL380 Gen10, PCIe slots 1 and 2 are each local to their corresponding CPU socket. The NVIDIA documentation states that performance can improve if you pin the CPU to a GPU on the same bus. My understanding is that the VMware scheduler is not aware of PCIe locality, so pinning CPU to GPU manually is a must if you want to go this route.
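To make the idea concrete, here is a rough sketch of the mapping I have in mind. It assumes the pinning is done per VM through the `numa.nodeAffinity` advanced setting, and that slot 1 maps to NUMA node 0 and slot 2 to NUMA node 1. The slot-to-node table and the helper name are placeholders of mine, not something taken from HPE or NVIDIA documentation, so they would need to be verified on the actual host.

```python
# Hypothetical slot-to-NUMA-node table (an assumption for illustration only).
# Verify the real mapping on your own host before relying on it.
SLOT_TO_NUMA_NODE = {1: 0, 2: 1}


def numa_affinity_setting(gpu_slot, slot_map=None):
    """Return the (key, value) advanced-setting pair that would keep a VM
    on the NUMA node local to the PCIe slot holding its GPU."""
    if slot_map is None:
        slot_map = SLOT_TO_NUMA_NODE
    if gpu_slot not in slot_map:
        raise KeyError(f"no NUMA node known for PCIe slot {gpu_slot}")
    return ("numa.nodeAffinity", str(slot_map[gpu_slot]))


if __name__ == "__main__":
    # Print the setting for a VM whose GPU sits in each known slot.
    for slot in sorted(SLOT_TO_NUMA_NODE):
        key, value = numa_affinity_setting(slot)
        print(f'slot {slot}: {key} = "{value}"')
```

The resulting key/value pair would go into the VM's advanced configuration parameters. This only covers the node selection, not whether the pinning actually helps, which is what I am trying to find out.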

Has anyone measured the performance gain from pinning CPU and GPU on the same PCIe bus in a VDI-type environment?