I am wondering what the Max Payload Size of the MWr and MRd Transaction Layer Packets is when running cudaMemcpy() between two GTX 1080 GPUs with peer-to-peer access enabled through Unified Virtual Addressing.
This is the type of low-level implementation-specific detail that is unlikely to be publicly documented anywhere. I have never seen any NVIDIA-provided documentation that spoke to such details.
If it is important to know this information for your use case, I would suggest hooking up a logic analyzer to examine PCIe traffic. That way you will know for sure. I would be curious to know for what purpose one would need this information, if you are allowed to share that.
If I remember correctly (that’s a big if!), someone in these forums who was looking into communication between an FPGA and a GPU (I don’t recall which type) observed the GPU using a packet size of 128 bytes. Again: I may be misremembering the number, and I did not look into this myself.
Yes, I think it’s 128 bytes max payload.
It’s not user configurable or controllable.
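Short of a logic analyzer, one partial cross-check is the limit the PCIe link itself was configured with. On Linux, `lspci -vv` prints it as MaxPayload and MaxReadReq on the DevCtl line. Both come from the Device Control register of the PCI Express capability, where bits 7:5 encode Max_Payload_Size and bits 14:12 encode Max_Read_Request_Size, each as 128 bytes shifted left by the field value. Below is a minimal sketch of that decoding, assuming you already have the raw 16-bit register value; the function names and the sample value are made up for illustration. Note that this is only the upper bound the device is allowed to use, not proof of the packet size the GPU actually emits.

```python
def decode_size_field(code):
    """Convert a 3-bit PCIe size encoding to bytes (0 -> 128, ..., 5 -> 4096)."""
    if not 0 <= code <= 5:
        # Encodings 6 and 7 are reserved in the PCIe specification.
        raise ValueError(f"reserved size encoding: {code}")
    return 128 << code


def decode_device_control(devctl):
    """Extract payload and read-request limits from a raw Device Control value."""
    mps_code = (devctl >> 5) & 0x7    # bits 7:5, Max_Payload_Size
    mrrs_code = (devctl >> 12) & 0x7  # bits 14:12, Max_Read_Request_Size
    return {
        "max_payload_bytes": decode_size_field(mps_code),
        "max_read_request_bytes": decode_size_field(mrrs_code),
    }


if __name__ == "__main__":
    # Hypothetical register value, not read from real hardware.
    print(decode_device_control(0x2810))
```

Keep in mind that MRd requests carry no payload themselves: Max_Read_Request_Size limits how much a single read may ask for, while the completions that return the data are bounded by Max_Payload_Size.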