When is PCI-E a bottleneck in ML?

I am currently using my two GPUs - Titan RTX cards (both running at 100% utilization with the full 24 GB of memory in use) - for machine learning. I want to upgrade my motherboard so I can use 3-4 GPUs. There are a few options here: buy a motherboard with additional PCI-E 3.0/4.0 slots in a configuration like x16/x8/x16/x8, x16/x16/x16, or x16/x8/x8, etc. The obvious part is that more available lanes on the motherboard, plus the better CPU to drive them, cost more. So I want to check whether I really need those x16 slots or whether x8 is enough (let's assume these are PCI-E 3.0). Is there any way to check the current PCI-E data transfer rate of my GPUs?

I know the theoretical bandwidth limits of PCI-E, but I would like to know to what extent I am actually using them during ML training.
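For what it's worth, the closest I've found so far is `nvidia-smi`; as far as I understand, `dmon -s t` reports PCIe RX/TX throughput per GPU, and the query interface can show the currently negotiated link generation and width (I'm not certain how accurate the throughput counters are, so treat this as a starting point rather than a definitive measurement):

```shell
# Sample per-GPU PCIe throughput (rxpci/txpci columns) once per second.
# Run this while a training job is active to see actual bus usage.
nvidia-smi dmon -s t

# Show the negotiated PCIe link generation/width vs. the maximum supported.
# A card in an x8 slot (or one that has downshifted) will show width 8 here.
nvidia-smi --query-gpu=name,pcie.link.gen.current,pcie.link.gen.max,pcie.link.width.current,pcie.link.width.max --format=csv
```

Comparing the observed rxpci/txpci peaks against the theoretical limit (about 15.75 GB/s for PCI-E 3.0 x16, half that for x8) should indicate whether x8 would become a bottleneck.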