Why does the A100 have a connection between its two L2 partitions? Does L1-to-L2 latency differ between them?

I noticed that there is a connection between the two L2 partitions on the A100. Why is it there? Is it 64 B wide? If so, will the SMs on the left see much higher latency when they access the right L2 partition? Does Path A always have higher latency than Path B? And how should I program for this in CUDA?