There is no RX counter structure in the PCC framework.
The PCC algorithm handles events on both the TX and RX sides:
On the TX side, there are TX events; each TX event carries a TX counter that records how much data has been sent.
On the RX side, events are raised on receipt of RTT, ACK, NACK, and CNP packets, and each event carries information specific to that packet type. There are no RX events indicating how much data has been received.
I can understand why RX counters are not required for PCC.
Now I am running some NCCL applications, and I wonder whether PCC runs per flow, per QP, or perhaps per NIC.
For example, when NCCL is running and I run /opt/mellanox/iproute2/sbin/rdma resource show qp, it shows the QPs below (lqpn 2580-2587 are connected to node1, lqpn 2588-2595 are connected to node2):
link mlx5_0/1 lqpn 2580 rqpn 2738 type RC state RTS rq-psn 11602 sq-psn 11864397 path-mig-state MIGRATED pdn 368 pid 2786002 comm python
link mlx5_0/1 lqpn 2581 rqpn 2743 type RC state RTS rq-psn 11602 sq-psn 11864556 path-mig-state MIGRATED pdn 368 pid 2786002 comm python
link mlx5_0/1 lqpn 2582 rqpn 2740 type RC state RTS rq-psn 11865315 sq-psn 11603 path-mig-state MIGRATED pdn 369 pid 2786003 comm python
link mlx5_0/1 lqpn 2583 rqpn 2741 type RC state RTS rq-psn 11865477 sq-psn 11603 path-mig-state MIGRATED pdn 369 pid 2786003 comm python
link mlx5_0/1 lqpn 2584 rqpn 2745 type RC state RTS rq-psn 0 sq-psn 0 path-mig-state MIGRATED pdn 369 pid 2786003 comm python
link mlx5_0/1 lqpn 2585 rqpn 2747 type RC state RTS rq-psn 0 sq-psn 0 path-mig-state MIGRATED pdn 369 pid 2786003 comm python
link mlx5_0/1 lqpn 2586 rqpn 2744 type RC state RTS rq-psn 0 sq-psn 0 path-mig-state MIGRATED pdn 369 pid 2786003 comm python
link mlx5_0/1 lqpn 2587 rqpn 2746 type RC state RTS rq-psn 0 sq-psn 0 path-mig-state MIGRATED pdn 369 pid 2786003 comm python
link mlx5_0/1 lqpn 2588 rqpn 2909 type RC state RTS rq-psn 1224 sq-psn 622592 path-mig-state MIGRATED pdn 369 pid 2786118 comm python
link mlx5_0/1 lqpn 2589 rqpn 2912 type RC state RTS rq-psn 1224 sq-psn 621728 path-mig-state MIGRATED pdn 369 pid 2786118 comm python
link mlx5_0/1 lqpn 2590 rqpn 2906 type RC state RTS rq-psn 622592 sq-psn 1216 path-mig-state MIGRATED pdn 369 pid 2786118 comm python
link mlx5_0/1 lqpn 2591 rqpn 2911 type RC state RTS rq-psn 1224 sq-psn 622592 path-mig-state MIGRATED pdn 368 pid 2786119 comm python
link mlx5_0/1 lqpn 2592 rqpn 2914 type RC state RTS rq-psn 1224 sq-psn 621728 path-mig-state MIGRATED pdn 368 pid 2786119 comm python
link mlx5_0/1 lqpn 2593 rqpn 2907 type RC state RTS rq-psn 621728 sq-psn 1216 path-mig-state MIGRATED pdn 369 pid 2786118 comm python
link mlx5_0/1 lqpn 2594 rqpn 2910 type RC state RTS rq-psn 622592 sq-psn 1216 path-mig-state MIGRATED pdn 368 pid 2786119 comm python
link mlx5_0/1 lqpn 2595 rqpn 2913 type RC state RTS rq-psn 621728 sq-psn 1216 path-mig-state MIGRATED pdn 368 pid 2786119 comm python
link mlx5_0/1 lqpn 1 type GSI state RTS sq-psn 58 comm [ib_core]
link mlx5_1/1 lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core]
How many different flowtag values should there be? I’m asking because I can see only one flowtag with fwtrace, but I would expect at least two, since traffic is being sent to two different nodes.
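As a rough cross-check on the listing above, the QPs can be grouped by peer to see how many RC QPs exist per destination and how many have actually carried traffic. The sketch below is only an illustration: the lqpn-to-node ranges come from the mapping stated above, `PEER_RANGES`, `peer_of`, and `summarize` are hypothetical helper names, and treating a non-zero PSN as "active" is an assumed heuristic. It says nothing about how PCC itself assigns flowtags.

```python
from collections import defaultdict

# Shortened sample in the format printed by `rdma resource show qp`.
SAMPLE = """\
link mlx5_0/1 lqpn 2580 rqpn 2738 type RC state RTS rq-psn 11602 sq-psn 11864397 path-mig-state MIGRATED pdn 368 pid 2786002 comm python
link mlx5_0/1 lqpn 2584 rqpn 2745 type RC state RTS rq-psn 0 sq-psn 0 path-mig-state MIGRATED pdn 369 pid 2786003 comm python
link mlx5_0/1 lqpn 2588 rqpn 2909 type RC state RTS rq-psn 1224 sq-psn 622592 path-mig-state MIGRATED pdn 369 pid 2786118 comm python
link mlx5_0/1 lqpn 1 type GSI state RTS sq-psn 58 comm [ib_core]
"""

# Assumption: lqpn ranges per peer, taken from the description in the post.
PEER_RANGES = {"node1": range(2580, 2588), "node2": range(2588, 2596)}


def parse_qp_line(line):
    """Turn one output line into a dict; the line is a flat key/value list."""
    tokens = line.split()
    return dict(zip(tokens[0::2], tokens[1::2]))


def peer_of(lqpn):
    """Return the peer name for a local QP number, or None if unknown."""
    for peer, lqpn_range in PEER_RANGES.items():
        if lqpn in lqpn_range:
            return peer
    return None


def summarize(text):
    """Count RC QPs per peer, and how many have a non-zero PSN."""
    summary = defaultdict(lambda: {"total": 0, "active": 0})
    for line in text.splitlines():
        if not line.strip():
            continue
        qp = parse_qp_line(line)
        if qp.get("type") != "RC":
            continue  # skip GSI and other non-RC QPs
        peer = peer_of(int(qp["lqpn"]))
        if peer is None:
            continue
        summary[peer]["total"] += 1
        # Heuristic (assumption): a non-zero PSN means the QP moved data.
        if int(qp.get("sq-psn", 0)) or int(qp.get("rq-psn", 0)):
            summary[peer]["active"] += 1
    return dict(summary)


if __name__ == "__main__":
    for peer, counts in summarize(SAMPLE).items():
        print(peer, counts)
```

Feeding the full listing in place of `SAMPLE` gives a per-node count of QPs to compare against the number of distinct flowtags that fwtrace reports.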