For example, when the framework layer calls ncclAllReduce, the data passed in is a message, while the underlying layer calls ncclIbIsend to send the data, and that send is associated with a QP (queue pair). The intermediate layer also involves the concepts of channel and chunk. So what is the relationship between message, channel, chunk, and QP, and how does each of them affect communication performance?
If anything in my understanding is wrong, please correct me. Thanks.