Because our RoCE network has long latency, we want to change the ACK timeout value of the QP. The CX5 seems to have a default timeout value of 19, but we see it resend packets at roughly 1 ms intervals before it receives an ACK, so the effective timeout appears to be much lower than 19.
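For reference, the timeout field is an exponent rather than a time: as far as we understand the InfiniBand specification, the nominal Local ACK Timeout is 4.096 us * 2^timeout, and a value of 0 disables the timer. A minimal sketch of that conversion follows; the helper name ack_timeout_usec is ours and is not part of perftest or libibverbs.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: nominal Local ACK Timeout in microseconds for the
 * 5-bit value stored in ibv_qp_attr.timeout, i.e. 4.096 us * 2^timeout.
 * A value of 0 means the timer is disabled, reported here as INFINITY. */
static double ack_timeout_usec(uint8_t timeout)
{
    timeout &= 0x1f;            /* only the low 5 bits are meaningful */
    if (timeout == 0)
        return INFINITY;
    return 4.096 * (double)(1ULL << timeout);
}
```

By this formula a value of 19 is about 2.1 s and 25 is about 137 s, while a 1 ms interval would correspond to a value near 8. That is why the 1 ms resends we observe do not look like they come from a timeout of 19 or 25.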
We ran the following experiments:
1. At start, we ran ib_send_bw with -u timeout=25, but the CX5 still resent packets after about 1 ms without an ACK.
2. We then forced attr->timeout to 25 in the function shown below, but the CX5 still resent packets after about 1 ms without an ACK.
Is there a bug in our code? How can we verify that the ACK timeout was modified correctly?
static int ctx_modify_qp_to_rts(struct ibv_qp *qp,
                                struct ibv_qp_attr *attr,
                                struct perftest_parameters *user_param,
                                struct pingpong_dest *dest,
                                struct pingpong_dest *my_dest)
{
    int flags = IBV_QP_STATE;

    attr->qp_state = IBV_QPS_RTS;

    if (user_param->connection_type != RawEth) {
        flags |= IBV_QP_SQ_PSN;
        attr->sq_psn = my_dest->psn;

        if (user_param->connection_type == DC ||
            user_param->connection_type == RC ||
            user_param->connection_type == XRC) {
            attr->timeout = user_param->qp_timeout;
            attr->retry_cnt = 7;
            attr->rnr_retry = 7;
            attr->max_rd_atomic = dest->out_reads;
            flags |= (IBV_QP_TIMEOUT | IBV_QP_RETRY_CNT | IBV_QP_RNR_RETRY | IBV_QP_MAX_QP_RD_ATOMIC);
        }
    }

#ifdef HAVE_PACKET_PACING
    if (user_param->rate_limit_type == PP_RATE_LIMIT) {
        attr->rate_limit = user_param->rate_limit;
        flags |= IBV_QP_RATE_LIMIT;
    }
#endif

    return ibv_modify_qp(qp, attr, flags);
}