It seems that the driver passed the firmware a buffer length parameter that is too small to establish the QP context.
This can happen when the driver and firmware versions are incompatible. To verify this, please check the driver's Release Notes, available at: Linux InfiniBand Drivers.
Improper parameter handling can also lead to this problem.
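For the version check, it can help to collect the firmware version of every HCA and compare it with what the Release Notes list for your driver. Below is a minimal Python sketch, assuming the usual `hca_id:` and `fw_ver:` lines printed by `ibv_devinfo`. The helper names, the sample text, and the second firmware version in it are made up for illustration.

```python
import re


def firmware_versions(devinfo_text):
    """Map each HCA name to its fw_ver string from ibv_devinfo-style text."""
    versions = {}
    current = None
    for line in devinfo_text.splitlines():
        hca = re.match(r"\s*hca_id:\s*(\S+)", line)
        if hca:
            current = hca.group(1)
            continue
        fw = re.match(r"\s*fw_ver:\s*(\S+)", line)
        if fw and current is not None:
            versions[current] = fw.group(1)
    return versions


def mismatched(versions, expected):
    """Return the HCAs whose firmware differs from the expected version."""
    return sorted(hca for hca, fw in versions.items() if fw != expected)


# Illustrative sample only; on a real node, paste the ibv_devinfo output here.
SAMPLE = """\
hca_id: mlx5_0
        transport:                      InfiniBand (0)
        fw_ver:                         16.32.1010
hca_id: mlx5_1
        transport:                      InfiniBand (0)
        fw_ver:                         16.31.1014
"""

if __name__ == "__main__":
    found = firmware_versions(SAMPLE)
    for hca, fw in found.items():
        print(hca, fw)
    print("needs attention:", mismatched(found, "16.32.1010"))
```

The expected version passed to `mismatched` should come from the Release Notes for the installed driver, not from this example.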
Thanks @chenh1, we did indeed have a firmware issue on some of our nodes. We repeated the test on two properly configured Dell PowerEdge systems (driver 5.6-2.0.9 and firmware 16.32.1010), and now dmesg only shows:
[4648076.498624] infiniband mlx5_0: create_qp:3192:(pid 4166615): Create QP type 4098 failed
So there is still a problem when creating the QP for the DCT. You mentioned improper parameter handling: does this refer to device/driver configuration or to user code? I ask because the code was taken from NVIDIA documentation.
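For anyone who wants to look up the reported QP type in the driver headers, here is a minimal Python sketch that splits the dmesg line above into its fields and prints the QP type in hexadecimal. The regular expression is an assumption based on this single line, not a documented log format, and the function name is hypothetical.

```python
import re

# Pattern inferred from the one log line in this thread (an assumption).
LOG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+infiniband\s+(?P<dev>\S+):\s+"
    r"(?P<func>\w+):(?P<line>\d+):\(pid\s+(?P<pid>\d+)\):\s+(?P<msg>.*)"
)


def parse_create_qp_failure(log_line):
    """Return the fields of a 'Create QP type N failed' line, or None."""
    m = LOG_RE.match(log_line.strip())
    if m is None:
        return None
    qp = re.search(r"QP type (\d+)", m.group("msg"))
    if qp is None:
        return None
    qp_type = int(qp.group(1))
    return {
        "device": m.group("dev"),
        "function": m.group("func"),
        "source_line": int(m.group("line")),
        "pid": int(m.group("pid")),
        "qp_type": qp_type,
        "qp_type_hex": hex(qp_type),
    }


if __name__ == "__main__":
    line = ("[4648076.498624] infiniband mlx5_0: create_qp:3192:"
            "(pid 4166615): Create QP type 4098 failed")
    print(parse_create_qp_failure(line))
```

Which driver-specific QP type the hexadecimal value corresponds to depends on the headers of the installed driver version, so it should be checked against those sources.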