"Failed to install srptools DEB" during installation of MLNX_OFED_LINUX-5.1-0.6.6.0-ubuntu18.04-x86_64

Hello Mellanox community,

We bought MT4119 ConnectX-5 cards and are trying to reinstall the latest version of the MLNX_OFED driver on our Ubuntu 18.04 x86_64 servers. It works on three servers, but on the last one the installation fails because the “srptools DEB” installation fails.

Here is the message on the shell:

"clement@aotearoa:/opt/MLNX_OFED_LINUX-5.1-0.6.6.0-ubuntu18.04-x86_64$ sudo ./mlnxofedinstall --enable-opensm

Logs dir: /tmp/MLNX_OFED_LINUX.9973.logs

General log file: /tmp/MLNX_OFED_LINUX.9973.logs/general.log

Below is the list of MLNX_OFED_LINUX packages that you have chosen

(some may have been added by the installer due to package dependencies):

ofed-scripts

[…]

srptools

mlnx-ethtool

mlnx-iproute2

neohost-backend

neohost-sdk

This program will install the MLNX_OFED_LINUX package on your machine.

Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.

Those packages are removed due to conflicts with MLNX_OFED_LINUX, do not reinstall them.

Do you want to continue?[y/N]:y

Checking SW Requirements…

Removing old packages…

Uninstalling the previous version of MLNX_OFED_LINUX

Installing new packages

Installing ofed-scripts-5.1…

[…]

Installing dpcp-1.0.0…

Installing srptools-51mlnx1…

Failed to install srptools DEB

Collecting debug info…

See /tmp/MLNX_OFED_LINUX.9973.logs/srptools.debinstall.log"

and here is the message inside “/tmp/MLNX_OFED_LINUX.9973.logs/srptools.debinstall.log”:

/usr/bin/dpkg -i --force-confmiss /opt/MLNX_OFED_LINUX-5.1-0.6.6.0-ubuntu18.04-x86_64/DEBS/srptools_51mlnx1-1.51066_amd64.deb

Selecting previously unselected package srptools.

(Reading database … 277605 files and directories currently installed.)

Preparing to unpack …/srptools_51mlnx1-1.51066_amd64.deb …

Unpacking srptools (51mlnx1-1.51066) …

Setting up srptools (51mlnx1-1.51066) …

Configuration file ‘/etc/default/srptools’, does not exist on system.

Installing new config file as you requested.

Configuration file ‘/etc/init.d/srptools’, does not exist on system.

Installing new config file as you requested.

Configuration file ‘/etc/rdma/modules/srp_daemon.conf’, does not exist on system.

Installing new config file as you requested.

Configuration file ‘/etc/srp_daemon.conf’, does not exist on system.

Installing new config file as you requested.

Created symlink /etc/systemd/system/remote-fs-pre.target.wants/srp_daemon.service → /lib/systemd/system/srp_daemon.service.

A dependency job for srp_daemon.service failed. See ‘journalctl -xe’ for details.

A dependency job for srp_daemon.service failed. See ‘journalctl -xe’ for details.

invoke-rc.d: initscript srptools, action “start” failed.

● srp_daemon.service - Daemon that discovers and logs in to SRP target systems

Loaded: loaded (/lib/systemd/system/srp_daemon.service; enabled; vendor preset: enabled)

Active: inactive (dead)

Docs: man:srp_daemon

file:/etc/srp_daemon.conf

sept. 14 18:14:44 aotearoa systemd[1]: Starting Daemon that discovers and logs in to SRP target systems…

sept. 14 18:14:44 aotearoa systemd[1]: Started Daemon that discovers and logs in to SRP target systems.

[…]

dpkg: error processing package srptools (--install):

installed srptools package post-installation script subprocess returned error exit status 1

Processing triggers for systemd (237-3ubuntu10.42) …

Processing triggers for ureadahead (0.100.0-21) …

Processing triggers for man-db (2.8.3-2ubuntu0.1) …

Errors were encountered while processing:

srptools

---------------- START OF DEBUG INFO -------------------

Install command: ./mlnxofedinstall --enable-opensm

Vars dump:

  • ofedlogs: /tmp/MLNX_OFED_LINUX.9973.logs

  • MLNX_OFED_LINUX_VERSION: 5.1-0.6.6.0

  • MLNX_OFED_ARCH: x86_64

  • MLNX_OFED_DISTRO: ubuntu18.04

  • distro: ubuntu18.04

  • arch: x86_64

  • kernel: 5.4.0-42-generic

  • config: /tmp/ofed.conf

  • update_firmware: 1"

I checked that all the required packages (those listed in the release notes) were installed, but running apt-get on the dpkg package gives me this error message:

"sudo apt-get install dpkg

[…]

Setting up srptools (51mlnx1-1.51066) …

A dependency job for srp_daemon.service failed. See ‘journalctl -xe’ for details.

A dependency job for srp_daemon.service failed. See ‘journalctl -xe’ for details.

invoke-rc.d: initscript srptools, action “start” failed.

● srp_daemon.service - Daemon that discovers and logs in to SRP target systems

Loaded: loaded (/lib/systemd/system/srp_daemon.service; enabled; vendor preset: enabled)

Active: inactive (dead)

Docs: man:srp_daemon

file:/etc/srp_daemon.conf

sept. 14 18:17:47 aotearoa systemd[1]: Started Daemon that discovers and logs in to SRP target systems.

sept. 14 18:19:35 aotearoa systemd[1]: Stopped Daemon that discovers and logs in to SRP target systems.

[…]

dpkg: error processing package srptools (--configure):

installed srptools package post-installation script subprocess returned error exit status 1

Errors were encountered while processing:

srptools

E: Sub-process /usr/bin/dpkg returned an error code (1)"

I understand that the problem lies in the srptools package post-installation script, but it seems that installing the MLNX_OFED 5.1 driver causes this error, and I can’t resolve it.

Do you have any idea how to solve this problem?

Thank you for your help.

Clément
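When a log like the ones above gets long, it can help to pull out just the package names that dpkg lists under “Errors were encountered while processing:”. The following is a minimal sketch, not part of MLNX_OFED; the function name failed_pkgs is made up for illustration, and it assumes the log keeps dpkg's usual layout of one package name per line after that heading.

```shell
#!/bin/sh
# failed_pkgs (hypothetical helper): read a dpkg/apt log on stdin and print
# the package names listed after "Errors were encountered while processing:".
failed_pkgs() {
    awk '
        /^Errors were encountered while processing:/ { in_list = 1; next }
        # Inside the list, a line holding a single token is a package name.
        in_list && NF == 1 { print $1; next }
        # Any longer line (for example "E: Sub-process ...") ends the list.
        in_list && NF > 1 { in_list = 0 }
    '
}

# Example on the log from this thread:
#   failed_pkgs < /tmp/MLNX_OFED_LINUX.9973.logs/srptools.debinstall.log
```

Since the log reports “A dependency job for srp_daemon.service failed”, the next step is probably to look at the unit itself, for example with “systemctl list-dependencies srp_daemon.service” and “journalctl -u srp_daemon.service”, to see which dependency did not start.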

Hi Clément,

The srptools package from the MLNX_OFED driver is a different version from the one available in the Ubuntu repos, so there could be some conflicts.

If you don’t need this package (that is, you don’t need to discover and use SCSI devices over RDMA), you can try to install the driver without srptools:

./mlnxofedinstall --force --without-srptools

If there aren’t any active SRP connections to remote targets, it’s safe to omit this package entirely.

If you do need this package, you can try to install the srptools package manually (via dpkg -i) - it can be found in the ./DEBS/UPSTREAM_LIBS/ directory of the MLNX_OFED installation directory.

Regards,

Chen
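One rough way to judge whether a host uses SRP at all is to check whether the SRP initiator kernel module, ib_srp, is loaded. The sketch below assumes that a loaded ib_srp module is a reasonable proxy for SRP use; that is only a hint, since the module can be loaded at boot without any active target logins. The function name srp_module_loaded is hypothetical.

```shell
#!/bin/sh
# srp_module_loaded (hypothetical helper): read lsmod-style text on stdin and
# return 0 if a module named ib_srp is listed, 1 otherwise.
srp_module_loaded() {
    awk '$1 == "ib_srp" { found = 1 } END { exit !found }'
}

# Example on a live system:
#   if lsmod | srp_module_loaded; then
#       echo "ib_srp is loaded; SRP may be in use"
#   else
#       echo "ib_srp is not loaded"
#   fi
```

If the module is not loaded and no storage is mounted over SRP, installing with --without-srptools as suggested above is likely the simpler path.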

Hello Chen,

Thanks a lot for your quick answer!

I have just installed the MLNX driver like this and it worked:

sudo ./mlnxofedinstall --force --without-srptools --enable-opensm

Nevertheless, I don’t understand whether I need the srptools package. I use MLNX_OFED with ConnectX-5 cards and a simple InfiniBand cable in order to run simulations (CFD, OpenFOAM) on two servers (instead of one). Does this task require srptools?

I ask because, after starting the Mellanox services with:

  • opensm &
  • /etc/init.d/opensmd start
  • mst start

running “#ibhosts” gives me this message:

  • ibwarn: [4495] mad_rpc_open_port: can’t open UMAD port ((null):0) /var/tmp/rdma-core/rdma-core-51mlnx1/libibnetdisc/ibnetdisc.c:802; can’t open MAD port ((null):0) /usr/sbin/ibnetdiscover: iberror: failed: discover failed

Is it related to srptools and the use of SCSI devices over RDMA?

In case I really do need the srptools package, should I use this command?

sudo dpkg -i /opt/MLNX_OFED_LINUX-5.1-0.6.6.0-ubuntu18.04-x86_64/DEBS/srptools_51mlnx1-1.51066_amd64.deb

Best regards,

Clément

I am seeing the same on x86_64/amd64, but not on aarch64. I worked around it, but I am also trying to understand why it fails.

root@n054:~/Downloads/Mellanox/MLNX_OFED_LINUX/MLNX_OFED_LINUX-5.1-0.6.6.0-ubuntu18.04-x86_64# diff -auT mlnxofedinstall.ORIG mlnxofedinstall

--- mlnxofedinstall.ORIG 2020-07-27 20:44:16.000000000 +0200

+++ mlnxofedinstall 2020-09-16 16:57:50.147348383 +0200

@@ -708,7 +708,7 @@

“libvma”, “libvma-utils”, “libvma-dev”,

“dpcp”,

“sockperf”,

- “srptools”,
+ “srptools”,

“mlnx-ethtool”,

“mlnx-iproute2”,

“libsdp1”, “libsdp-dev”,

On aarch64, all is good.

root@n009:~# dpkg -l | grep srptools

ii srptools 50mlnx1-1.50218 arm64 Tools for Infiniband attached storage (SRP)

root@n009:~# uname -ar

Linux n009 4.15.0-117-generic #118-Ubuntu SMP Fri Sep 4 20:05:59 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux

root@n009:~# dpkg -l | grep srptools

ii srptools 50mlnx1-1.50218 arm64 Tools for Infiniband attached storage (SRP)

root@n009:~# cat /etc/srp_daemon.conf

## This is an example rules configuration file for srp_daemon.

#This is a comment

## disallow the following dgid

#d dgid=fe800000000000000002c90200402bd5

## allow target with the following ioc_guid

#a ioc_guid=00a0b80200402bd7

## allow target with the following pkey

#a pkey=ffff

## allow target with the following id_ext and ioc_guid

#a id_ext=200500A0B81146A1,ioc_guid=00a0b80200402bef

## disallow all the rest

#d

## Here is another example:

## Allow all targets and set queue size to 128.

a queue_size=128,max_cmd_per_lun=128

root@n009:~# systemctl | grep srp_d

srp_daemon.service loaded active exited Daemon that discovers and logs in to SRP target systems

srp_daemon_port@mlx5_0:1.service loaded active running SRP daemon that monitors port mlx5_0:1

srp_daemon_port@mlx5_1:1.service loaded active running SRP daemon that monitors port mlx5_1:1

system-srp_daemon_port.slice loaded active active system-srp_daemon_port.slice

Thoughts?

FYI: after running

dpkg -i ./DEBS/srptools_51mlnx1-1.51066_amd64.deb

it complains. Then, after a reboot, when I run “apt update; apt -y upgrade; apt autoremove”, I get

W: APT had planned for dpkg to do more than it reported back (22 vs 26).

Affected packages: srptools:amd64

Then

root@n054:~# dpkg -a --configure

root@n054:~#

And subsequent “apt update” runs no longer complain.

root@n054:~# systemctl | grep srp

srp_daemon.service loaded active exited Daemon that discovers and logs in to SRP target systems

srp_daemon_port@mlx5_0:1.service loaded active running SRP daemon that monitors port mlx5_0:1

srp_daemon_port@mlx5_1:1.service loaded active running SRP daemon that monitors port mlx5_1:1

srp_daemon_port@mlx5_2:1.service loaded active running SRP daemon that monitors port mlx5_2:1

srp_daemon_port@mlx5_3:1.service loaded active running SRP daemon that monitors port mlx5_3:1

system-srp_daemon_port.slice loaded active active system-srp_daemon_port.slice

root@n054:~#

Hope that’s useful for someone.
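To see whether a package such as srptools is still waiting for its configure step before running “dpkg -a --configure”, one option is to filter the output of “dpkg -l” for the unpacked (U) and half-configured (F) states. This is a small sketch under that assumption; the function name pending_configure is made up for illustration.

```shell
#!/bin/sh
# pending_configure (hypothetical helper): read "dpkg -l" output on stdin and
# print the names of packages whose status letter is U (unpacked) or
# F (half-configured), i.e. packages still waiting for "dpkg --configure".
pending_configure() {
    # Column 1 is the state field: desired action, then current status.
    awk '$1 ~ /^[a-z][UF]/ { print $2 }'
}

# Example on a live system:
#   dpkg -l | pending_configure
#   sudo dpkg --configure -a
```

An empty result after the configure run would match what was observed above, where later apt runs stopped complaining.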

Hello,

I finally reinstalled MLNX_OFED 5.1, first without the srptools package, like this:

sudo ./mlnxofedinstall --force --without-srptools --enable-opensm

Then I installed the srptools package with dpkg:

sudo dpkg -i srptools_51mlnx1-1.51066_amd64.deb

And finally mlnxofed works!

My other problem with the “#ibhosts” command was solved by using sudo.

Nevertheless, it seems that Linux kernel 5.4.0-47 triggers the srptools problem, because before version 5.4.0-42 there was no problem with this package.

Thanks for your advice!
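For anyone repeating this on several servers, the two steps above can be wrapped in a small script. The following is only a sketch: the function names run and install_ofed_then_srptools and the DRY_RUN variable are hypothetical, while the commands and the DEBS path are the ones quoted in this thread. With DRY_RUN=1 it prints the commands instead of running them.

```shell
#!/bin/sh
# run (hypothetical helper): print the command when DRY_RUN=1, else execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# install_ofed_then_srptools (hypothetical helper): install MLNX_OFED without
# srptools first, then add the srptools DEB by hand.
# $1 is the unpacked MLNX_OFED directory.
install_ofed_then_srptools() {
    (
        cd "$1" || exit 1
        run sudo ./mlnxofedinstall --force --without-srptools --enable-opensm &&
        run sudo dpkg -i ./DEBS/srptools_51mlnx1-1.51066_amd64.deb
    )
}

# Preview the commands without changing the system:
#   DRY_RUN=1 install_ofed_then_srptools /opt/MLNX_OFED_LINUX-5.1-0.6.6.0-ubuntu18.04-x86_64
```

The subshell keeps the directory change local, and the second command only runs if the installer succeeds.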