Hi all, I'm back with the same problem. The 18.09 and 18.10 container images do not seem to work with NVIDIA driver 384.81: I am getting the same incompatibility errors again. See below:
nvidia-docker run -it --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/tensorflow:18.09-py3
================
== TensorFlow ==
================
NVIDIA Release 18.09 (build 687558)
Container image Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved.
Copyright 2017 The TensorFlow Authors. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying project or file.
ERROR: This container was built for NVIDIA Driver Release 410 or later, but
version 384.81 was detected and compatibility mode is UNAVAILABLE.
[[CUDA Driver UNAVAILABLE (cuDevicePrimaryCtxRetain() returned 2)]]
nvidia-docker run -it --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/tensorflow:18.10-py
================
== TensorFlow ==
================
NVIDIA Release 18.10 (build 785222)
Container image Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved.
Copyright 2017 The TensorFlow Authors. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying project or file.
ERROR: This container was built for NVIDIA Driver Release 410 or later, but
version 384.81 was detected and compatibility mode is UNAVAILABLE.
[[CUDA Driver UNAVAILABLE (cuDevicePrimaryCtxRetain() returned 2)]]
NOTE: Detected MOFED driver 4.3-1.0.1; attempting to automatically upgrade.
(Reading database ... 16727 files and directories currently installed.)
Preparing to unpack .../ibverbs-utils_41mlnx1-OFED.4.3.0.1.8.43101_amd64.deb ...
Unpacking ibverbs-utils (41mlnx1-OFED.4.3.0.1.8.43101) over (1.2.1mlnx1-OFED.4.0.0.1.3.40101) ...
Preparing to unpack .../libibverbs-dev_41mlnx1-OFED.4.3.0.1.8.43101_amd64.deb ...
Unpacking libibverbs-dev (41mlnx1-OFED.4.3.0.1.8.43101) over (1.2.1mlnx1-OFED.4.0.0.1.3.40101) ...
Preparing to unpack .../libibverbs1_41mlnx1-OFED.4.3.0.1.8.43101_amd64.deb ...
Unpacking libibverbs1 (41mlnx1-OFED.4.3.0.1.8.43101) over (1.2.1mlnx1-OFED.4.0.0.1.3.40101) ...
Preparing to unpack .../libmlx5-1_41mlnx1-OFED.4.3.0.2.1.43101_amd64.deb ...
Unpacking libmlx5-1 (41mlnx1-OFED.4.3.0.2.1.43101) over (1.2.1mlnx1-OFED.4.0.0.1.1.40101) ...
Setting up libibverbs1 (41mlnx1-OFED.4.3.0.1.8.43101) ...
Setting up libmlx5-1 (41mlnx1-OFED.4.3.0.2.1.43101) ...
Setting up ibverbs-utils (41mlnx1-OFED.4.3.0.1.8.43101) ...
Setting up libibverbs-dev (41mlnx1-OFED.4.3.0.1.8.43101) ...
Processing triggers for libc-bin (2.23-0ubuntu10) ...
NOTE: MOFED driver was detected, but nv_peer_mem driver was not detected.
Multi-node communication performance may be reduced.
Any recommendations?
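
For reference, this is roughly the comparison the error message describes: the host driver's major version (384) against the minimum the container asks for (410). It is only a minimal sketch, not the container's actual startup check. The function name `driver_meets_minimum` is hypothetical, and the version strings are copied from the log above rather than read from the host.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the driver's major version is at least
# the required major version, fails otherwise.
driver_meets_minimum() {
    local detected="$1"            # full driver version, e.g. "384.81"
    local required="$2"            # required major version, e.g. "410"
    local major="${detected%%.*}"  # keep only the part before the first dot
    [ "$major" -ge "$required" ]
}

# Values taken from the error message in the log above.
# On a real host the detected version could instead be read with:
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader
if driver_meets_minimum "384.81" "410"; then
    echo "driver OK"
else
    echo "driver too old"
fi
```

With the values from my log this takes the "driver too old" branch, which matches what both containers report.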