Configuring multiple versions of TensorRT and TensorFlow on a shared HPC cluster; TF-TRT Warning: Cannot dlopen some TensorRT libraries

We use Bright Computing for provisioning nodes on RHEL 9 and have CUDA 11.7 and CUDA 11.8 available as modules, as well as cuDNN 8.5 for CUDA 11.7 and cuDNN 8.8 for CUDA 11.8. I also created a module for cutensor-cuda11.7.

We also have various modules for Python, e.g., mamba with Python 3.11 and Anaconda Python 3.9.10. TensorFlow 2.11.0 was installed via pip with --user.

The GPUs are NVIDIA RTX A6000. nvidia-smi reports: NVIDIA-SMI 520.61.05, Driver Version: 520.61.05, CUDA Version: 11.8.

What’s the reason for TF not finding the GPUs?

2023-03-30 11:54:34.772791: I tensorflow/core/platform/] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA

To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-03-30 11:54:35.566539: W tensorflow/compiler/xla/stream_executor/platform/default/] Could not load dynamic library ''; dlerror: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /cm/shared/apps/cutensor-cuda11.7/

2023-03-30 11:54:35.566613: W tensorflow/compiler/xla/stream_executor/platform/default/] Could not load dynamic library ''; dlerror: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /cm/shared/apps/cutensor-cuda11.7/

2023-03-30 11:54:35.566627: W tensorflow/compiler/tf2tensorrt/utils/] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

>>> print(tf.__version__)


2023-03-30 11:46:44.803644: I tensorflow/core/platform/] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA

To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

2023-03-30 11:46:45.605164: W tensorflow/compiler/xla/stream_executor/platform/default/] Could not load dynamic library ''; dlerror: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /cm/shared/apps/cutensor-cuda11.7/

2023-03-30 11:46:45.605449: W tensorflow/compiler/xla/stream_executor/platform/default/] Could not load dynamic library ''; dlerror: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /cm/shared/apps/cutensor-cuda11.7/

2023-03-30 11:46:45.605462: W tensorflow/compiler/tf2tensorrt/utils/] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

2023-03-30 11:46:47.410968: W tensorflow/compiler/xla/stream_executor/platform/default/] Could not load dynamic library ''; dlerror: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /cm/shared/apps/cutensor-cuda11.7/

2023-03-30 11:46:47.411008: W tensorflow/core/common_runtime/gpu/] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at for how to download and setup the required libraries for your platform.

Skipping registering GPU devices...
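Each "Could not load dynamic library" warning above prints the LD_LIBRARY_PATH that was searched. As a rough check, a small script can report which of the expected shared libraries are actually present in those directories after a given set of modules is loaded. This is only a sketch: the function name `find_shared_libs` is made up for illustration, the library names in the usage part are examples and not the names missing from the log above, and the dynamic loader also searches locations other than LD_LIBRARY_PATH, so a "not found" here is a hint rather than proof.

```python
import os
from pathlib import Path


def find_shared_libs(lib_names, search_path=None):
    """Map each library file name to the first directory on search_path
    that contains it, or to None if no directory does."""
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    # Skip empty entries produced by leading, trailing, or doubled separators.
    dirs = [Path(d) for d in search_path.split(os.pathsep) if d]
    found = {}
    for name in lib_names:
        found[name] = next((str(d) for d in dirs if (d / name).exists()), None)
    return found


if __name__ == "__main__":
    # Illustrative names only; substitute the libraries your own log reports.
    wanted = ["libcudart.so.11.0", "libcudnn.so.8"]
    for name, where in find_shared_libs(wanted).items():
        print(f"{name}: {where or 'not found on LD_LIBRARY_PATH'}")
```

Running it once per module combination (for example, with the CUDA 11.7 modules and then with the CUDA 11.8 modules) should show whether a module is failing to add its lib directory to the path.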

But at least with CUDA 11.8 the GPU is found:


2023-03-30 12:04:52.263547: I tensorflow/core/platform/] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.

To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.

2023-03-30 12:04:53.184678: W tensorflow/compiler/tf2tensorrt/utils/] TF-TRT Warning: Could not find TensorRT

2023-03-30 12:04:55.144931: I tensorflow/core/common_runtime/gpu/] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 46672 MB memory: -> device: 0, name: NVIDIA RTX A6000, pci bus id: 0000:c1:00.0, compute capability: 8.6

Epoch 1/10

2023-03-30 12:04:56.798351: I tensorflow/compiler/xla/stream_executor/cuda/] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.

2023-03-30 12:04:56.967906: I tensorflow/compiler/xla/service/] XLA service 0x1547d7dfd390 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:

2023-03-30 12:04:56.967960: I tensorflow/compiler/xla/service/] StreamExecutor device (0): NVIDIA RTX A6000, Compute Capability 8.6

2023-03-30 12:04:56.971514: I tensorflow/compiler/mlir/tensorflow/utils/] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.

2023-03-30 12:04:57.084710: I tensorflow/compiler/xla/stream_executor/cuda/] Loaded cuDNN version 8801

2023-03-30 12:04:57.094134: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.

Searched for CUDA in the following directories:





You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.

2023-03-30 12:04:57.094329: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/] libdevice is required by this HLO module but was not found at ./libdevice.10.bc

2023-03-30 12:04:57.094586: W tensorflow/core/framework/] OP_REQUIRES failed at : INTERNAL: libdevice not found at ./libdevice.10.bc

2023-03-30 12:04:57.094610: I tensorflow/core/common_runtime/] [/job:localhost/replica:0/task:0/device:GPU:0] (DEBUG INFO) Executor start aborting (this does not indicate an error and you can ignore this message): INTERNAL: libdevice not found at ./libdevice.10.bc

[[{{node StatefulPartitionedCall_2}}]]

2023-03-30 12:04:57.111179: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/] libdevice is required by this HLO module but was not found at ./libdevice.10.bc

2023-03-30 12:04:57.111354: W tensorflow/core/framework/] OP_REQUIRES failed at : INTERNAL: libdevice not found at ./libdevice.10.bc

2023-03-30 12:04:57.155357: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/] libdevice is required by this HLO module but was not found at ./libdevice.10.bc

2023-03-30 12:04:57.155587: W tensorflow/core/framework/] OP_REQUIRES failed at : INTERNAL: libdevice not found at ./libdevice.10.bc

2023-03-30 12:04:57.171499: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/] libdevice is required by this HLO module but was not found at ./libdevice.10.bc

2023-03-30 12:04:57.171675: W tensorflow/core/framework/] OP_REQUIRES failed at : INTERNAL: libdevice not found at ./libdevice.10.bc
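The log above itself suggests setting XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda when libdevice cannot be found. A minimal sketch of doing that from Python, before TensorFlow is imported, might look like the following. The helper names are hypothetical, and the use of CUDA_HOME and CUDA_PATH as candidate roots is an assumption about what the CUDA module exports on a given cluster; the nvvm/libdevice/libdevice.10.bc layout is taken from the warning text.

```python
import os
from pathlib import Path


def find_cuda_data_dir(candidates):
    """Return the first candidate root that contains
    nvvm/libdevice/libdevice.10.bc, or None if none does."""
    for root in candidates:
        if root and (Path(root) / "nvvm" / "libdevice" / "libdevice.10.bc").is_file():
            return str(root)
    return None


def set_xla_cuda_data_dir(candidates, env=os.environ):
    """Append --xla_gpu_cuda_data_dir to XLA_FLAGS in env when a usable
    CUDA root is found. Returns the chosen root, or None."""
    cuda_dir = find_cuda_data_dir(candidates)
    if cuda_dir is not None:
        flag = f"--xla_gpu_cuda_data_dir={cuda_dir}"
        # Keep any flags that were already set.
        env["XLA_FLAGS"] = f"{env.get('XLA_FLAGS', '')} {flag}".strip()
    return cuda_dir


if __name__ == "__main__":
    # Assumption: the loaded CUDA module exports one of these variables.
    roots = [os.environ.get("CUDA_HOME"), os.environ.get("CUDA_PATH")]
    print("CUDA data dir:", set_xla_cuda_data_dir(roots))
    # Import TensorFlow only after this point so XLA sees the flag.
```

If neither variable is set, the function returns None and leaves XLA_FLAGS untouched; in that case the CUDA install root would have to be passed in explicitly.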

Please check the links below, as they might answer your concerns.


Nothing there about multiple versions. Any other specific suggestions?

Hi @rk3199 ,

We are checking on this. Will update you on the same.


Hi @rk3199 ,
Did you try completely removing CUDA and reinstalling it?


No, as this is on a cluster and CUDA is loaded as a module.

Lower versions of Python, e.g., 3.7, do not generate this error/warning.

Hi @rk3199 ,
Can you please share the TRT version you are using?
Also, could you try an upgrade and let us know whether the issue is still there?


pip list | grep -i tensorrt
WARNING: Ignoring invalid distribution -ensorflow (/path/to/me/.local/lib/python3.9/site-packages)
tensorrt 8.6.1
tensorrt-bindings 8.6.1
tensorrt-dispatch 8.6.0
tensorrt-lean 8.6.0
tensorrt-libs 8.6.1

Upgrade what? We use modules so I can install specific versions.
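With several Python modules and a --user install in play, the packages that `pip list` reports may not be the ones a given interpreter imports. One way to cross-check, sketched below, is to ask the running interpreter for its installed distributions through the standard `importlib.metadata` module. The function name `installed_versions` is made up for this example, and whether the mixed 8.6.0/8.6.1 versions shown above matter is not established in this thread.

```python
from importlib import metadata


def installed_versions(prefix):
    """Return {distribution name: version} for every distribution visible
    to this interpreter whose name starts with prefix (case-insensitive)."""
    prefix = prefix.lower()
    versions = {}
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
        except Exception:
            # Broken installs (like the "-ensorflow" entry pip warns about)
            # may have unreadable metadata; skip them.
            continue
        if name and name.lower().startswith(prefix):
            versions[name] = dist.version
    return versions


if __name__ == "__main__":
    for name, version in sorted(installed_versions("tensorrt").items()):
        print(name, version)
```

Running this with each Python module loaded in turn should show which TensorRT wheels, if any, that interpreter can see.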


Hi, I have the same problem using Python 3.9.4, CUDA and cuDNN 11.5 on HPC. Any solution so far?
