Hello,
I am trying to get the TensorFlow container running on WSL2 / Ubuntu 20.04 by following
the "CUDA on WSL" guide in the CUDA Toolkit Documentation.
Here is the nvidia-smi info in WSL2:
NVIDIA-SMI 510.00 Driver Version: 510.06 CUDA Version: 11.6
I started the TensorFlow container from WSL2. As shown below, its startup banner warns that the NVIDIA driver was not detected, while a CUDA container runs fine with the GPU device detected.
I would really appreciate any insight into why the TF container is not detecting the full RTX 3080 GPU capability. In addition, while running the ResNet training example, the log repeatedly reports that the NUMA node of the GPU could not be read.
nvidia-docker run --gpus all -it --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/tensorflow:20.03-tf2-py3
================
== TensorFlow ==
NVIDIA Release 20.03-tf2 (build 11026100)
TensorFlow Version 2.1.0
Container image Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
Copyright 2017-2019 The TensorFlow Authors. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying project or file.
WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.
Use 'nvidia-docker run' to start this container; see
the nvidia-docker wiki (NVIDIA/nvidia-docker on GitHub).
NOTE: MOFED driver for multi-node communication was not detecte
root@14f4c94fc193:/workspace/nvidia-examples# python cnn/resnet.py
2021-10-16 03:04:09.248403: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.2
2021-10-16 03:04:10.113198: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.7
2021-10-16 03:04:10.113777: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.7
PY 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0]
TF 2.1.0
Script arguments:
--image_width=224
--image_height=224
--distort_color=False
--momentum=0.9
--loss_scale=128.0
--image_format=channels_last
--data_dir=None
--data_idx_dir=None
--batch_size=256
--num_iter=300
--iter_unit=batch
--log_dir=None
--export_dir=None
--tensorboard_dir=None
--display_every=10
--precision=fp16
--dali_mode=None
--use_xla=False
--predict=False
2021-10-16 03:04:11.055143: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-10-16 03:04:11.168088: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2021-10-16 03:04:11.168134: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3080 Laptop GPU computeCapability: 8.6
coreClock: 1.245GHz coreCount: 48 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 357.69GiB/s
2021-10-16 03:04:11.168149: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.2
2021-10-16 03:04:11.168184: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-10-16 03:04:11.178213: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-10-16 03:04:11.183327: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-10-16 03:04:11.203007: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-10-16 03:04:11.208487: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-10-16 03:04:11.208566: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-10-16 03:04:11.209146: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2021-10-16 03:04:11.209429: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2021-10-16 03:04:11.209453: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2021-10-16 03:04:11.232012: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3302415000 Hz
2021-10-16 03:04:11.236158: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x418ac50 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-10-16 03:04:11.236198: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-10-16 03:04:11.496015: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2021-10-16 03:04:11.496208: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x41576d0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-10-16 03:04:11.496241: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA GeForce RTX 3080 Laptop GPU, Compute Capability 8.6
2021-10-16 03:04:11.496569: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2021-10-16 03:04:11.496600: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3080 Laptop GPU computeCapability: 8.6
coreClock: 1.245GHz coreCount: 48 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 357.69GiB/s
2021-10-16 03:04:11.496632: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.2
2021-10-16 03:04:11.496655: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-10-16 03:04:11.496703: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-10-16 03:04:11.496725: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-10-16 03:04:11.496752: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-10-16 03:04:11.496774: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-10-16 03:04:11.496793: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-10-16 03:04:11.497018: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2021-10-16 03:04:11.497292: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2021-10-16 03:04:11.497345: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2021-10-16 03:04:11.497374: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.2
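
To make the log above easier to read, here is a minimal sketch of a script that pulls out which GPUs TensorFlow reports and how many NUMA messages were printed. The function name summarize_tf_log and the regular expressions are my own, written against the TF 2.1 log lines quoted above; they are assumptions and may not match the log format of other TensorFlow versions.

```python
import re

# Patterns written against the TensorFlow 2.1 log lines quoted in this post
# (assumed format; other versions may word these messages differently).
FOUND_DEVICE = re.compile(r"Found device (\d+) with properties")
DEVICE_NAME = re.compile(r"name: (.+?) computeCapability: ([\d.]+)")
NUMA_WARNING = "could not open file to read NUMA node"


def summarize_tf_log(log_text):
    """Return ({device_id: (name, compute_capability)}, numa_warning_count)."""
    devices = {}
    pending_id = None  # device id seen on the previous "Found device" line
    numa_warnings = 0
    for line in log_text.splitlines():
        if NUMA_WARNING in line:
            numa_warnings += 1
        found = FOUND_DEVICE.search(line)
        if found:
            pending_id = int(found.group(1))
            continue
        name = DEVICE_NAME.search(line)
        if name and pending_id is not None:
            devices[pending_id] = (name.group(1), name.group(2))
            pending_id = None
    return devices, numa_warnings


# Three lines copied from the log above, used as sample input.
sample = """\
2021-10-16 03:04:11.168088: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
2021-10-16 03:04:11.168134: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3080 Laptop GPU computeCapability: 8.6
"""

devices, numa_warnings = summarize_tf_log(sample)
print(devices, numa_warnings)
```

Run against the full log, this kind of summary shows that TensorFlow does list device 0 (the RTX 3080 Laptop GPU) under "Found device" and "Adding visible gpu devices", even though the container banner says the driver was not detected.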