Dear all,
Good evening,
I am a novice in the CUDA world and am trying to configure Ubuntu 22.04 LTS for GPU programming. I have the following NVIDIA cards:
lspci | grep -i nvid
06:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 PCIe 16GB] (rev a1)
86:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 PCIe 16GB] (rev a1)
nvidia-smi shows both cards:
$ nvidia-smi
Wed Nov 1 20:40:39 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.02              Driver Version: 545.29.02    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla V100-PCIE-16GB           Off | 00000000:06:00.0 Off |                    0 |
| N/A   22C    P0              23W / 250W |      4MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  Tesla V100-PCIE-16GB           Off | 00000000:86:00.0 Off |                    0 |
| N/A   21C    P0              23W / 250W |      4MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
When I run a simple Python one-liner to list the GPUs, it reports errors and skips the GPUs, as shown below:
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
2023-11-01 20:36:46.806147: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-11-01 20:36:46.839695: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-11-01 20:36:46.839724: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-11-01 20:36:46.839746: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-11-01 20:36:46.845819: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-11-01 20:36:47.592661: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/lib/python3/dist-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.17.3 and <1.25.0 is required for this version of SciPy (detected version 1.26.1
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
2023-11-01 20:36:48.229782: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2211] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at Install TensorFlow with pip for how to download and setup the required libraries for your platform.
Skipping registering GPU devices…
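The "Cannot dlopen some GPU libraries" warning suggests that the dynamic loader cannot find one or more CUDA libraries. A minimal sketch of a check for this follows, using only Python's standard `ctypes` module. The list of short library names and the helper name `probe_cuda_libs` are assumptions made for illustration, not something TensorFlow itself reports:

```python
import ctypes
import ctypes.util

# Assumed list of CUDA libraries a GPU build of TensorFlow typically needs;
# adjust it to match the libraries your installation actually requires.
CUDA_LIBS = ["cudart", "cublas", "cufft", "curand", "cusolver", "cusparse", "cudnn"]


def probe_cuda_libs(names=CUDA_LIBS):
    """Map each short library name to what the loader resolves, or None."""
    found = {}
    for name in names:
        # find_library searches the usual system locations (e.g. the ldconfig cache).
        path = ctypes.util.find_library(name)
        if path is not None:
            try:
                ctypes.CDLL(path)  # confirm the library can really be loaded
            except OSError:
                path = None
        found[name] = path
    return found


if __name__ == "__main__":
    for name, path in probe_cuda_libs().items():
        print(f"lib{name}: {path or 'NOT FOUND'}")
```

A library reported as NOT FOUND here is a candidate for the missing piece, although this check only covers system search paths: libraries installed inside a Python environment by pip may not show up even when TensorFlow can load them.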
I would like to solve this myself by reading a how-to aimed at GPU novices. Could anyone point me to good documentation for a beginner starting with CUDA and GPU programming?
Thanks
Joseph John