Hi,
I am encountering an error after moving to a new laptop with an RTX 3070. I am new to the GPU world; I tried to follow some suggestions to resolve it, but without success. Please let me know of any suggestions or resources on how to proceed.
Error:
2022-04-28 23:09:24.631740: E tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED
0%| | 0/101 [00:57<?, ?it/s]
Traceback (most recent call last):
  File "/home/sahiti/anaconda3/envs/drp2/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/home/sahiti/anaconda3/envs/drp2/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/home/sahiti/anaconda3/envs/drp2/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
(0) Internal: Blas GEMM launch failed : a.shape=(192, 3), b.shape=(192, 3), m=3, n=3, k=192
[[{{node matrix_solve_ls_45/MatMul}}]]
[[gradients/AddN_685/_1851]]
(1) Internal: Blas GEMM launch failed : a.shape=(192, 3), b.shape=(192, 3), m=3, n=3, k=192
[[{{node matrix_solve_ls_45/MatMul}}]]
0 successful operations.
0 derived errors ignored.
My configuration:
Ubuntu 20.04, RTX 3070, 8 GB GPU memory
cudatoolkit           10.0.130   0
cudnn                 7.6.5      cuda10.0_0
h5py                  2.10.0     pypi_0
keras-applications    1.0.8      py_1
keras-base            2.3.1      py37_0                 anaconda
keras-gpu             2.3.1      0                      anaconda
keras-preprocessing   1.1.2      pyhd3eb1b0_0
python                3.7.6      h0371630_2
tensorflow            1.15.0     gpu_py37h0f0df58_0
tensorflow-base       1.15.0     gpu_py37h9dcbed7_0
tensorflow-estimator  1.15.1     pyh2649769_0
tensorflow-gpu        1.15.0     h0d30ee6_0
I set the GPU configuration to allow memory growth, and I monitor the GPU continuously with nvidia-smi once the program starts. Memory usage stays below 1 GB and utilization is low. The program takes nearly 20 minutes to start and then crashes shortly afterwards with the above error.
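For reference, here is a minimal sketch of the setup described above. It assumes the TensorFlow 1.x session API (tf.ConfigProto, tf.Session) and nvidia-smi's CSV query mode. The helper names (make_growth_session, parse_gpu_line, read_gpu_stats) are hypothetical, and this is not necessarily the exact code that produced the error.

```python
import subprocess

# nvidia-smi query: memory in MiB and utilization in percent, no header or units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def make_growth_session():
    """Return a TF 1.x session that allocates GPU memory on demand.

    Assumption: TensorFlow 1.15 as in the package list; imported lazily so
    the monitoring helpers below work without TensorFlow installed.
    """
    import tensorflow as tf

    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True  # grow allocation instead of grabbing all memory
    return tf.Session(config=config)


def parse_gpu_line(line):
    """Parse one CSV line such as '812, 8192, 7' into a dict.

    Assumption: all three fields are plain integers; a field reported as
    '[N/A]' would raise ValueError here.
    """
    used, total, util = (int(field.strip()) for field in line.split(","))
    return {
        "memory_used_mib": used,
        "memory_total_mib": total,
        "utilization_pct": util,
    }


def read_gpu_stats():
    """Run nvidia-smi once and return one dict per GPU."""
    output = subprocess.check_output(QUERY, universal_newlines=True)
    return [parse_gpu_line(line) for line in output.splitlines() if line.strip()]
```

Calling read_gpu_stats() in a loop with a short sleep gives roughly the same view as watching nvidia-smi by hand while the program starts.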
Thank you.