Internal: Blas GEMM launch failed

sbk.sk48 · February 24, 2023, 3:00pm

Hi,

I ran into an error after moving to a new GPU, an RTX 3060. I have tried many of the suggestions I found, but none of them resolved it.
Please point me to any suggestions or resources on how to proceed.

My code and the error are pasted below.

My code:
import tensorflow as tf
import numpy as np
import os
#import tensorflow.compat.v1 as tf

# raw string, so the backslashes in the Windows path are not read as escape sequences
path = r'D:\user\GPU Testing code'
run_opts = tf.RunOptions(report_tensor_allocations_upon_oom = True)
folder = 'imagefile.npz'

#Load npz files
path2 = os.path.join(path,folder)

for_npz = np.load(path2)
filenames1 = for_npz['filename_heads']
X1 = for_npz['features']

print(len(filenames1))
#similariMat = tf.keras.losses.cosine_similarity(y_true=X1, y_pred=X1, axis = 1)

%%time
#Once this code starts running, check whether GPU utilization increases or only CPU utilization does
with tf.compat.v1.Session() as sess:
    for i in range(0, X1.shape[0], 100):
        if i == 0:
            # build the full reference matrix and its row-normalized copy once
            Y_M, Y_N = X1.shape
            Y = tf.placeholder(tf.float32, shape = (Y_M, Y_N))
            Y_normalized = tf.nn.l2_normalize(Y, axis = 1)
        M = X1[i:(i+100)].shape[0]
        N = X1.shape[1]
        X = X1[i:(i+100)]
        # input
        input = tf.placeholder(tf.float32, shape = (M, N))
        # normalize each row
        normalized = tf.nn.l2_normalize(input, axis = 1)
        # multiply row i with row j using transpose
        # (a matrix product, not an element-wise product)
        prod = tf.matmul(normalized, Y_normalized,
                         adjoint_b = True # transpose second matrix
                         )
        dist = 1 - prod
        Sim_Mat = sess.run(dist, feed_dict = {input:X,
                                              Y:X1})

sess.close()

Error:
InternalError: 2 root error(s) found.
(0) Internal: Blas GEMM launch failed : a.shape=(100, 2048), b.shape=(301718, 2048), m=100, n=301718, k=2048
[[{{node MatMul}}]]
(1) Internal: Blas GEMM launch failed : a.shape=(100, 2048), b.shape=(301718, 2048), m=100, n=301718, k=2048
[[{{node MatMul}}]]
[[sub/_5]]
0 successful operations.
0 derived errors ignored.
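
The shapes in the error message give a rough idea of how much GPU memory this graph needs. The sketch below is back-of-the-envelope arithmetic only, assuming dense float32 tensors and counting just the tensors named in the code above. The helper `tensor_mib` is a made-up name, and temporaries inside `l2_normalize` as well as any cuBLAS workspace are ignored.

```python
# Sizes taken from the error message: a is (100, 2048), b is (301718, 2048).
BYTES_PER_FLOAT32 = 4

def tensor_mib(rows, cols, bytes_per_elem=BYTES_PER_FLOAT32):
    """Return the size of a dense rows x cols tensor in MiB (hypothetical helper)."""
    return rows * cols * bytes_per_elem / 2**20

m, n, k = 100, 301718, 2048

# Tensors that appear by name in the posted code (assumed to be live at once).
sizes = {
    "Y placeholder (n x k)": tensor_mib(n, k),
    "Y_normalized (n x k)": tensor_mib(n, k),
    "input batch (m x k)": tensor_mib(m, k),
    "prod (m x n)": tensor_mib(m, n),
    "dist (m x n)": tensor_mib(m, n),
}

for name, mib in sizes.items():
    print(f"{name:24s} {mib:10.1f} MiB")

total_mib = sum(sizes.values())
print(f"{'total':24s} {total_mib:10.1f} MiB")
```

Each n x k tensor comes to about 2.3 GiB, so the listed tensors together need a bit under 5 GiB. That fits in 12 GB on paper, so this estimate alone does not establish that memory is the cause of the failure.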

My configurations:

Windows 10, RTX 3060, 12 GB GPU memory

Python 3.6.13
cudatoolkit 11.3.1
cudnn 8.2.1 cuda11.3_0 anaconda
pytorch 1.10.2 py3.6_cuda11.3_cudnn8_0 pytorch
h5py 3.1.0 pypi_0 pypi
keras 2.6.0 pypi_0 pypi
keras-applications 1.0.8 pypi_0 pypi
keras-preprocessing 1.1.2 pypi_0 pypi
keras-vggface 0.6 pypi_0 pypi
keras-preprocessing 1.1.2 pyhd3eb1b0_0
python 3.7.6 h0371630_2
tensorboard 1.15.0 pypi_0 pypi
tensorboard-data-server 0.6.1 pypi_0 pypi
tensorboard-plugin-wit 1.8.1 pypi_0 pypi
tensorflow-estimator 1.15.1 pypi_0 pypi
tensorflow-gpu 1.15.0 pypi_0 pypi

When I run my code with a small imagefile.npz of around 32 MB, it executes successfully. But when I test with a 1 GB .npz file, it crashes with the error shown above.

Thank you
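
Since the failure depends on the size of the input, one thing that could be tried is to tile the reference matrix as well as the query batch, so that no single matrix product spans all 301718 rows. The sketch below shows that tiling in plain NumPy rather than TensorFlow; it is not the poster's code and has not been tested against this error. The names `l2_normalize_rows` and `cosine_distance_tiles` and the block sizes are assumptions chosen for illustration.

```python
import numpy as np

def l2_normalize_rows(x, eps=1e-12):
    """Scale each row to unit L2 norm (mirrors tf.nn.l2_normalize with axis=1)."""
    sq = np.sum(x * x, axis=1, keepdims=True)
    return x / np.sqrt(np.maximum(sq, eps))

def cosine_distance_tiles(features, row_block=100, col_block=4096):
    """Yield (row_start, col_start, tile), where tile holds 1 - cosine similarity
    between one block of rows and one block of reference rows."""
    normalized = l2_normalize_rows(features.astype(np.float32))
    n = normalized.shape[0]
    for r in range(0, n, row_block):
        rows = normalized[r:r + row_block]
        for c in range(0, n, col_block):
            cols = normalized[c:c + col_block]
            # each product is at most row_block x col_block
            yield r, c, 1.0 - rows @ cols.T

# Small random demo: the tiles reassemble into the full distance matrix.
rng = np.random.default_rng(0)
demo = rng.standard_normal((250, 16)).astype(np.float32)
demo_norm = l2_normalize_rows(demo)
full = 1.0 - demo_norm @ demo_norm.T

assembled = np.empty_like(full)
for r, c, tile in cosine_distance_tiles(demo, row_block=100, col_block=64):
    assembled[r:r + tile.shape[0], c:c + tile.shape[1]] = tile

print(np.allclose(assembled, full, atol=1e-5))
```

In the TensorFlow version, the same idea would mean feeding `Y` one block of rows at a time instead of the whole of `X1`, at the cost of more `sess.run` calls.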

Robert_Crovella · February 24, 2023, 3:25pm

I suggest finding a forum for TensorFlow support. This forum isn’t intended for TensorFlow support. Furthermore, TensorFlow is not an NVIDIA product.

system · April 25, 2023, 5:38am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
