Hi,

I started getting an error after moving to a new GPU (RTX 3060). I have tried many different suggestions to resolve it, but none of them worked.

Any suggestions or resources on how to proceed would be appreciated.

My code and the error are pasted below.

My code:

import tensorflow as tf
import numpy as np
import os
# import tensorflow.compat.v1 as tf

# Raw string so the backslashes in the Windows path are not read as escape sequences
path = r'D:\user\GPU Testing code'
run_opts = tf.RunOptions(report_tensor_allocations_upon_oom=True)
folder = 'imagefile.npz'

# Load npz files
path2 = os.path.join(path, folder)
for_npz = np.load(path2)
filenames1 = for_npz['filename_heads']
X1 = for_npz['features']
print(len(filenames1))

# similariMat = tf.keras.losses.cosine_similarity(y_true=X1, y_pred=X1, axis=1)

%%time
# Once this code starts running, check whether GPU utilization increases or only CPU utilization does
with tf.compat.v1.Session() as sess:
    for i in range(0, X1.shape[0], 100):
        if i == 0:
            # Placeholder for the full feature matrix, created once
            Y_M, Y_N = X1.shape
            Y = tf.placeholder(tf.float32, shape=(Y_M, Y_N))
            Y_normalized = tf.nn.l2_normalize(Y, axis=1)

        # Current batch of up to 100 rows
        M = X1[i:(i+100)].shape[0]
        N = X1.shape[1]
        X = X1[i:(i+100)]

        # input
        input = tf.placeholder(tf.float32, shape=(M, N))
        # normalize each row
        normalized = tf.nn.l2_normalize(input, axis=1)
        # dot product of every batch row with every row of Y (cosine similarity)
        prod = tf.matmul(normalized, Y_normalized,
                         adjoint_b=True  # transpose second matrix
                         )
        dist = 1 - prod
        Sim_Mat = sess.run(dist, feed_dict={input: X,
                                            Y: X1})

# Redundant: the with block above already closes the session
sess.close()
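For reference, what the code above computes is a batched cosine distance between each block of 100 rows and the full feature matrix. Below is a minimal NumPy sketch of the same idea, which could be used to check results on the CPU with a small input. The helper names `l2_normalize_rows` and `cosine_distance_batches` and the random `demo` matrix are made up for illustration and are not part of my actual script.

```python
import numpy as np


def l2_normalize_rows(a, eps=1e-12):
    """Scale each row to unit L2 norm (eps guards against all-zero rows)."""
    squared_norms = np.sum(a * a, axis=1, keepdims=True)
    return a / np.sqrt(np.maximum(squared_norms, eps))


def cosine_distance_batches(features, batch_size=100):
    """Yield (start_index, distance_block) for successive row batches.

    Each block has shape (rows_in_batch, features.shape[0]).
    """
    normalized_all = l2_normalize_rows(features.astype(np.float32))
    for start in range(0, features.shape[0], batch_size):
        batch = normalized_all[start:start + batch_size]
        # Row-by-row dot products via a matrix product with the transpose
        yield start, 1.0 - batch @ normalized_all.T


# Small random stand-in for the real feature matrix (hypothetical data)
rng = np.random.default_rng(0)
demo = rng.standard_normal((250, 16)).astype(np.float32)

blocks = list(cosine_distance_batches(demo, batch_size=100))
for start, block in blocks:
    print(start, block.shape)
```

Unlike my TensorFlow loop, which overwrites Sim_Mat on every iteration, this sketch hands each block back to the caller.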

Error:

InternalError: 2 root error(s) found.
  (0) Internal: Blas GEMM launch failed : a.shape=(100, 2048), b.shape=(301718, 2048), m=100, n=301718, k=2048
     [[{{node MatMul}}]]
  (1) Internal: Blas GEMM launch failed : a.shape=(100, 2048), b.shape=(301718, 2048), m=100, n=301718, k=2048
     [[{{node MatMul}}]]
     [[sub/_5]]
0 successful operations.
0 derived errors ignored.

My configurations:

Windows 10, RTX 3060, 12 GB GPU memory

Python 3.6.13

cudatoolkit              11.3.1
cudnn                    8.2.1   cuda11.3_0               anaconda
pytorch                  1.10.2  py3.6_cuda11.3_cudnn8_0  pytorch
h5py                     3.1.0   pypi_0                   pypi
keras                    2.6.0   pypi_0                   pypi
keras-applications       1.0.8   pypi_0                   pypi
keras-preprocessing      1.1.2   pypi_0                   pypi
keras-vggface            0.6     pypi_0                   pypi
keras-preprocessing      1.1.2   pyhd3eb1b0_0
python                   3.7.6   h0371630_2
tensorboard              1.15.0  pypi_0                   pypi
tensorboard-data-server  0.6.1   pypi_0                   pypi
tensorboard-plugin-wit   1.8.1   pypi_0                   pypi
tensorflow-estimator     1.15.1  pypi_0                   pypi
tensorflow-gpu           1.15.0  pypi_0                   pypi

When I run my code with a small imagefile.npz (around 32 MB), it executes successfully, but when I test with a .npz file of about 1 GB, it crashes with the above-mentioned error.
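To give a sense of scale, here is a rough back-of-the-envelope estimate of the tensor sizes involved in the failing case. The shapes are taken from the error message. The 4 bytes per element follows from the tf.float32 placeholders. The estimate ignores any workspace or temporary copies TensorFlow may allocate, so it is only a lower bound, and the helper name `tensor_mib` is made up for illustration.

```python
BYTES_PER_FLOAT32 = 4


def tensor_mib(rows, cols, bytes_per_element=BYTES_PER_FLOAT32):
    """Size of a dense rows x cols tensor in MiB."""
    return rows * cols * bytes_per_element / 2**20


# Shapes reported in the error message
full_rows, batch_rows, feature_dim = 301718, 100, 2048

sizes = {
    "Y (all features)": tensor_mib(full_rows, feature_dim),
    "Y_normalized": tensor_mib(full_rows, feature_dim),
    "input batch": tensor_mib(batch_rows, feature_dim),
    "dist (one batch)": tensor_mib(batch_rows, full_rows),
}

for name, mib in sizes.items():
    print(f"{name}: {mib:.1f} MiB")
print(f"total: {sum(sizes.values()):.1f} MiB")
```

By this estimate the full feature matrix and its normalized copy are each a little over 2.3 GiB, so these tensors alone stay well below 12 GB. As far as I can tell, the raw tensor sizes do not by themselves explain the failure.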

Thank you