Tensorflow issue:OP_REQUIRES failed at conv_ops_fused_impl.h:697 : Not found: No algorithm worked!

I’m trying to run a simple CNN TensorFlow model on Ubuntu 20.04, on a Razer Blade 15 laptop with an RTX 2060 Max-P (6 GB of GDDR6).
The model fails to run and I get the following error:
OP_REQUIRES failed at conv_ops_fused_impl.h:697 : Not found: No algorithm worked!
I have the NVIDIA driver (460.32.3) installed via Ubuntu; nvcc reports:
nvcc: NVIDIA ® Cuda compiler driver
Copyright © 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0
I also installed cudnn from cudnn-11.0-linux-x64-v8.0.5.39.
The installed TensorFlow is 2.4.1.
In my zshrc I have added:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64

The only way to get things working is to add:
export TF_FORCE_GPU_ALLOW_GROWTH=true
in my .zshrc, as suggested here.
This is really strange, as I’d assume 6 GB of GDDR6 is enough for this CNN model.
I also tried to run ai-benchmark. Without the environment variable I was not able to finish the tests; with it set as shown above I get several warnings about not having enough memory to allocate, but I am able to finish all tests.
Watching nvidia-smi while running the notebook code for the CNN model, I noticed that with the variable set the GDDR6 allocated is quite small, while it is near the limit of the available memory when the variable is not set.
What’s the problem? Any suggestion to fix it?


Hi @ferlito.sergio,
If you are using cuDNN, this might be related to the large workspace required by cuDNN algorithms.
The link below should help you:

Thanks!
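On the cuDNN workspace point: one knob sometimes suggested is `TF_CUDNN_WORKSPACE_LIMIT_IN_MB`, which caps the scratch memory the convolution autotuner may request. Whether your TensorFlow build honors it depends on the version, and 1024 here is just an example value, so treat this as a sketch to experiment with:

```shell
# Cap cuDNN's autotune workspace (value in MB) before launching the
# training script; some TF builds read this variable in their conv ops.
export TF_CUDNN_WORKSPACE_LIMIT_IN_MB=1024
echo "$TF_CUDNN_WORKSPACE_LIMIT_IN_MB"
```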


Hi!

I think I’m having the same problem, this time with Nvidia TLT container tlt-streamanalytics:yuw-v2 and a tlt_pretrained_detectnet_v2:resnet18 model, on a GeForce RTX 2060.
(I’m working on this tutorial: Implementing a Real-time, AI-Based, Face Mask Detector Application for COVID-19 )

print(tf.__version__)

Tensorflow version: 1.15.2

!nvcc --version

nvcc: NVIDIA ® Cuda compiler driver
Copyright © 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0

I’m using a different dataset, but everything seems OK with the tlt-dataset-convert operation.

Now the training fails with the following error (see tlt-train.log (38.3 KB) )

OP_REQUIRES failed at conv_grad_filter_ops.cc:1038 : Not found: No algorithm worked!

I tried !export TF_FORCE_GPU_ALLOW_GROWTH=true inside my notebook, but it does not fix the problem.
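For what it’s worth, `!export VAR=...` in a Jupyter cell runs in a throwaway subshell, so the kernel process (and the TensorFlow it imports) never sees the variable. A minimal sketch of setting it in the kernel process itself, before TensorFlow is imported:

```python
import os

# Set the variable in the notebook's own process, *before* importing
# TensorFlow; a `!export ...` cell only affects a short-lived subshell.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

# import tensorflow as tf  # import only after the variable is set
print(os.environ["TF_FORCE_GPU_ALLOW_GROWTH"])
```

(`%env TF_FORCE_GPU_ALLOW_GROWTH=true` is an equivalent IPython magic.)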

I also found this snippet, which does not fix the problem either:

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

And this command, which raises an error:

physical_devices = tf.config.list_physical_devices('GPU') 
tf.config.experimental.set_memory_growth(physical_devices[0], True)

AttributeError: module 'tensorflow._api.v1.config' has no attribute 'list_physical_devices'

Can you please help me?

EDIT (SOLVED):

For TF 1.15, the working form of the last command was:

from tensorflow.config.experimental import list_physical_devices, set_memory_growth
physical_devices = list_physical_devices('GPU')
set_memory_growth(physical_devices[0], True)

And now it works!


Adding the code below right after the imports solved the issue for me! Thanks!

physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)