GTX 1660 Ti - Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

Hi, we're having an issue running a number of models on a 1660 Ti. We tested on both Ubuntu 18.04.3 LTS and CentOS 7, and the error is “Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR”. The commonly suggested fix is to add “config.gpu_options.allow_growth = True”, which we did, but it doesn’t seem to help. We installed driver version 440.59.

import tensorflow as tf
import keras.backend as K

# Let GPU memory allocation grow as needed instead of pre-allocating
# the whole device, then register the session with Keras.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config)
K.set_session(session)

Here is an example model that’s failing:
https://github.com/bedapudi6788/NudeNet/releases/download/v0/classifier_model

Hi,

This could be due to OOM. Could you try reducing the TF GPU memory fraction via config.gpu_options.per_process_gpu_memory_fraction? Something along these lines (the 0.3 value is just a placeholder; adjust it for your model):
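import tensorflow as tf
import keras.backend as K

# Cap this process at a fraction of the GPU's memory.
# 0.3 is only an example value, not a recommendation for this model.
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.3
session = tf.Session(config=config)
K.set_session(session)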

Thanks

This didn’t seem to help. The part that’s really baffling me is that this exact same model works fine on a much lower-end P1000 GPU.

Hi,

Could you please share the sample repro script and model file so we can help better?

Also, can you provide details on the platforms you are using:
o CUDA version
o CUDNN version
o Python version [if using python]
o Tensorflow and PyTorch version
o TensorRT version

Thanks

Hi,

See answers below. The notebook is attached, and the model URL is in the original post

o CUDA version - CUDA Version: 10.0
o CUDNN version - CUDNN Version 7.6.2 (also tried 7.6.5, same result)
o Python version [if using python] - Python 3.6.8
o Tensorflow and PyTorch version - TF version: 1.15.0, no PyTorch
o TensorRT version - not installed
05_nudenet.zip (4.03 KB)

For what it’s worth, I’m experiencing the same problem with a laptop that has a 1660 Ti.

Thanks for the repro. When I run it on a 32GB V100, Keras grabs 95% of the GPU memory regardless of whether K.set_session() is called.

This looks like a problem in Keras, and there is an existing issue tracking it: https://github.com/keras-team/keras/issues/11584
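
One thing that may be worth trying (a sketch only, assuming the root cause is that Keras creates its own session before K.set_session() takes effect) is to configure and register the session before any Keras model is constructed or loaded:

import tensorflow as tf
import keras.backend as K

# Configure and register the session first, before any Keras layers or
# models exist, so Keras does not spin up its own default session.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))

# Only then load the model (path is illustrative; point it at the
# classifier_model file from the original post).
from keras.models import load_model
model = load_model('classifier_model')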