Not sure where to post this issue. I have recently upgraded my system and the following packages changed versions:
- CUDA 10.2 → 11.0
- cuDNN 7.6 → 8.0
- TensorFlow-CUDA 2.2 → 2.3
Before the upgrade everything worked as expected (I had been using TensorFlow with no apparent issues). Since the upgrade, TensorFlow no longer seems to work. For example, the code:
import tensorflow as tf
m = tf.keras.Sequential()
produces the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/usr/lib/python3.8/site-packages/tensorflow/python/keras/engine/sequential.py", line 116, in __init__
    super(functional.Functional, self).__init__( # pylint: disable=bad-super-call
  File "/usr/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/usr/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 308, in __init__
    self._init_batch_counters()
  File "/usr/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/usr/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 317, in _init_batch_counters
    self._train_counter = variables.Variable(0, dtype='int64', aggregation=agg)
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/variables.py", line 262, in __call__
    return cls._variable_v2_call(*args, **kwargs)
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/variables.py", line 244, in _variable_v2_call
    return previous_getter(
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/variables.py", line 237, in <lambda>
    previous_getter = lambda **kws: default_variable_creator_v2(None, **kws)
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/variable_scope.py", line 2633, in default_variable_creator_v2
    return resource_variable_ops.ResourceVariable(
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/variables.py", line 264, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1507, in __init__
    self._init_from_args(
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1661, in _init_from_args
    handle = eager_safe_variable_handle(
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 242, in eager_safe_variable_handle
    return _variable_handle_from_shape_and_dtype(
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 174, in _variable_handle_from_shape_and_dtype
    gen_logging_ops._assert( # pylint: disable=protected-access
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/gen_logging_ops.py", line 49, in _assert
    _ops.raise_from_not_ok_status(e, name)
  File "/usr/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 6843, in raise_from_not_ok_status
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [0] [Op:Assert] name: EagerVariableNameReuse
The above code works fine when the GPU is deactivated. I'm not sure what to do now. Is this a bug in CUDA, cuDNN, or the NVIDIA driver? What should I do to solve the issue? Should I submit a bug report to NVIDIA?
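
For anyone who wants to reproduce the GPU-deactivated case, here is a minimal sketch. It assumes the GPU is hidden through the CUDA_VISIBLE_DEVICES environment variable, which is only one way to do it and not necessarily the one used above. The helper name try_sequential is made up for illustration; it reports whether the two-line example runs once no CUDA device is visible.

```python
import os

# Assumption: hide every CUDA device from this process.
# This must be set before TensorFlow is imported to take effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


def try_sequential():
    """Run the failing example and return a short status string.

    Hypothetical helper, not part of TensorFlow.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return "tensorflow not installed"
    try:
        # Same call as in the report above.
        tf.keras.Sequential()
    except Exception as exc:  # report the exception type instead of crashing
        return f"failed: {type(exc).__name__}"
    return "ok"


print(try_sequential())
```

If this prints "ok" with the variable set and reports a failure with that line removed, the problem is confined to the GPU code path (CUDA, cuDNN, or the driver) rather than to the TensorFlow installation as a whole.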