I have an Orin Nano running JetPack 5.1.2, which includes CUDA 11.4 and cuDNN 8.6 according to jtop.
I installed the TensorFlow package 2.12.0+nv23.6 from https://developer.download.nvidia.com/compute/redist/jp/v512.
Basic TensorFlow operations work fine, and the GPU device is reported.
However, anything that requires cuDNN fails with:
2024-02-14 18:11:02.216299: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:429] Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
2024-02-14 18:11:02.216522: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:434] Error retrieving driver version: NOT_FOUND: could not find kernel module information in driver version file contents: "NVRM version: NVIDIA UNIX Open Kernel Module for aarch64 35.4.1 Release Build (buildbrain@mobile-u64-6422-d7000) Tue Aug 1 12:45:41 PDT 2023
GCC version: gcc version 9.3.0 (Buildroot 2020.08)
"
2024-02-14 18:11:02.216689: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at conv_ops_fused_impl.h:624 : UNIMPLEMENTED: DNN library is not found.
The TensorFlow page indicates that 2.12 requires CUDA 11.8 and cuDNN 8.6, so I followed the instructions in another post to upgrade CUDA to 11.8 and rebooted, but the same error remained.
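For anyone comparing versions the same way: the installed wheel can report which CUDA and cuDNN versions it was built against, which may differ from what the general TensorFlow page lists. Below is a minimal sketch using tf.sysconfig.get_build_info(). The helper names describe_build and tf_build_description are made up for illustration, and I am assuming the "cuda_version" and "cudnn_version" keys are present, which may not hold for every build.

```python
def describe_build(info):
    """Format the CUDA/cuDNN versions found in a TensorFlow build-info mapping."""
    # .get() is used because non-CUDA builds may not carry these keys (assumption).
    cuda = info.get("cuda_version", "unknown")
    cudnn = info.get("cudnn_version", "unknown")
    return f"built against CUDA {cuda}, cuDNN {cudnn}"


def tf_build_description():
    """Describe the installed TensorFlow wheel (hypothetical helper)."""
    # Imported lazily so describe_build() can be used without TensorFlow loaded.
    import tensorflow as tf

    return f"TensorFlow {tf.__version__} " + describe_build(tf.sysconfig.get_build_info())
```

Running print(tf_build_description()) on the device and comparing it against what jtop reports should show whether the wheel and the installed libraries actually disagree.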
However, when I woke up this morning, turned on the device, and re-ran the script before posting, it worked. Sorry for the false alarm. Here is the script for reference:
import tensorflow as tf
import numpy as np
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

# Enable memory growth on the first GPU so TensorFlow does not reserve all memory up front.
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)

# Random 28x28 single-channel inputs; every sample is labelled as class 1 (one-hot).
x = np.random.normal(size=(100, 28, 28, 1)).astype(np.float32)
y = np.zeros([100, 10], dtype=np.float32)
y[:, 1] = 1.

train_ds = tf.data.Dataset.from_tensor_slices((x, y)).shuffle(buffer_size=100).batch(32)

num_classes = 10

# Small CNN; the Conv2D layers are what exercise cuDNN.
model = Sequential([
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(num_classes)
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

epochs = 10
history = model.fit(
    train_ds,
    epochs=epochs
)
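
For a quicker check than the full training script, a single convolution on the GPU should be enough to exercise the cuDNN path that failed above. This is only a sketch: cudnn_smoke_test is a name made up for illustration, the tensor shapes are arbitrary, and I am assuming the failure surfaces as a tf.errors.OpError subclass in eager mode.

```python
def cudnn_smoke_test():
    """Run one small convolution on the first GPU and return (ok, message)."""
    # Imported lazily so the helper can be defined without loading TensorFlow.
    import tensorflow as tf

    gpus = tf.config.list_physical_devices('GPU')
    if not gpus:
        return False, "no GPU visible to TensorFlow"

    # Same memory-growth setting as the script above; it raises RuntimeError
    # if the GPU has already been initialized, which is safe to ignore here.
    try:
        tf.config.experimental.set_memory_growth(gpus[0], True)
    except RuntimeError:
        pass

    try:
        with tf.device('/GPU:0'):
            x = tf.random.normal([1, 28, 28, 1])   # one NHWC image
            w = tf.random.normal([3, 3, 1, 16])    # 3x3 kernel, 1 -> 16 channels
            y = tf.nn.conv2d(x, w, strides=1, padding='SAME')
        return True, f"conv2d OK, output shape {tuple(y.shape)}"
    except tf.errors.OpError as e:
        # Assumption: cuDNN problems show up here as an OpError subclass.
        return False, f"conv2d failed: {e.message}"
```

Calling print(cudnn_smoke_test()) right after boot, and again after other GPU work, might help show whether the failure is intermittent.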