I’m currently using a Jetson Nano with the following spec:
JetPack 4.6
CUDA 10.2
cuDNN 8
TensorFlow 2.5.0+nv21.7
Running the following code sample aborts with an unexpected error:
import numpy as np
import tensorflow as tf
D = tf.convert_to_tensor(np.array([[1., 2., 3.], [-3., -7., -1.], [0., 5., -2.]]))
print(tf.linalg.det(D))
Error:
...
F tensorflow/core/kernels/linalg/determinant_op_gpu.cu.cc:102] Non-OK-status: GpuLaunchKernel( DeterminantFromPivotedLUKernel<Scalar, false>, config.block_count, config.thread_per_block, 0, device.stream(), config.virtual_thread_count, n, lu_factor.data(), pivots, nullptr, output.data()) status: Internal: too many resources requested for launch
Is there a known fix for what appears to be a problem with the number of threads (or other resources) requested per block when this kernel is launched?
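As a possible stopgap, here is a minimal sketch of the same computation pinned to the CPU. It assumes the failure is limited to the GPU determinant kernel named in the log, which I have not confirmed. `det_on_cpu` is a hypothetical helper name of my own, and the NumPy fallback is only there so the snippet also runs on a machine without TensorFlow.

```python
import numpy as np

try:
    import tensorflow as tf
except ImportError:
    # Assumption: fall back to NumPy if TensorFlow is not installed.
    tf = None

D = np.array([[1., 2., 3.], [-3., -7., -1.], [0., 5., -2.]])


def det_on_cpu(matrix):
    """Compute the determinant without launching the GPU determinant kernel."""
    if tf is not None:
        # Pin the op to the CPU so DeterminantFromPivotedLUKernel is never used.
        with tf.device('/CPU:0'):
            return float(tf.linalg.det(tf.convert_to_tensor(matrix)))
    # NumPy computes the determinant on the CPU via LU factorization.
    return float(np.linalg.det(matrix))


result = det_on_cpu(D)
print(result)
```

Expanding along the first row by hand gives 19 - 12 - 45 = -38, so the printed value should be -38 up to floating-point rounding. This only sidesteps the GPU path; it does not address the launch configuration itself.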