Hi, I've installed TensorFlow v2.5.0+nv21.6 on my Jetson Nano following this guide: Installing TensorFlow for Jetson Platform - NVIDIA Docs (replacing v46 with v45 in the package index URL, since I'm on JetPack 4.5). From memory, the install command was roughly:
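sudo pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v45 tensorflow==2.5.0+nv21.6

The problem appears when I try to invoke inference after loading the TFLite Interpreter on the Jetson Nano: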
Predicting with TensorFlowLite model
INFO: Created TensorFlow Lite delegate for select TF ops.
2022-01-31 20:33:10.112306: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2022-01-31 20:33:10.112463: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1734] Found device 0 with properties: pciBusID: 0000:00:00.0 name: NVIDIA Tegra X1 computeCapability: 5.3
coreClock: 0.9216GHz coreCount: 1 deviceMemorySize: 3.87GiB deviceMemoryBandwidth: 194.55MiB/s
2022-01-31 20:33:10.112695: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2022-01-31 20:33:10.112906: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2022-01-31 20:33:10.112978: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1872] Adding visible gpu devices: 0
2022-01-31 20:33:10.113055: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-01-31 20:33:10.113094: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0
2022-01-31 20:33:10.113125: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N
2022-01-31 20:33:10.113333: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2022-01-31 20:33:10.113560: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1001] ARM64 does not support NUMA - returning NUMA node zero
2022-01-31 20:33:10.113666: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 242 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
INFO: TfLiteFlexDelegate delegate: 4 nodes delegated out of 12 nodes with 3 partitions.
INFO: TfLiteFlexDelegate delegate: 0 nodes delegated out of 1 nodes with 0 partitions.
INFO: TfLiteFlexDelegate delegate: 2 nodes delegated out of 8 nodes with 2 partitions.
Segmentation fault (core dumped)
Using:
Python 3.6.9
NumPy 1.19.5
JetPack 4.5.1
CUDA 10.2.89
cuDNN 8.0.0.180
TensorFlow v2.5.0+nv21.6
Model architectures: LSTM and Echo State Network
Here’s the code used to invoke inference:
import tensorflow as tf

# load TFLite interpreter
interpreter = tf.lite.Interpreter(model_path=self.model_path)
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# simulate data arriving in batches, predict each batch
for i in range(0, num_batches + 1):
    prior_idx = i * self.config.batch_size
    idx = (i + 1) * self.config.batch_size

    # resize input tensor to the shape of this batch of X_test
    interpreter.resize_tensor_input(
        input_details[0]['index'],
        [idx - prior_idx, channel.X_test.shape[1], channel.X_test.shape[2]])
    interpreter.allocate_tensors()

    X_test_batch = channel.X_test[prior_idx:idx]
    interpreter.set_tensor(input_details[0]['index'], X_test_batch)
    interpreter.invoke()
I added some print statements: the model loads correctly (input_details and output_details return the expected values), and the problem seems to be in the invoke() method.
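For what it's worth, the checks I added look roughly like this, using the fields returned by get_input_details():

# sanity checks before invoke() -- a sketch of the prints mentioned above
print("expected shape:", input_details[0]['shape'])
print("expected dtype:", input_details[0]['dtype'])
print("batch shape:", X_test_batch.shape, "dtype:", X_test_batch.dtype)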
I also tried to reproduce the issue on my Windows 11 laptop using TF v2.5.0 (the same version as on the Jetson) with Python 3.9, and there inference ran without any problems. So it seems the problem is not in the code but in the Jetson Nano environment. What can I do?
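In case it helps to reproduce this, here is a self-contained version of the same loop with dummy data; MODEL_PATH, BATCH_SIZE, and the input dimensions are placeholders, not my real values:

import numpy as np
import tensorflow as tf

# placeholders -- swap in a real .tflite model and its input dimensions
MODEL_PATH = "model.tflite"
BATCH_SIZE = 32
TIMESTEPS, FEATURES = 250, 25

interpreter = tf.lite.Interpreter(model_path=MODEL_PATH)
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# dummy float32 data standing in for channel.X_test
X_test = np.random.rand(3 * BATCH_SIZE, TIMESTEPS, FEATURES).astype(np.float32)

for i in range(X_test.shape[0] // BATCH_SIZE):
    batch = X_test[i * BATCH_SIZE:(i + 1) * BATCH_SIZE]
    interpreter.resize_tensor_input(input_details[0]['index'], batch.shape)
    interpreter.allocate_tensors()
    interpreter.set_tensor(input_details[0]['index'], batch)
    interpreter.invoke()  # this is the call that segfaults on the Nano
    print(interpreter.get_tensor(output_details[0]['index']).shape)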
Thanks!