Query Regarding Memory Expansion for Jetson Nano

I am currently working on a real-time object detection project using TensorFlow 2.4.1 on the NVIDIA Jetson Nano platform. While the model loads successfully, I encounter a resource exhaustion error during the initialization of the camera, which interrupts the process.

I am exploring the possibility of expanding the memory capacity of the Jetson Nano to alleviate these resource constraints. However, I understand that traditional RAM expansion is not feasible on this platform, since the memory is soldered to the module.

For your reference, the software versions I am using are as follows:

TensorFlow 2.4.1
CUDA 10.2.3
Jetpack 4.6.1
NumPy 1.18.5
OpenCV 4.1.1

Could you please provide guidance or suggestions on alternative methods or hardware solutions for effectively managing memory resources on the Jetson Nano?

Please find the error log below.

agribot@agribot-desktop:~/Documents$ python rpicam_ai_interface.py
2024-04-19 09:33:12.390909: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2024-04-19 09:33:22.500048: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2024-04-19 09:33:24.787176: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2024-04-19 09:33:24.872500: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2024-04-19 09:33:24.912208: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:908] ARM64 does not support NUMA - returning NUMA node zero
2024-04-19 09:33:24.912400: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X1 computeCapability: 5.3
coreClock: 0.9216GHz coreCount: 1 deviceMemorySize: 3.86GiB deviceMemoryBandwidth: 194.55MiB/s
2024-04-19 09:33:24.912576: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2024-04-19 09:33:24.912760: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2024-04-19 09:33:24.912874: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2024-04-19 09:33:24.912978: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2024-04-19 09:33:24.995252: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2024-04-19 09:33:25.115190: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2024-04-19 09:33:25.197383: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2024-04-19 09:33:25.199535: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2024-04-19 09:33:25.200009: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:908] ARM64 does not support NUMA - returning NUMA node zero
2024-04-19 09:33:25.200437: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:908] ARM64 does not support NUMA - returning NUMA node zero
2024-04-19 09:33:25.200530: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py:201: calling crop_and_resize_v1 (from tensorflow.python.ops.image_ops_impl) with box_ind is deprecated and will be removed in a future version.
Instructions for updating:
box_ind is deprecated, use box_indices instead
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/mrcnn/model.py:756: calling map_fn (from tensorflow.python.ops.map_fn) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Use fn_output_signature instead
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/mrcnn/model.py:774: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
2024-04-19 09:34:09.713886: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:908] ARM64 does not support NUMA - returning NUMA node zero
2024-04-19 09:34:09.721539: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: NVIDIA Tegra X1 computeCapability: 5.3
coreClock: 0.9216GHz coreCount: 1 deviceMemorySize: 3.86GiB deviceMemoryBandwidth: 194.55MiB/s
2024-04-19 09:34:09.721852: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2024-04-19 09:34:09.721951: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2024-04-19 09:34:09.722027: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2024-04-19 09:34:09.722088: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2024-04-19 09:34:09.722224: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2024-04-19 09:34:09.722346: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2024-04-19 09:34:09.722448: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2024-04-19 09:34:09.722548: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2024-04-19 09:34:09.722879: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:908] ARM64 does not support NUMA - returning NUMA node zero
2024-04-19 09:34:09.723181: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:908] ARM64 does not support NUMA - returning NUMA node zero
2024-04-19 09:34:09.723256: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2024-04-19 09:34:15.919753: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2024-04-19 09:34:15.919846: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267] 0
2024-04-19 09:34:15.919886: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0: N
2024-04-19 09:34:15.920430: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:908] ARM64 does not support NUMA - returning NUMA node zero
2024-04-19 09:34:15.920829: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:908] ARM64 does not support NUMA - returning NUMA node zero
2024-04-19 09:34:15.921170: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:908] ARM64 does not support NUMA - returning NUMA node zero
2024-04-19 09:34:15.921310: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1024 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
2024-04-19 09:34:15.921886: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2024-04-19 09:34:17.483500: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:196] None of the MLIR optimization passes are enabled (registered 0 passes)
2024-04-19 09:34:18.188437: W tensorflow/core/platform/profile_utils/cpu_utils.cc:116] Failed to find bogomips or clock in /proc/cpuinfo; cannot determine CPU frequency
[INFO] [rpicam_ai_node]: MODEL IS READY NOW
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:2325: UserWarning: Model.state_updates will be removed in a future version. This property should not be used in TensorFlow 2.0, as updates are applied automatically.
warnings.warn('Model.state_updates will be removed in a future version. '
2024-04-19 09:34:45.466099: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2024-04-19 09:34:50.087615: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2024-04-19 09:35:27.747811: W tensorflow/core/common_runtime/bfc_allocator.cc:248] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.06GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2024-04-19 09:35:28.658775: W tensorflow/core/common_runtime/bfc_allocator.cc:248] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.14GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2024-04-19 09:35:30.328360: W tensorflow/core/common_runtime/bfc_allocator.cc:248] Allocator (GPU_0_bfc) ran out of memory trying to allocate 594.25MiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2024-04-19 09:35:34.007737: W tensorflow/core/common_runtime/bfc_allocator.cc:248] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.09GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2024-04-19 09:35:34.979071: W tensorflow/core/common_runtime/bfc_allocator.cc:248] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.09GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2024-04-19 09:35:36.612015: W tensorflow/core/common_runtime/bfc_allocator.cc:248] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.08GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2024-04-19 09:35:41.500745: W tensorflow/core/common_runtime/bfc_allocator.cc:248] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.15GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2024-04-19 09:35:47.285971: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to synchronize the stop event: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated
2024-04-19 09:35:47.286082: E tensorflow/stream_executor/gpu/gpu_timer.cc:55] Internal: Error destroying CUDA event: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated
2024-04-19 09:35:47.286132: E tensorflow/stream_executor/gpu/gpu_timer.cc:60] Internal: Error destroying CUDA event: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated
2024-04-19 09:35:47.286328: I tensorflow/stream_executor/cuda/cuda_driver.cc:789] failed to allocate 8B (8 bytes) from device: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated
2024-04-19 09:35:47.286396: E tensorflow/stream_executor/stream.cc:5011] Internal: Failed to enqueue async memset operation: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated
2024-04-19 09:35:47.286460: W tensorflow/core/kernels/gpu_utils.cc:69] Failed to check cudnn convolutions for out-of-bounds reads and writes with an error message: 'Failed to load in-memory CUBIN: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated'; skipping this check. This only means that we won't check cudnn for out-of-bounds reads and writes. This message will only be printed once.
2024-04-19 09:35:47.286521: I tensorflow/stream_executor/cuda/cuda_driver.cc:789] failed to allocate 8B (8 bytes) from device: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated
2024-04-19 09:35:47.286560: E tensorflow/stream_executor/stream.cc:5011] Internal: Failed to enqueue async memset operation: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated
2024-04-19 09:35:47.286756: F tensorflow/stream_executor/cuda/cuda_dnn.cc:189] Check failed: status == CUDNN_STATUS_SUCCESS (7 vs. 0)Failed to set cuDNN stream.
Aborted (core dumped)

Hi,

 Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.06GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.

Based on this message, the model is too heavy for the Jetson Nano.
Please use a lighter model or try converting it to TensorRT, since the TensorFlow library itself also occupies a certain amount of memory.
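
If you keep the TensorFlow path for now, you can also stop TensorFlow from reserving GPU memory it does not need up front. This is only a mitigation, not a fix for a model that is too heavy; a minimal sketch, assuming the TF 2.4 tf.config API (it must run before the model is built):

```python
# Sketch only: reduce how much of the Nano's shared 4 GB TensorFlow grabs at start-up.
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
    # Allocate GPU memory on demand instead of reserving a large block up front.
    tf.config.experimental.set_memory_growth(gpus[0], True)
    # Alternatively (instead of memory growth), put a hard cap on TensorFlow's share.
    # The 2048 MB value below is illustrative, not a tested setting:
    # tf.config.set_logical_device_configuration(
    #     gpus[0], [tf.config.LogicalDeviceConfiguration(memory_limit=2048)])
```

Keep in mind that on the Nano the CPU and GPU share the same 4 GB, so whatever TensorFlow reserves is taken away from the rest of the system.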

Thanks.

We have the .h5 weights file for the real-time object detection model, and we are loading the weights file directly. Is there any way to use it with TensorRT?

Hi,

You will need to convert it into ONNX format first.
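
For an .h5 weights file, the usual route is to rebuild the model in inference mode, load the weights, and export the underlying Keras graph to ONNX with tf2onnx. A rough sketch, assuming tf2onnx is installed and using placeholder names for the config class and file paths (Mask R-CNN's custom layers may need extra handling during conversion, such as a different opset or custom op converters):

```python
# Sketch only: .h5 weights -> Keras graph -> ONNX. The paths and config class
# are placeholders for whatever your project actually uses.
import tf2onnx
import mrcnn.model as modellib
from my_project.config import MyInferenceConfig  # placeholder

config = MyInferenceConfig()
model = modellib.MaskRCNN(mode="inference", config=config, model_dir="./logs")
model.load_weights("mask_rcnn_weights.h5", by_name=True)

# Export the underlying Keras model to ONNX.
tf2onnx.convert.from_keras(model.keras_model, opset=11,
                           output_path="mask_rcnn.onnx")
```

Once you have the .onnx file, you can build a TensorRT engine from it on the Nano, for example with the trtexec tool that ships with TensorRT.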
Thanks.
