I am running an application that employs a Keras-TensorFlow model to perform object detection. This model runs in tandem with a Caffe model that performs facial detection/recognition.
The application runs well on a laptop, but when I run it on my Jetson Nano it crashes almost immediately. Below is the last part of the console output, which I think shows that the GPU is running out of memory (assuming OOM == out of memory).
Is there a way to configure my system and/or the TensorFlow settings so that this is no longer an issue? Or is there another way around it, perhaps by converting the model to run on TensorFlow Lite? I've sketched my (untested) guesses at both below.
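To make the first idea concrete, this is the kind of change I had in mind, assuming the TF 1.x session API and standalone Keras that the traceback below suggests (just a sketch, run before the models are loaded):

```python
import tensorflow as tf
from keras import backend as K

# Sketch: stop TensorFlow from grabbing (almost) all GPU memory up front.
# On the Nano the GPU shares RAM with the CPU, so this may or may not be enough.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True                    # allocate only as needed
config.gpu_options.per_process_gpu_memory_fraction = 0.5  # or cap at ~half the memory
K.set_session(tf.Session(config=config))
```

And for the TensorFlow Lite idea, my (possibly naive) understanding is that the conversion step would look roughly like this, where "model.h5" is just a placeholder for my actual model file; I don't know whether the custom detection layers would convert cleanly:

```python
import tensorflow as tf

# Sketch: convert a saved Keras model to a TFLite flatbuffer (TF 1.13-style API).
converter = tf.lite.TFLiteConverter.from_keras_model_file("model.h5")
tflite_model = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```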
If anyone can give me some guidance on how to troubleshoot this further, please advise. Thanks in advance for your help!
2019-05-21 13:22:47.917271: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Free at 0xf09325900 of size 256
2019-05-21 13:22:47.917300: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0xf09325a00 of size 256
2019-05-21 13:22:47.917332: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0xf09325b00 of size 16074752
2019-05-21 13:22:47.917364: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0xf0a27a300 of size 64299008
2019-05-21 13:22:47.917392: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0xf0dfcc300 of size 116251904
2019-05-21 13:22:47.917418: I tensorflow/core/common_runtime/bfc_allocator.cc:638] Summary of in-use Chunks by size:
2019-05-21 13:22:47.917512: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 82 Chunks of size 256 totalling 20.5KiB
2019-05-21 13:22:47.917562: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 34 Chunks of size 512 totalling 17.0KiB
2019-05-21 13:22:47.917595: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 82 Chunks of size 1024 totalling 82.0KiB
2019-05-21 13:22:47.917650: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 1280 totalling 1.2KiB
2019-05-21 13:22:47.917684: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 46 Chunks of size 2048 totalling 92.0KiB
2019-05-21 13:22:47.917715: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 30 Chunks of size 4096 totalling 120.0KiB
2019-05-21 13:22:47.917748: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 18 Chunks of size 8192 totalling 144.0KiB
2019-05-21 13:22:47.917780: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 16384 totalling 16.0KiB
2019-05-21 13:22:47.917812: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 37632 totalling 36.8KiB
2019-05-21 13:22:47.917844: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 6 Chunks of size 65536 totalling 384.0KiB
2019-05-21 13:22:47.917875: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 82944 totalling 81.0KiB
2019-05-21 13:22:47.917907: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 131072 totalling 128.0KiB
2019-05-21 13:22:47.917941: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 3 Chunks of size 147456 totalling 432.0KiB
2019-05-21 13:22:47.917975: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 7 Chunks of size 262144 totalling 1.75MiB
2019-05-21 13:22:47.918013: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 331776 totalling 324.0KiB
2019-05-21 13:22:47.918049: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 3 Chunks of size 524288 totalling 1.50MiB
2019-05-21 13:22:47.918081: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 4 Chunks of size 589824 totalling 2.25MiB
2019-05-21 13:22:47.918112: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 12 Chunks of size 1048576 totalling 12.00MiB
2019-05-21 13:22:47.918142: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 3 Chunks of size 2097152 totalling 6.00MiB
2019-05-21 13:22:47.918174: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 18 Chunks of size 2359296 totalling 40.50MiB
2019-05-21 13:22:47.918203: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 5 Chunks of size 4194304 totalling 20.00MiB
2019-05-21 13:22:47.918233: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 8388608 totalling 8.00MiB
2019-05-21 13:22:47.918263: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 3 Chunks of size 9437184 totalling 27.00MiB
2019-05-21 13:22:47.918294: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 16074752 totalling 15.33MiB
2019-05-21 13:22:47.918325: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 18874368 totalling 18.00MiB
2019-05-21 13:22:47.918356: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 64299008 totalling 61.32MiB
2019-05-21 13:22:47.918389: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 116251904 totalling 110.87MiB
2019-05-21 13:22:47.918419: I tensorflow/core/common_runtime/bfc_allocator.cc:645] Sum Total of in-use chunks: 326.35MiB
2019-05-21 13:22:47.918454: I tensorflow/core/common_runtime/bfc_allocator.cc:647] Stats:
Limit: 342204416
InUse: 342204160
MaxInUse: 342204160
NumAllocs: 715
MaxAllocSize: 116251904
2019-05-21 13:22:47.918508: W tensorflow/core/common_runtime/bfc_allocator.cc:271] *************************************************************************************xxxxxxxxxxxxxxx
2019-05-21 13:22:47.937473: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at conv_ops.cc:735 : Resource exhausted: OOM when allocating tensor with shape[256,64,1,1] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
File "/home/james/.virtualenvs/nano/bin/monitor_video", line 10, in <module>
sys.exit(monitor_video())
File "/home/james/.virtualenvs/nano/lib/python3.6/site-packages/deep_monitor/__main__.py", line 299, in monitor_video
_monitor(args["config"])
File "/home/james/.virtualenvs/nano/lib/python3.6/site-packages/deep_monitor/__main__.py", line 229, in _monitor
detections = detector_object.detect(frame, confidence_object)
File "/home/james/.virtualenvs/nano/lib/python3.6/site-packages/deep_monitor/detector.py", line 121, in detect
(boxes, scores, labels) = self.model.predict_on_batch(image)
File "/home/james/.virtualenvs/nano/lib/python3.6/site-packages/keras/engine/training.py", line 1274, in predict_on_batch
outputs = self.predict_function(ins)
File "/home/james/.virtualenvs/nano/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
return self._call(inputs)
File "/home/james/.virtualenvs/nano/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
fetched = self._callable_fn(*array_vals)
File "/home/james/.virtualenvs/nano/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
run_metadata_ptr)
File "/home/james/.virtualenvs/nano/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[256,64,1,1] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node res2a_branch2c/convolution}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[{{node filtered_detections/map/while/PadV2_2/paddings}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
FATAL: exception not rethrown
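For what it's worth, I think the report_tensor_allocations_upon_oom hint near the end refers to the RunOptions flag below; I haven't worked out how to pass it through Keras's predict_on_batch, so this is only a minimal raw-session sketch of what I believe the hint means:

```python
import tensorflow as tf

# Minimal sketch of the hint in the log: ask TF to report allocated tensors on OOM.
x = tf.placeholder(tf.float32, shape=[None, 4])
y = tf.reduce_sum(x)

run_options = tf.RunOptions(report_tensor_allocations_upon_oom=True)
with tf.Session() as sess:
    print(sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0, 4.0]]}, options=run_options))
```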