tensorflow.python.framework.errors_impl.ResourceExhaustedError

I am trying to load a trained model using the function "keras.models.load_model", but I get an OOM error. My model file is 235 MB. This is the error trace:

8] 1 Chunks of size 35651584 totalling 34.00MiB
2020-09-28 11:44:30.540303: I tensorflow/core/common_runtime/bfc_allocator.cc:1002] Sum Total of in-use chunks: 283.36MiB
2020-09-28 11:44:30.540323: I tensorflow/core/common_runtime/bfc_allocator.cc:1004] total_region_allocated_bytes_: 298536960 memory_limit_: 298536960 available bytes: 0 curr_region_allocation_bytes_: 597073920
2020-09-28 11:44:30.565096: I tensorflow/core/common_runtime/bfc_allocator.cc:1010] Stats:
Limit: 298536960
InUse: 297128960
MaxInUse: 298171136
NumAllocs: 1582
MaxAllocSize: 35651584

2020-09-28 11:44:30.565202: W tensorflow/core/common_runtime/bfc_allocator.cc:439] xxxx**************************xx
2020-09-28 11:44:30.565399: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at cwise_ops_common.h:134 : Resource exhausted: OOM when allocating tensor with shape[3,3,128,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/save.py", line 184, in load_model
    return hdf5_format.load_model_from_hdf5(filepath, custom_objects, compile)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/hdf5_format.py", line 178, in load_model_from_hdf5
    custom_objects=custom_objects)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/model_config.py", line 55, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/serialization.py", line 109, in deserialize
    printable_module_name='layer')
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/generic_utils.py", line 373, in deserialize_keras_object
    list(custom_objects.items())))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/network.py", line 987, in from_config
    config, custom_objects)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/network.py", line 2029, in reconstruct_from_config
    process_node(layer, node_data)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/network.py", line 1977, in process_node
    output_tensors = layer(input_tensors, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 897, in __call__
    self._maybe_build(inputs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 2416, in _maybe_build
    self.build(input_shapes)  # pylint:disable=not-callable
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/convolutional.py", line 166, in build
    dtype=self.dtype)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 577, in add_weight
    caching_device=caching_device)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/tracking/base.py", line 743, in _add_variable_with_custom_getter
    **kwargs_for_getter)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer_utils.py", line 141, in make_variable
    shape=variable_shape if variable_shape else None)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 259, in __call__
    return cls._variable_v1_call(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 220, in _variable_v1_call
    shape=shape)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 198, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variable_scope.py", line 2598, in default_variable_creator
    shape=shape)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 263, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py", line 1434, in __init__
    distribute_strategy=distribute_strategy)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py", line 1567, in _init_from_args
    initial_value() if init_from_fn else initial_value,
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer_utils.py", line 121, in <lambda>
    init_val = lambda: initializer(shape, dtype=dtype)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops_v2.py", line 558, in __call__
    return self._random_generator.random_uniform(shape, -limit, limit, dtype)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops_v2.py", line 1068, in random_uniform
    shape=shape, minval=minval, maxval=maxval, dtype=dtype, seed=self.seed)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/random_ops.py", line 301, in random_uniform
    result = math_ops.add(result * (maxval - minval), minval, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py", line 984, in binary_op_wrapper
    return func(x, y, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py", line 1283, in _mul_dispatch
    return gen_math_ops.mul(x, y, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 6089, in mul
    _ops.raise_from_not_ok_status(e, name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 6653, in raise_from_not_ok_status
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[3,3,128,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:Mul]

Any help please?

Hi,

To deploy a model, you also need memory for the input/output/intermediate tensors.
As a result, the real required memory is much larger than the model file itself.
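
For a rough sense of scale (hypothetical numbers, not measured on your model): the conv kernel of shape [3, 3, 128, 256] in your trace holds under 300K float32 weights, while a single activation map at inference time can easily be an order of magnitude larger:

# Back-of-envelope sketch, float32 = 4 bytes (activation size is illustrative):
kernel_bytes = 3 * 3 * 128 * 256 * 4       # ~1.1 MiB for the weights alone
feature_map_bytes = 112 * 112 * 256 * 4    # ~12.3 MiB for one hypothetical
                                           # 112x112x256 map at batch size 1
print(kernel_bytes / 2**20, feature_map_bytes / 2**20)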

Would you mind checking the memory status with tegrastats and sharing it with us?

$ sudo tegrastats

Please also try the configuration shared below (sketched after the list) to see if it helps.

- TFv1.15

- TFv2.x
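
(The snippets originally linked under these bullets are not preserved here. As an assumption of what was shared, the standard GPU memory-growth configuration for each version looks like this:)

# TFv1.15: let the GPU allocator grow on demand instead of grabbing
# the whole memory pool up front.
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config)
tf.keras.backend.set_session(session)

# TFv2.x: enable memory growth on every visible GPU.
import tensorflow as tf

for gpu in tf.config.experimental.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)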

Thanks.

Thank you for your reply. I tried the configuration you mentioned and now I can load the model, but when I try to run prediction it gives me this error:
W tensorflow/core/common_runtime/bfc_allocator.cc:245] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.05GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.

This error is displayed several times in a loop, and after a few seconds my Jetson Nano powers off.

This is the output of the command sudo tegrastats:

RAM 1860/3956MB (lfb 101x4MB) SWAP 652/10100MB (cached 21MB) IRAM 0/252kB(lfb 252kB) CPU [21%@518,14%@518,19%@518,20%@403] EMC_FREQ 3%@1600 GR3D_FREQ 0%@153 APE 25 PLL@42C CPU@43C PMIC@100C GPU@41.5C AO@46.5C thermal@42.25C POM_5V_IN 2297/2341 POM_5V_GPU 82/93 POM_5V_CPU 328/321
RAM 1861/3956MB (lfb 101x4MB) SWAP 652/10100MB (cached 21MB) IRAM 0/252kB(lfb 252kB) CPU [16%@518,15%@518,14%@518,12%@518] EMC_FREQ 3%@1600 GR3D_FREQ 0%@153 APE 25 PLL@42C CPU@43C PMIC@100C GPU@41.5C AO@46.5C thermal@42.25C POM_5V_IN 2256/2330 POM_5V_GPU 82/92 POM_5V_CPU 328/322

PS: Just for your info, when I run my script, which loads the Keras model and runs prediction, I open System Monitor > Resources and I see that memory usage reaches 3.5 GB when the code throws the out-of-memory error.

Hi,

Sorry for the late reply.

The log indicates that the memory on Nano is not large enough for a more efficient algorithm.
This is expected since Nano has only 4 GB of memory; some algorithms will be limited if they consume too much memory.
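
If the repeated 1.05GiB requests come from cuDNN searching for a faster convolution algorithm, one knob that may help (an assumption for your case, not a confirmed fix) is capping the cuDNN scratch workspace:

# Hypothetical mitigation: cap cuDNN's scratch workspace (in MB) so
# TensorFlow stops requesting the large "performance" buffers.
# This must be set before TensorFlow initializes its conv kernels.
import os
os.environ['TF_CUDNN_WORKSPACE_LIMIT_IN_MB'] = '128'

import tensorflow as tf  # import only after the variable is set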

Thanks.

Hi,

What would be the best solution in this case, then?

Hi,

There will be some performance impact, but it should still be okay to run inference.
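
(Since the warning itself is non-fatal, a minimal sketch for keeping peak memory low during prediction, assuming a standard Keras model, is to predict with a small batch size:)

# Predicting one sample at a time keeps peak activation memory low,
# at some cost in throughput. model / x_test are placeholder names.
preds = model.predict(x_test, batch_size=1)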
May I first ask which power supply you are using?

Thanks.

Hi,

I am using the micro-USB port to power the board with 5.1V/2.5A.

Hi,

Would you mind trying the 5W mode first to see if TensorFlow can work?
https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%2520Linux%2520Driver%2520Package%2520Development%2520Guide%2Fpower_management_nano.html%23wwpID0E02K0HA

$ sudo nvpmodel -m 1
$ sudo jetson_clocks

If yes, the root cause is power starvation. Please check the following topic for more information:

Thanks.