Error in TF-TRT

I am trying to convert my TensorFlow model to TensorRT, following the object detection sample.
My command is:
python object_detection.py --input_saved_model_dir /workspace/examples/NumPlateDetection/saved_model --output_saved_model_dir /workspace/examples/NumPlateDetection --data_dir /workspace/examples/NumPlateDetection/infer/images --calib_data_dir /workspace/examples/NumPlateDetection/images --optimize_offline --precision INT8 --num_calib_inputs 800 --input_size 736 --batch_size 8 --mode 'inference' --outputimg_path /workspace/examples/NumPlateDetection/outputs --use_trt

What could be the issue?

I get the following errors:

: 164 curr_region_allocation_bytes_: 34359738368
2020-06-17 03:38:17.241209: I tensorflow/core/common_runtime/bfc_allocator.cc:970] Stats: 
Limit:                 23469584548
InUse:                 17155265792
MaxInUse:              19082024704
NumAllocs:                    4030
MaxAllocSize:           3833987072

2020-06-17 03:38:17.241298: W tensorflow/core/common_runtime/bfc_allocator.cc:429] *********_______________******************_***********_______***************************************
2020-06-17 03:38:17.241333: E tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:42] DefaultLogger Requested amount of GPU memory (4404019200 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
2020-06-17 03:38:17.241362: E tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:42] DefaultLogger /home/jenkins/workspace/TensorRT/helpers/rel-7.0/L1_Nightly/build/source/rtSafe/resources.h (164) - OutOfMemory Error in GpuMemory: 0
2020-06-17 03:38:17.241455: E tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:42] DefaultLogger Out of memory error during getBestTactic: (Unnamed Layer* 0) [Shuffle]
2020-06-17 03:38:17.241481: E tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:42] DefaultLogger Internal error: could not find any implementation for node (Unnamed Layer* 0) [Shuffle], try increasing the workspace size with IBuilder::setMaxWorkspaceSize()
2020-06-17 03:38:17.243854: E tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:42] DefaultLogger ../builder/tacticOptimizer.cpp (1523) - OutOfMemory Error in computeCosts: 0
2020-06-17 03:38:17.255722: E tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:841] Calibration failed: Internal: Failed to build TensorRT engine
2020-06-17 03:38:17.255955: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Internal: Failed to feed calibration data
	 [[{{node TRTEngineOp_31}}]]
	 [[SecondStagePostprocessor/map/while/Switch_1/_316]]
2020-06-17 03:38:17.256282: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Internal: Failed to feed calibration data
	 [[{{node TRTEngineOp_31}}]]
Traceback (most recent call last):
  File "numplate_detection.py", line 380, in <module>
    optimize_offline=args.optimize_offline)
  File "numplate_detection.py", line 107, in get_graph_func
    input_fn, calib_data_dir, num_calib_inputs//batch_size))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/compiler/tensorrt/trt_convert.py", line 1004, in convert
    self._converted_func(*map(ops.convert_to_tensor, inp))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1551, in __call__
    return self._call_impl(args, kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1591, in _call_impl
    return self._call_flat(args, self.captured_inputs, cancellation_manager)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1692, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 545, in call
    ctx=ctx)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
  (0) Internal:  Failed to feed calibration data
	 [[node TRTEngineOp_31 (defined at numplate_detection.py:107) ]]
	 [[SecondStagePostprocessor/map/while/Switch_1/_316]]
  (1) Internal:  Failed to feed calibration data
	 [[node TRTEngineOp_31 (defined at numplate_detection.py:107) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference_pruned_27865]

Function call stack:
pruned -> pruned

terminate called without an active exception
Aborted (core dumped)

You may have to reduce the max workspace size. Also, try the config below to limit GPU memory usage by TensorFlow.
You can set the fraction of GPU memory to be allocated when you construct a tf.Session by passing tf.GPUOptions as part of the optional config argument:
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
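
For example, a minimal runnable sketch of that suggestion (TF1-style session API; on a TF 2.x runtime such as your container, the same classes live under tf.compat.v1):

import tensorflow as tf

# Let this process allocate only ~1/3 of the GPU's memory up front.
# (TF1-style config; use the tf.compat.v1 namespace on TensorFlow 2.x.)
gpu_options = tf.compat.v1.GPUOptions(per_process_gpu_memory_fraction=0.333)
sess = tf.compat.v1.Session(
    config=tf.compat.v1.ConfigProto(gpu_options=gpu_options))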

Thanks!

I can't find that config in the file. Please see the program here.

Hi,
Please use the link below for reference.

Also, to set the max workspace size:
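
In the TF 2.0 TF-TRT Python API this goes through the conversion params (a rough sketch; the sample script exposes it as --max_workspace_size, and the path and values below are illustrative):

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Request a smaller TensorRT builder workspace so engine building
# does not compete with TensorFlow for the remaining GPU memory.
conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode='INT8',
    max_workspace_size_bytes=1 << 30)  # 1 GiB, illustrative
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir='/workspace/examples/NumPlateDetection/saved_model',
    conversion_params=conversion_params)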


Thanks!

Now I understand the two configuration options in the parser: --gpu_mem_cap and --max_workspace_size. --gpu_mem_cap caps the GPU memory usage for TensorFlow.

It is implemented in the code as:

def config_gpu_memory(gpu_mem_cap):
  # List the physical GPUs visible to TensorFlow.
  gpus = tf.config.experimental.list_physical_devices('GPU')
  if not gpus:
    return
  print('Found the following GPUs:')
  for gpu in gpus:
    print('  ', gpu)
  for gpu in gpus:
    try:
      if not gpu_mem_cap:
        # No cap requested: grow GPU allocations on demand instead of
        # reserving all memory up front.
        tf.config.experimental.set_memory_growth(gpu, True)
      else:
        # Cap TensorFlow at a fixed amount of GPU memory.
        # Note: memory_limit is specified in megabytes (MB).
        tf.config.experimental.set_virtual_device_configuration(
            gpu,
            [tf.config.experimental.VirtualDeviceConfiguration(
                memory_limit=gpu_mem_cap)])
    except RuntimeError as e:
      print('Cannot set GPU memory config:', e)

When I set --gpu_mem_cap=0.3, the same as the fraction you mentioned earlier, I get the following error:

2020-06-17 08:48:04.947505: I tensorflow/core/common_runtime/bfc_allocator.cc:917] Bin for 256B was 256B, Chunk State: 
2020-06-17 08:48:04.947519: I tensorflow/core/common_runtime/bfc_allocator.cc:955]      Summary of in-use Chunks by size: 
2020-06-17 08:48:04.947533: I tensorflow/core/common_runtime/bfc_allocator.cc:962] Sum Total of in-use chunks: 0B
2020-06-17 08:48:04.947546: I tensorflow/core/common_runtime/bfc_allocator.cc:964] total_region_allocated_bytes_: 0 memory_limit_: 0 available bytes: 0 curr_region_allocation_bytes_: 0
2020-06-17 08:48:04.947573: I tensorflow/core/common_runtime/bfc_allocator.cc:970] Stats: 
Limit:                           0
InUse:                           0
MaxInUse:                        0
NumAllocs:                       0
MaxAllocSize:                    0

2020-06-17 08:48:04.947603: W tensorflow/core/common_runtime/bfc_allocator.cc:429] <allocator contains no memory>
2020-06-17 08:48:04.947665: W tensorflow/core/framework/op_kernel.cc:1632] OP_REQUIRES failed at constant_op.cc:79 : Resource exhausted: OOM when allocating tensor of shape [] and type float
2020-06-17 08:48:04.947718: E tensorflow/core/common_runtime/executor.cc:660] Executor failed to create kernel. Resource exhausted: OOM when allocating tensor of shape [] and type float
	 [[{{node dummy_fetch_0}}]]
Traceback (most recent call last):
  File "numplate_detection.py", line 375, in <module>
    optimize_offline=args.optimize_offline)
  File "numplate_detection.py", line 81, in get_graph_func
    graph_func = get_func_from_saved_model(input_saved_model_dir)
  File "numplate_detection.py", line 57, in get_func_from_saved_model
    saved_model_dir, tags=[tag_constants.SERVING])
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/saved_model/load.py", line 528, in load
    return load_internal(export_dir, tags)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/saved_model/load.py", line 559, in load_internal
    root = load_v1_in_v2.load(export_dir, tags)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/saved_model/load_v1_in_v2.py", line 254, in load
    return loader.load(tags=tags)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/saved_model/load_v1_in_v2.py", line 225, in load
    local_init_op, _ = initializer._initialize()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/saved_model/load_v1_in_v2.py", line 64, in _initialize
    return self._init_fn(*[path.asset_path for path in self._asset_paths])
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1551, in __call__
    return self._call_impl(args, kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1591, in _call_impl
    return self._call_flat(args, self.captured_inputs, cancellation_manager)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1692, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 545, in call
    ctx=ctx)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.ResourceExhaustedError:  OOM when allocating tensor of shape [] and type float
	 [[{{node dummy_fetch_0}}]] [Op:__inference_pruned_3700]

Function call stack:
pruned


Could you please share your model and environment/setup details so that we can assist you better?
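Also, note that --gpu_mem_cap is passed straight through as memory_limit, which TensorFlow interprets in megabytes, not as a fraction. Setting it to 0.3 therefore leaves TensorFlow with essentially no memory, which matches the "Limit: 0" in your second log. For a cap of roughly one third of a 24 GB card, the call would look something like this (value illustrative):

config_gpu_memory(gpu_mem_cap=8000)  # ~8 GB; memory_limit is in MB, not a fraction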
Thanks!

Hi, thanks.
Please get the TensorFlow saved model from the link.
The file is too big to upload to this page.

My GPU is a Titan RTX (24 GB), and NVIDIA's TensorFlow 2.0 Docker container is used to test TF-TRT.
This object detection program is used.

Did you find any issue with my model?

Any issue with my model?

Hello @edit_or, the team is looking into this.
We will update you soon.
Thanks for your patience.