Concat op dimension mismatch causing inference failure in a TensorFlow-TensorRT (TF-TRT) inference pipeline

2024-02-08T18:30:00Z

Library details

Nvidia Jetpack version
nvidia-jetpack-dev/stable,now 5.1-b147 arm64

tensorflow version
2.9.1+nv22.7

tensorrt version
tensorrt/stable,now 8.5.2.2-1+cuda11.4 arm64 [installed,automatic]

CUDA version
nvidia-cuda/stable,now 5.1-b147 arm64 [installed]
nvidia-l4t-cuda/stable,now 35.2.1-20230124153320 arm64 [installed]

Model used:

yolov5s exported to the saved_model format from the Ultralytics yolov5 repo

Code used to convert the saved_model to TF-TRT:

from tensorflow.python.compiler.tensorrt import trt_convert as tf_trt
from tensorflow.python.saved_model import tag_constants
import tensorflow as tf
import tensorrt as trt

# Map precision strings to TF-TRT precision modes. This dict is used in
# convert() below but was missing from the snippet as posted.
precision_dict = {
    "FP32": tf_trt.TrtPrecisionMode.FP32,
    "FP16": tf_trt.TrtPrecisionMode.FP16,
    "INT8": tf_trt.TrtPrecisionMode.INT8,
}

class OptimizedModel():
    def __init__(self, saved_model_dir = None):
        self.loaded_model_fn = None
        
        if saved_model_dir is not None:
            self.load_model(saved_model_dir)
            
    
    def predict(self, input_data):
        if self.loaded_model_fn is None:
            raise Exception("Haven't loaded a model")
        x = tf.constant(input_data.astype('float32'))
        labeling = self.loaded_model_fn(x)
        # The output key depends on how the model was exported: try the common
        # names first, then fall back to the first key in the output dict.
        for key in ('predictions', 'probs'):
            if key in labeling:
                return labeling[key].numpy()
        try:
            return labeling[next(iter(labeling.keys()))]
        except Exception:
            raise Exception("Failed to get predictions from saved model object")
    
    def load_model(self, saved_model_dir):
        saved_model_loaded = tf.saved_model.load(saved_model_dir, tags=[tag_constants.SERVING])
        wrapper_fp32 = saved_model_loaded.signatures['serving_default']
        
        self.loaded_model_fn = wrapper_fp32


class ModelOptimizer():
    def __init__(self, input_saved_model_dir, calibration_data=None):
        self.input_saved_model_dir = input_saved_model_dir
        self.calibration_data = None
        self.loaded_model = None
        
        if calibration_data is not None:
            self.set_calibration_data(calibration_data)
        
        
    def set_calibration_data(self, calibration_data):
        
        def calibration_input_fn():
            yield (tf.constant(calibration_data.astype('float32')), )
        
        self.calibration_data = calibration_input_fn
        
        
    def convert(self, output_saved_model_dir, precision="FP32", max_workspace_size_bytes=8000000000, **kwargs):

        if precision == "INT8" and self.calibration_data is None:
            raise Exception("No calibration data set!")

        trt_precision = precision_dict[precision]
        conversion_params = tf_trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
            precision_mode=trt_precision,
            max_workspace_size_bytes=max_workspace_size_bytes,
            use_calibration=(precision == "INT8"))
        converter = tf_trt.TrtGraphConverterV2(
            input_saved_model_dir=self.input_saved_model_dir,
            conversion_params=conversion_params)
        
        if precision == "INT8":
            converter.convert(calibration_input_fn=self.calibration_data)
        else:
            converter.convert()
            
        converter.save(output_saved_model_dir=output_saved_model_dir)
        
        return OptimizedModel(output_saved_model_dir)
    
    def predict(self, input_data):
        # (body truncated in the original post)
        ...
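
For reference, an INT8 conversion with this helper would look roughly like the sketch below (the paths and the calibration array are hypothetical stand-ins; real calibration should use representative preprocessed images):

import numpy as np

# Hypothetical calibration batch (replace with real preprocessed images).
calib = np.random.rand(8, 640, 640, 3).astype('float32')

opt = ModelOptimizer('/path/to/yolov5s_saved_model', calibration_data=calib)
model_int8 = opt.convert('/path/to/yolov5s_saved_model_INT8', precision="INT8")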

Code that produced the error:

import numpy as np
BATCH_SIZE = 32
dummy_input_batch = np.zeros((BATCH_SIZE, 224, 224, 3))

PRECISION = "FP32" # Options are "FP32", "FP16", or "INT8"

from helper import ModelOptimizer # using the helper from <URL>

model_dir = '/media/$USER/9C33-6BBD/obejct_detection_tracking/yolov5s_saved_model'

opt_model = ModelOptimizer(model_dir)

model_fp32 = opt_model.convert(model_dir+'_FP32', precision=PRECISION)
# TF-TRT yields a TensorFlow graph with optimized TensorRT operations embedded
# in it, so we can run this graph with .predict() like any other TensorFlow model.
model_fp32.predict(dummy_input_batch)
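
To confirm which parts of the graph were actually converted, the saved TF-TRT model can be inspected for TRTEngineOp nodes (a minimal sketch; it assumes the converted model was saved to model_dir + '_FP32' as above):

import tensorflow as tf
from tensorflow.python.saved_model import tag_constants

# Load the converted SavedModel and walk its GraphDef, including the function
# library, where TF-TRT usually places the engine nodes.
loaded = tf.saved_model.load(model_dir + '_FP32', tags=[tag_constants.SERVING])
graph_def = loaded.signatures['serving_default'].graph.as_graph_def()

nodes = list(graph_def.node)
for func in graph_def.library.function:
    nodes.extend(func.node_def)

trt_nodes = [n.name for n in nodes if n.op == 'TRTEngineOp']
print(f"Found {len(trt_nodes)} TRTEngineOp node(s):", trt_nodes)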

Error message:

TypeError                                 Traceback (most recent call last)
File /usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py:1611, in ConcreteFunction._call_impl(self, args, kwargs, cancellation_manager)
   1610 try:
-> 1611   return self._call_with_structured_signature(args, kwargs,
   1612                                               cancellation_manager)
   1613 except TypeError as structured_err:

File /usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py:1689, in ConcreteFunction._call_with_structured_signature(self, args, kwargs, cancellation_manager)
   1687 args, kwargs, filtered_flat_args = (
   1688     self._function_spec.canonicalize_function_inputs(*args, **kwargs))
-> 1689 self._structured_signature_check_missing_args(args, kwargs)
   1690 self._structured_signature_check_unexpected_args(args, kwargs)

File /usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py:1708, in ConcreteFunction._structured_signature_check_missing_args(self, args, kwargs)
   1707 if missing_arguments:
-> 1708   raise TypeError(f"{self._structured_signature_summary()} missing "
   1709                   "required arguments: "
   1710                   f"{', '.join(sorted(missing_arguments))}.")

TypeError: signature_wrapper(*, x) missing required arguments: x.

During handling of the above exception, another exception occurred:

InvalidArgumentError                      Traceback (most recent call last)
/media/$USER/9C33-6BBD/obejct_detection_tracking/yolov8n_tf-tensorrt.py in line 2
      86 # %%
----> 87 model_fp32.predict(dummy_input_batch)

File /media/$USER/9C33-6BBD/obejct_detection_tracking/helper.py:45, in OptimizedModel.predict(self, input_data)
     43     raise(Exception("Haven't loaded a model"))
     44 x = tf.constant(input_data.astype('float32'))
---> 45 labeling = self.loaded_model_fn(x)
     46 try:
     47     preds = labeling['predictions'].numpy()

File /usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py:1602, in ConcreteFunction.__call__(self, *args, **kwargs)
   1552 def __call__(self, *args, **kwargs):
   1553   """Executes the wrapped function.
   1554 
   1555   ConcreteFunctions have two signatures:
   (...)
   1600     TypeError: If the arguments do not match the function's signature.
   1601   """
-> 1602   return self._call_impl(args, kwargs)

File /usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py:1615, in ConcreteFunction._call_impl(self, args, kwargs, cancellation_manager)
   1613 except TypeError as structured_err:
   1614   try:
-> 1615     return self._call_with_flat_signature(args, kwargs,
   1616                                           cancellation_manager)
   1617   except TypeError:
   1618     raise structured_err

File /usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py:1669, in ConcreteFunction._call_with_flat_signature(self, args, kwargs, cancellation_manager)
   1664   if not isinstance(
   1665       arg, (ops.Tensor, resource_variable_ops.BaseResourceVariable)):
   1666     raise TypeError(f"{self._flat_signature_summary()}: expected argument "
   1667                     f"#{i}(zero-based) to be a Tensor; "
   1668                     f"got {type(arg).__name__} ({arg}).")
-> 1669 return self._call_flat(args, self.captured_inputs, cancellation_manager)

File /usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/load.py:135, in _WrapperFunction._call_flat(self, args, captured_inputs, cancellation_manager)
    133 else:  # cross-replica context
    134   captured_inputs = list(map(get_unused_handle, captured_inputs))
--> 135 return super(_WrapperFunction, self)._call_flat(args, captured_inputs,
    136                                                 cancellation_manager)

File /usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py:1860, in ConcreteFunction._call_flat(self, args, captured_inputs, cancellation_manager)
   1856 possible_gradient_type = gradients_util.PossibleTapeGradientTypes(args)
   1857 if (possible_gradient_type == gradients_util.POSSIBLE_GRADIENT_TYPES_NONE
   1858     and executing_eagerly):
   1859   # No tape is watching; skip to running the function.
-> 1860   return self._build_call_outputs(self._inference_function.call(
   1861       ctx, args, cancellation_manager=cancellation_manager))
   1862 forward_backward = self._select_forward_and_backward_functions(
   1863     args,
   1864     possible_gradient_type,
   1865     executing_eagerly)
   1866 forward_function, args_with_tangents = forward_backward.forward()

File /usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py:497, in _EagerDefinedFunction.call(self, ctx, args, cancellation_manager)
    495 with _InterpolateFunctionError(self):
    496   if cancellation_manager is None:
--> 497     outputs = execute.execute(
    498         str(self.signature.name),
    499         num_outputs=self._num_outputs,
    500         inputs=args,
    501         attrs=attrs,
    502         ctx=ctx)
    503   else:
    504     outputs = execute.execute_with_cancellation(
    505         str(self.signature.name),
    506         num_outputs=self._num_outputs,
   (...)
    509         ctx=ctx,
    510         cancellation_manager=cancellation_manager)

File /usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/execute.py:54, in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     52 try:
     53   ctx.ensure_initialized()
---> 54   tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
     55                                       inputs, attrs, num_outputs)
     56 except core._NotOkStatusException as e:
     57   if name is not None:

InvalidArgumentError: Graph execution error:

Detected at node 'PartitionedCall/PartitionedCall/model/tf_concat/concat' defined at (most recent call last):
    File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
      return _run_code(code, main_globals, None,
    File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
      exec(code, run_globals)
    File "/home/$USER/.local/lib/python3.8/site-packages/ipykernel_launcher.py", line 17, in <module>
      app.launch_new_instance()
    File "/home/$USER/.local/lib/python3.8/site-packages/traitlets/config/application.py", line 1075, in launch_instance
      app.start()
    File "/home/$USER/.local/lib/python3.8/site-packages/ipykernel/kernelapp.py", line 739, in start
      self.io_loop.start()
    File "/home/$USER/.local/lib/python3.8/site-packages/tornado/platform/asyncio.py", line 205, in start
      self.asyncio_loop.run_forever()
    File "/usr/lib/python3.8/asyncio/base_events.py", line 570, in run_forever
      self._run_once()
    File "/usr/lib/python3.8/asyncio/base_events.py", line 1859, in _run_once
      handle._run()
    File "/usr/lib/python3.8/asyncio/events.py", line 81, in _run
      self._context.run(self._callback, *self._args)
    File "/home/$USER/.local/lib/python3.8/site-packages/ipykernel/kernelbase.py", line 542, in dispatch_queue
      await self.process_one()
    File "/home/$USER/.local/lib/python3.8/site-packages/ipykernel/kernelbase.py", line 531, in process_one
      await dispatch(*args)
    File "/home/$USER/.local/lib/python3.8/site-packages/ipykernel/kernelbase.py", line 437, in dispatch_shell
      await result
    File "/home/$USER/.local/lib/python3.8/site-packages/ipykernel/ipkernel.py", line 359, in execute_request
      await super().execute_request(stream, ident, parent)
    File "/home/$USER/.local/lib/python3.8/site-packages/ipykernel/kernelbase.py", line 775, in execute_request
      reply_content = await reply_content
    File "/home/$USER/.local/lib/python3.8/site-packages/ipykernel/ipkernel.py", line 446, in do_execute
      res = shell.run_cell(
    File "/tmp/ipykernel_21293/962221442.py", line 29, in wrapper
      result = old_func(*args, **kwargs)
    File "/home/$USER/.local/lib/python3.8/site-packages/ipykernel/zmqshell.py", line 549, in run_cell
      return super().run_cell(*args, **kwargs)
    File "/home/$USER/.local/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3009, in run_cell
      result = self._run_cell(
    File "/home/$USER/.local/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3064, in _run_cell
      result = runner(coro)
    File "/home/$USER/.local/lib/python3.8/site-packages/IPython/core/async_helpers.py", line 129, in _pseudo_sync_runner
      coro.send(None)
    File "/home/$USER/.local/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3269, in run_cell_async
      has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
    File "/home/$USER/.local/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3448, in run_ast_nodes
      if await self.run_code(code, result, async_=asy):
    File "/home/$USER/.local/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3508, in run_code
      exec(code_obj, self.user_global_ns, self.user_ns)
    File "<ipython-input-12-ec7d8c4960ad>", line 2, in <module>
      model_fp32 = opt_model.convert(model_dir+'_FP32', precision=PRECISION)
    File "/media/$USER/9C33-6BBD/obejct_detection_tracking/helper.py", line 101, in convert
      return OptimizedModel(output_saved_model_dir)
    File "/media/$USER/9C33-6BBD/obejct_detection_tracking/helper.py", line 38, in __init__
      self.load_model(saved_model_dir)
    File "/media/$USER/9C33-6BBD/obejct_detection_tracking/helper.py", line 59, in load_model
      saved_model_loaded = tf.saved_model.load(saved_model_dir, tags=[tag_constants.SERVING])
Node: 'PartitionedCall/PartitionedCall/model/tf_concat/concat'
ConcatOp : Dimension 1 in both shapes must be equal: shape[0] = [32,40,40,256] vs. shape[1] = [32,14,14,256]
	 [[{{node PartitionedCall/PartitionedCall/model/tf_concat/concat}}]]
	 [[PartitionedCall/TRTEngineOp_000_000]] [Op:__inference_signature_wrapper_4880]

Hi,

Which device do you use? Is it DRIVE or Jetson?
Based on the error:

ConcatOp : Dimension 1 in both shapes must be equal: shape[0] = [32,40,40,256] vs. shape[1] = [32,14,14,256]

It looks like your model tries to concatenate two tensors with different sizes.
Have you tried running inference on the model with other frameworks?
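
For example, you could run the original (unconverted) SavedModel directly in TensorFlow to check whether the mismatch comes from the model itself or from the TF-TRT conversion. A minimal sketch (the path and the 640x640 input size are assumptions, since yolov5 TF exports typically fix the image size at export time):

import numpy as np
import tensorflow as tf

# Load the original (unconverted) SavedModel and call its serving signature.
model = tf.saved_model.load('/path/to/yolov5s_saved_model')  # hypothetical path
infer = model.signatures['serving_default']

# Assumption: the export fixed a 640x640 NHWC input; check
# infer.structured_input_signature for the actual expected shape.
x = tf.constant(np.zeros((1, 640, 640, 3), dtype=np.float32))
outputs = infer(x)
print({name: tuple(t.shape) for name, t in outputs.items()})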

Thanks.

Jetson AGX Orin 64GB. Since the model is TensorRT-optimized, I was trying to run it with TensorFlow. Are there any other ways I can get this model running on the AGX Orin?

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.