How to parse dynamic size tensor output meta

Hi!

I want to parse the output metadata from my custom model. Currently, for a fixed output size, in the pgie probe I have:
SHAPE = (10, 4)
# Cast the raw layer buffer to a float pointer and view it as a NumPy array
Ptr = ctypes.cast(pyds.get_ptr(Layer.buffer), ctypes.POINTER(ctypes.c_float))
Data = np.ctypeslib.as_array(Ptr, shape=SHAPE)

What if I now have a dynamic output size for the first dimension? How would I figure out the size of the output metadata at runtime? I’ve tried using -1 in the shape, like SHAPE = (-1, 4), but NumPy doesn’t accept it.

Thanks!

Hi,

Since DeepStream uses TensorRT as the backend, could you first verify that your model with a dynamic output is supported by TensorRT?
You can test this with the trtexec binary located at /usr/src/tensorrt/bin, for example:
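A rough sketch of that check (the ONNX file name and the shape values are placeholders, and it assumes you can export an ONNX version of your model):

./trtexec --onnx=model.onnx \
          --minShapes=input:1x512x512x3 \
          --optShapes=input:8x512x512x3 \
          --maxShapes=input:128x512x512x3

If trtexec can build and run an engine with dynamic shapes like these, the model itself should be fine on the TensorRT side.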

Thanks.

Hi @AastaLLL

From what I read about trtexec, I must provide an ONNX model as input.
Currently, we are using TensorFlow models with Triton Server.
We have saved_model.pb files in the Triton model repository folder structure.

Trying to solve my problem, I tried adding -1 as the first dimension in the Triton config.pbtxt.
My actual Triton config is much more complex: it uses an ensemble of models with pre/post-processing.
The variable batch size comes from the post-processing algorithms.
The simplified config.pbtxt for the last model in the ensemble would look something like this:

name: "model_name"
platform: "tensorflow_savedmodel"
max_batch_size: 128
input [
  {
    name: "input"
    data_type: TYPE_UINT8
    # Here, I add -1 to create a variable batch size
    dims: [ -1, 512, 512, 3 ]
  }
]
output [
  {
    name: "boxes"
    data_type: TYPE_FP32
    # Here, I add -1 to create a variable batch size
    dims: [ -1, 4 ]
  }
]

I think this may work: Triton did not report any errors, but I got a segfault when casting the output tensor metadata.

How do you know, at runtime, the shape of the tensor metadata to cast when parsing it in a probe callback function?

Thanks!

Hi,

The previous suggestion is for using native TensorRT.
Using the Triton server is different.

Would you mind sharing your model and configuration with us, so we can test it directly in our environment?

Thanks.

Hi @AastaLLL!
Sorry, I can’t share the model. It is from my company.

I think we should be able to discuss this without the model.
Just suppose it’s a custom post-processing model running in Triton.
The output node is called “output” and its dimensions are [X, 100, 100],
where X is a dynamic value.
In the case where X is a constant, I know how to cast it; this works:

SHAPE = (1, 100, 100)
Ptr = ctypes.cast(pyds.get_ptr(Layer.buffer), ctypes.POINTER(ctypes.c_float))
Data = np.ctypeslib.as_array(Ptr, shape=SHAPE)
# Now I have my tensor metadata accessible in a NumPy array!

What if SHAPE = (X, 100, 100)?
Does DeepStream provide a mechanism to know the value of X on each inference?

Hi,

You can try to get the tensor output dimension via pyds.NvDsInferLayerInfo.
https://docs.nvidia.com/metropolis/deepstream/python-api/PYTHON_API/NvDsInfer/NvDsInferLayerInfoDoc.html

For example, set X to the maximal possible value in the config, and then use the runtime shape reported by pyds.NvDsInferLayerInfo.dims to decide which part of the data is valid.
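A minimal sketch of how that could look in the probe, assuming the layer holds FP32 data and that the dims property exposes numDims and d at runtime (the function name parse_dynamic_layer and the final copy are just for illustration):

import ctypes
import numpy as np
import pyds

def parse_dynamic_layer(layer):
    # layer is a pyds.NvDsInferLayerInfo, e.g. obtained with pyds.get_nvds_LayerInfo(tensor_meta, i)
    dims = layer.dims                                      # pyds.NvDsInferDims
    shape = tuple(dims.d[i] for i in range(dims.numDims))  # e.g. (X, 100, 100) at runtime
    ptr = ctypes.cast(pyds.get_ptr(layer.buffer), ctypes.POINTER(ctypes.c_float))
    data = np.ctypeslib.as_array(ptr, shape=shape)
    # Copy out of the DeepStream-owned buffer before the probe returns
    return np.array(data, copy=True)

If the reported shape always comes back padded to the maximum X, you would instead slice data[:valid_count] using whatever count your post-processing step exposes.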

Thanks.
