Triton ensemble model outputs wrong values when retrieving results from the buffer through the Python bindings

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) GPU
• DeepStream Version 6.2
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only) 525.85.12
• Issue Type( questions, new requirements, bugs) bug
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)

I have a custom ensemble model on Triton. I am trying to write a custom parser like the one in the deepstream_ssd_parser example. My ensemble model outputs two layers of shapes [-1, 7] and [-1], where -1 is the number of detected objects, which depends on the count produced during postprocessing; it is usually around [30, 7].

When running DeepStream with Triton Inference Server, I can confirm that Triton is returning the correct values by printing the output from my Triton Python model script.

However, when I try to read the same output from the output tensor meta in DeepStream through the Python bindings function pyds.get_detections(), I get very high/low values that do not match my Triton model output. The wrong values are similar to the numbers here:

Wrong values in model output using gRPC. In that topic it was mentioned that this is a bug to be fixed in the next release.
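For anyone hitting the same symptom: if a parser reads the output buffer assuming FP32 while the model actually returned FP16, the raw bytes get reinterpreted and the numbers come out as exactly this kind of extreme garbage. A minimal NumPy sketch of that mismatch (pure illustration, not DeepStream code):

```python
import numpy as np

# Values the model would return from a detection layer, stored as FP16.
fp16_out = np.array([0.5, 12.25, -3.0, 100.0], dtype=np.float16)

# A parser that assumes FP32 reinterprets the raw bytes: each FP32 slot
# now packs two FP16 values, producing very high/low nonsense numbers.
misread = np.frombuffer(fp16_out.tobytes(), dtype=np.float32)

print(misread)  # extreme magnitudes, nothing like the original values
```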

• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

What is the data type of the two output layers of your model?

Hi Fiona, thanks for getting back to me.

One layer is FP16, the other is UINT16. I can get the correct UINT16 values from DeepStream (verified to match the Triton output), but the FP16 layer gives erroneous values.

I solved the problem by changing the output layer data type to FP32 instead of FP16. Thanks for pointing me in that direction.

It is unexpected because in my custom Triton ensemble postprocessing, I was working with NumPy and ensured the array was FP16 by calling .astype() right before sending it as the inference output.
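For reference, the workaround amounts to casting the postprocess result to FP32 right before returning it as the inference output, so the buffer layout matches what the parser expects. A minimal NumPy sketch (the shape and variable names are illustrative, not from the actual model):

```python
import numpy as np

# Hypothetical postprocess result: 30 detections x 7 fields, computed in FP16.
detections = np.random.rand(30, 7).astype(np.float16)

# Workaround: cast to FP32 before handing the tensor back as inference output.
detections_fp32 = detections.astype(np.float32)

# Round-tripping the raw bytes as FP32 now preserves every value, which is
# effectively what a parser reading the buffer as FP32 does on the other end.
roundtrip = np.frombuffer(detections_fp32.tobytes(),
                          dtype=np.float32).reshape(30, 7)
assert np.array_equal(roundtrip, detections_fp32)
```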

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.