Shape Error With External Triton Server Using gRPC

• Hardware Platform (GPU)
• DeepStream Version: 6.0/6.1
• TensorRT Version: 8.2.5-1
• NVIDIA GPU Driver Version: 515.43.04 / CUDA 11.7
• Issue Type: bug
• How to reproduce the issue? Use an external Triton server for inference.

This has been tested on multiple CUDA/driver versions.

As far as I can tell, the bug we are facing is triggered by switching to the gRPC URL for an external Triton server. It only happens on models that have a reshape in config.pbtxt. The reshape adds an extra dimension that the Triton server requires; without it, the server won't accept the config. I believe this then confuses the DeepStream Triton client: it preprocesses the input to match the reshaped four-dimensional shape, while the model only expects three input dimensions.
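For context, the input section of the config.pbtxt in question looks roughly like this (INPUT__0, the dims, and the reshape clause come from the errors and config excerpts in this thread; TYPE_FP32 is an assumption):

```
input [
  {
    name: "INPUT__0"
    data_type: TYPE_FP32   # assumed; not confirmed in the thread
    format: FORMAT_NCHW
    dims: [ 3, 448, 448 ]
    # Extra leading dimension needed to satisfy the Triton server:
    reshape { shape: [ 1, 3, 448, 448 ] }
  }
]
```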

Classifier error:

ERROR: infer_grpc_client.cpp:342 inference failed with error: unexpected shape for input 'INPUT__0' for model 'jigsaw_contraction'. Expected [3,448,448], got [1,3,448,448]

Detector error:

python3: infer_cuda_utils.cpp:86: nvdsinferserver::CudaTensorBuf::CudaTensorBuf(const nvdsinferserver::InferDims&, nvdsinferserver::InferDataType, int, const string&, nvdsinferserver::InferMemType, int, bool): Assertion `!hasWildcard(dims)' failed.

This error does not appear with the default Triton server set-up inside DeepStream.

The error also does not appear when I use my own client to communicate with the server and set the shape myself.
For example:

service_pb2.ModelInferRequest().InferInputTensor().shape.extend([3, 448, 448])
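The difference can be reproduced without a running Triton server. Below is a minimal sketch of the shape check the server applies (shape_matches is a hypothetical helper for illustration, not Triton API; the shapes are the ones from the error message above):

```python
def shape_matches(config_dims, request_shape):
    """Return True if the request's tensor shape equals the configured dims."""
    return list(config_dims) == list(request_shape)

config_dims = [3, 448, 448]            # dims declared in config.pbtxt
deepstream_shape = [1, 3, 448, 448]    # what the DeepStream gRPC client sends
manual_shape = [3, 448, 448]           # what my hand-written client sends

print(shape_matches(config_dims, deepstream_shape))  # False -> "unexpected shape" error
print(shape_matches(config_dims, manual_shape))      # True  -> request accepted
```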

Can you provide your model, triton server configurations and deepstream nvinferserver configurations for us to reproduce the problem?

Apologies for the delay; I had to double-check that I could share these. The model is too large to upload here, so hopefully Google Drive is fine with you.

pbtxt: config.pbtxt (530 Bytes)
inferserver config: jigsaw_contraction_inferserver.txt (857 Bytes)
post processor: libnvdsinfer_custom_impl_act.so (46.8 KB)
model: model.onnx - Google Drive
example python script that works: triton_request.py (1.3 KB)

For the external Triton server I launch it with the docker command inside the model repo dir:

docker run --gpus=1 --rm --net=host -v $(pwd):/models nvcr.io/nvidia/tritonserver:22.04-py3 tritonserver --model-repository=/models

Hi @MadisonL,
I tested your model and config file with the gst-launch-1.0 command, and it works well in my environment (DeepStream 6.1).
You can refer to the Gst-nvinferserver documentation; we suggest you use the NGC container 22.03. Use the command below to start the server:

docker run --gpus all -it --rm --shm-size=1g --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -p 8880:8000 -p 8881:8001 -p 8882:8002 -v $(pwd):/models nvcr.io/nvidia/tritonserver:22.03-py3 tritonserver --model-repository=/models --strict-model-config=false --grpc-infer-allocation-pool-size=16 --log-verbose=1

You can also test it with the command below in your DeepStream 6.1:

gst-launch-1.0 uridecodebin uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 ! mux.sink_0 nvstreammux name=mux width=1920 height=1080 batch-size=1 ! nvinferserver config-file-path=./jigsaw_contraction_inferserver.txt ! nvvideoconvert ! fakesink

Thanks for the idea to try the CLI; unfortunately, I still run into the same issue when I launch the model as an SGIE.

gst-launch-1.0 uridecodebin uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 ! mux.sink_0 nvstreammux name=mux width=1920 height=1080 batch-size=1 ! nvinfer config-file-path=/opt/nvidia/deepstream/deepstream-6.1/samples/configs/deepstream-app/config_infer_primary.txt ! nvinferserver config-file-path=./config/jigsaw_contraction_inferserver.txt ! nvvideoconvert ! fakesink
ERROR: infer_grpc_client.cpp:342 inference failed with error: unexpected shape for input 'INPUT__0' for model 'jigsaw_contraction'. Expected [3,448,448], got [1,3,448,448]

Any ideas?

Can you remove reshape { shape: [ 1, 3, 448, 448 ] } from your config.pbtxt file?

If I do that, the Triton server fails to launch:

model_repository_manager.cc:1234] failed to load 'jigsaw_contraction' version 1: Invalid argument: model 'jigsaw_contraction', tensor 'INPUT__0': the model expects 4 dimensions (shape [1,3,448,448]) but the model configuration specifies 3 dimensions (shape [3,448,448])

Just to reiterate: the model with this configuration works completely fine with the internal DeepStream Triton server, i.e. with

      model_repo {
        root: "/path/to/repo"
        strict_model_config: true
        tf_gpu_memory_fraction: 0.0
        tf_disable_soft_placement: 0
      }

The reshape error only presents itself with the external server.

      grpc {
        url: "0.0.0.0:8001"
      }

I have found a solution for this specific issue.

It involves removing format: FORMAT_NCHW from the config.pbtxt and changing the dims to dims: [ 1, 3, 448, 448 ].

As long as I specify tensor_order: TENSOR_ORDER_LINEAR in the infer config, the model loads fine and inference is no longer causing a shape error.
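For anyone hitting the same error, a sketch of the adjusted configuration based on the changes described above (TYPE_FP32 and the surrounding fields are assumptions; only the dims change, the removed format line, and tensor_order are confirmed):

```
# config.pbtxt -- input section
input [
  {
    name: "INPUT__0"
    data_type: TYPE_FP32        # assumed
    # format: FORMAT_NCHW removed
    dims: [ 1, 3, 448, 448 ]    # leading dim folded into dims; no reshape clause
  }
]

# nvinferserver config -- preprocess section
preprocess {
  tensor_order: TENSOR_ORDER_LINEAR
  ...
}
```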

Thank you for sharing the solution!