Must specify a non-zero or non-empty correlation ID - Triton with sequence batching, TensorRT

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU): GPU
• DeepStream Version: 6.1.1
• TensorRT Version: 8.5.1
• NVIDIA GPU Driver Version (valid for GPU only): 520.61.05
• Issue Type (questions, new requirements, bugs): question
• How to reproduce the issue? (This is for bugs. Include which sample app is used, the configuration file contents, the command line, and other details for reproducing.)

I am running an inference pipeline in DeepStream that includes an ensemble model built from:

  1. Python model - preprocessor
  2. TensorRT RCNN model
  3. Python model - postprocessor

It is working as planned except for the state management of the RCNN. I read here:

It mentions using the following configuration in the model’s config file:


sequence_batching {
  state [
    {
      input_name: "PreviousState"  # the network's input layer for the state
      output_name: "leaky_re_lu_47"  # the state output from the network
      data_type: TYPE_FP16
      dims: [ -1 ]
      initial_state: {
        data_type: TYPE_FP16
        dims: [ 1 ]
        zero_data: true
        name: "initial state"
      }
    }
  ]
}

But when I run the pipeline I get the error:

ERROR: infer_trtis_server.cpp:259 Triton: TritonServer response error received., triton_err_str:Invalid argument, err_msg:in ensemble 'ensemble_python_smoke_16', inference request to model 'smoke_16' must specify a non-zero or non-empty correlation ID

Am I missing something in the config that causes this?
Please help!

Guy

The error message points you to the place to look. Please check the "ensemble_python_smoke_16" config.

Hello, and thanks for the answer.

The question is: what should I specify in the Python model's config file for it to generate sequence IDs?

Where and how did you configure smoke_16?

My complete config file for it:

name: "smoke_16"
platform: "tensorrt_plan"
max_batch_size: 0 #TODO: change according to dynamic batch size
default_model_filename: "model_smoke_16.onnx.engine"

input [
  {
    name: "Image"
    data_type: TYPE_FP16
    dims: [1, 512, 512, 6]
  },
  {
    name: "InitVector"
    data_type: TYPE_FP16
    dims: [1, 1, 1, 180]
  }
]

output [
  {
    name: "detection_machine_2"
    data_type: TYPE_FP16
    dims: [1, 256, 256, 1]
  }
]

instance_group [
  {
    kind: KIND_GPU
    count: 1
    gpus: 0
  }
]

sequence_batching {
  state [
    {
      input_name: "PreviousState"
      output_name: "leaky_re_lu_47"
      data_type: TYPE_FP16
      dims: [1, 128, 128, 180]

      initial_state: {
        data_type: TYPE_FP16
        dims: [1, 128, 128, 180]
        zero_data: true
        name: "initial state"
      }
    }
  ]
}


Where and how is the “ensemble_python_smoke_16” configured?

The full config file:

name: "ensemble_python_smoke_16"
platform: "ensemble"
max_batch_size: 0 #TODO: set to 256 when dynamic batcher is enabled
input 
[
  {
    name: "INPUT"
    data_type: TYPE_UINT8
    dims: [512, 512, 3]
  }
]
output 
[
  {
    name: "OUTPUT"
    data_type: TYPE_FP16
    dims: [1, 256, 256, 1]
  }
]

ensemble_scheduling 
{
  step 
  [
    {
      model_name: "preprocess_16"
      model_version: 1
      input_map 
      {
        key: "INPUT"
        value: "INPUT"
      }
      output_map
      {
        key: "Image"
        value: "preprocessed_image"
      }
      output_map 
      {
        key: "InitVector"
        value: "InitVector"
      }
    }
    ,
    {
      model_name: "smoke_16"
      model_version: 1

      input_map 
      {
        key: "Image"
        value: "preprocessed_image"
      }
      input_map
      {
        key: "InitVector"
        value: "InitVector"
      }
      output_map
      {
        key: "detection_machine_2"
        value: "OUTPUT_DETECTIONS"
      }
    }
    ,
    {
      model_name: "postprocess_16"
      model_version: 1
      input_map
      {
        key: "OUTPUT_DETECTIONS"
        value: "OUTPUT_DETECTIONS"
      }
      output_map
      {
        key: "OUTPUT"
        value: "OUTPUT"
      }
    }
  ]
}

Can you provide the models, config files and app to reproduce the failure?

I cannot attach the specific models, but the issue can easily be reproduced with any Triton DeepStream app - simply add the following to the model's config:

sequence_batching {
  state [
    {}
  ]
}

The error will then reproduce.

I have been researching this for a few days, and I think I can safely say that the problem is not on the Triton server side or in the model config, but on the DeepStream side, so let me rephrase the issue:

How can I send a correlation ID from the DeepStream Triton client side?
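
(For reference, outside of DeepStream a standalone Triton client can attach the correlation ID to every request. A minimal sketch, assuming the tritonclient gRPC Python API; the URL and the dummy input data are placeholders for this setup:)

import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="localhost:8001")

frame = grpcclient.InferInput("INPUT", [512, 512, 3], "UINT8")
frame.set_data_from_numpy(np.zeros((512, 512, 3), dtype=np.uint8))

# The sequence batcher groups requests by this non-zero ID and keeps the
# implicit state between requests that share it.
result = client.infer(
    model_name="ensemble_python_smoke_16",
    inputs=[frame],
    sequence_id=1001,     # the correlation ID; same value for every frame of the stream
    sequence_start=True,  # True only on the first request of the sequence
    sequence_end=False,
)

I could not find an equivalent setting in nvinferserver's config, which is exactly my problem.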

I am currently calling the Triton server using:

    pgie = Gst.ElementFactory.make("nvinferserver", "primary-inference")

And the nvinferserver config file is:

infer_config
{
  unique_id: 1
  gpu_ids: [0]
  max_batch_size: 1
  backend 
  {
    triton
    {
      model_name: "ensemble_python_smoke_16"
      version: -1
      model_repo 
      {
        root: "./triton_model_repo"
        log_level: 1
        strict_model_config: true
        # Triton runtime will reserve 64MB of pinned memory
        pinned_memory_pool_byte_size: 67108864
        # Triton runtime will reserve 64MB of CUDA device memory on GPU 0
        cuda_device_memory 
        {
          device: 0, memory_pool_byte_size: 67108864
        }
      }
    }
    output_mem_type: MEMORY_TYPE_CPU
  }
  
  preprocess
  {
    network_format: IMAGE_FORMAT_RGB
    tensor_order: TENSOR_ORDER_NHWC
    normalize
    {
      scale_factor: 1
    }
  }
}

output_control 
{
  output_tensor_meta: true
}

input_control 
{
  process_mode: PROCESS_MODE_FULL_FRAME
  interval: 0
}


Why is the input key of "postprocess_16" different from the output key of "smoke_16"? Can you set them to the same value and try?

Hello again.
Still the same error.

Can you get the log with “export GST_DEBUG=nvinferserver:7”?

Sure. I have tried many iterations, and I am now confident that the problem is not in my settings but in the lack of any way to attach a correlation ID from the DeepStream client request to Triton.
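
(One possible workaround I am considering - a sketch only, assuming the Triton Python-backend BLS API and that the Triton version in use supports a correlation ID on BLS requests; the model and tensor names follow this thread: drop smoke_16 from the ensemble and have the Python preprocessor call it directly via BLS, supplying the correlation ID itself.)

import triton_python_backend_utils as pb_utils

def call_smoke_16(image_tensor, init_tensor, corr_id, first_request):
    # Mark the first request of a stream so the sequence batcher initializes
    # the implicit state; later requests reuse the same non-zero corr_id.
    flags = (pb_utils.TRITONSERVER_REQUEST_FLAG_SEQUENCE_START
             if first_request else 0)
    request = pb_utils.InferenceRequest(
        model_name="smoke_16",
        requested_output_names=["detection_machine_2"],
        inputs=[image_tensor, init_tensor],
        correlation_id=corr_id,  # the non-zero correlation ID Triton is asking for
        flags=flags,
    )
    response = request.exec()
    if response.has_error():
        raise pb_utils.TritonModelException(response.error().message())
    return pb_utils.get_output_tensor_by_name(response, "detection_machine_2")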

correlationid_error.log (47.8 KB)

Can you send the complete log?

I did in the previous comment; posting again with even more details (GST_DEBUG=5 and GST_DEBUG=7), both with Triton log_level=7.

Thanks again
correlationid_error_debug_7.log (9.8 MB)
correlationid_error_gst_debug_5.log (4.9 MB)

Sorry for giving you incomplete instructions! We don't need the GST_DEBUG=7 log. Please use the following settings to get the log:
export GST_DEBUG=nvinferserver:7
export NVDSINFERSERVER_LOG_LEVEL=5

Sorry for the misunderstanding,

Attached here the requested log:

correlationid_error.log (47.9 KB)

Why is your log missing all of the nvinferserver output? Did you use "export NVDSINFERSERVER_LOG_LEVEL=5" before running your case?

Yes, I did - that is the whole log. It crashes after the first request to the model.