Triton Inference Server TensorFlow model: configuration expects 2 inputs, model provides 1

• Hardware Platform (Jetson / GPU): NVIDIA GeForce GTX 1650
• DeepStream Version: N/A
• JetPack Version (valid for Jetson only): N/A
• TensorRT Version: TensorRT 8.2.5.1
• NVIDIA GPU Driver Version (valid for GPU only): NVIDIA driver 515
• Issue Type (questions, new requirements, bugs): Question
• How to reproduce the issue? (This is for bugs. Include which sample app is used, the configuration file contents, the command line used, and other details for reproducing.)
• Requirement details (This is for new requirements. Include the module name, i.e. which plugin or which sample application, and the function description.)

Hello there, I'm using Triton Inference Server version 22.07.
Currently I'm using a TensorFlow SavedModel that requires only 1 input.

Thus my configuration looks as follows:

name: "simple-tensorflow-model"
platform: "tensorflow_savedmodel"
backend: "tensorflow"
max_batch_size: 32
input {
    name: "input_0"
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [ 3, 640, 640 ]
  }
output [
  {
    name: "conf"
    data_type: TYPE_FP32
    dims: [ 16800, 2 ]
  },
  {
    name: "bbox"
    data_type: TYPE_FP32
    dims: [ 16800, 4 ]
  },
  {
    name: "landmarks"
    data_type: TYPE_FP32
    dims: [ 16800, 10 ]
  }
]

The Triton Inference Server outputs the following error:

E0816 22:49:44.267721 1 model_repository_manager.cc:1355] failed to load 'simple-tensorflow-model' version 1: Invalid argument: unable to load model 'simple-tensorflow-model', configuration expects 2 inputs, model provides 1

As stated, it says my configuration expects 2 inputs, which is not the case, as seen above.

Please provide the whole configuration file and the terminal logs. From the error, tritonserver found that the configuration expects 2 inputs.
config.pbtxt (407 Bytes)
Please refer to this config; the input should be declared like this:
input [
{

}
]
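
For example, a filled-in input stanza for a single-input SavedModel could look like the following (just a sketch; the tensor name, data type, and dims must match your model's serving signature):

input [
  {
    name: "input_image"
    data_type: TYPE_FP32
    dims: [ -1, -1, -1, 3 ]
  }
]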

Hello @fanzh,
Many thanks, that solved the configuration problem. But I'm facing a new problem now, and I'm not sure whether it's model related or a problem with the Triton TensorFlow backend.

The model is expected to output a [-1, 16] tensor; the -1 is dynamic and determined by the number of objects detected.

For example, one test image produces a tensor of size [25, 16], meaning the model detected 25 faces.

However, when using Triton, the model always outputs only 1 detection, [1, 16].
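
To double-check the expected behavior outside Triton, a minimal standalone check like the one below runs the exported SavedModel directly with TensorFlow and prints the output shape (a sketch; the SavedModel path, the input/output tensor names, and the pre-processing mirror my setup and should be adjusted as needed):

import cv2
import numpy as np
import tensorflow as tf

# Load the exported SavedModel and grab its default serving signature.
loaded = tf.saved_model.load("./retinaface-tf/1/model.savedmodel")
infer = loaded.signatures["serving_default"]

# Minimal pre-processing, mirroring the Triton client below:
# read BGR image, convert to float32 RGB, resize, add a batch dimension.
img = cv2.imread("./data/0_Parade_marchingband_1_149.jpg")
img = np.float32(img)
img = cv2.resize(img, (640, 640), interpolation=cv2.INTER_LINEAR)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
batch = img[np.newaxis, ...]

# The restored signature takes keyword arguments named after its inputs
# and returns a dict keyed by the output tensor names.
out = infer(input_image=tf.constant(batch))["tf_op_layer_GatherV2"]
print(out.shape)  # expected (N, 16), one row per detected face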

Here’s the configuration file I’m using now:

name: "retinaface-tf"
platform: "tensorflow_savedmodel"
backend: "tensorflow"
max_batch_size : 32
input [
  {
    name: "input_image"
    data_type: TYPE_FP32
#    format: FORMAT_NCHW
    dims: [-1,-1,-1,3]
  }
]
output [
  {
    name: "tf_op_layer_GatherV2"
    data_type: TYPE_FP32
    dims: [ -1 , 16 ]
  }
]
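
To verify that the tensor names, dtypes, and shapes in this config match what the SavedModel actually exposes, the serving signature can be inspected directly (a sketch; the path is an assumption based on my repository layout):

import tensorflow as tf

loaded = tf.saved_model.load("./retinaface-tf/1/model.savedmodel")
sig = loaded.signatures["serving_default"]

# Input names, dtypes, and shapes of the serving signature.
print(sig.structured_input_signature)
# Output names, dtypes, and shapes.
print(sig.structured_outputs)

One thing I'm not sure about: with max_batch_size greater than 0, the Triton model configuration docs describe dims as the per-sample shape without the batch dimension, so this input might normally be written as dims: [-1,-1,3].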

Here’s the script I’m using to run the client:

import tritonclient.http as tritonhttpclient
import numpy as np
from PIL import Image
from absl import app, flags, logging
from absl.flags import FLAGS
import cv2
import os
import tensorflow as tf
import time

from modules.models import RetinaFaceModel
from modules.utils import (set_memory_growth, load_yaml, draw_bbox_landm,
                           pad_input_image, recover_pad_output)

# Triton: ===============
VERBOSE = False
input_name = 'input_image'
input_shape = (1, 640, 640, 3) # (-1,-1,-1,3)   
input_dtype = 'FP32'
#output_names = ["conf","bbox", "landmarks"]
output_name = "tf_op_layer_GatherV2"
model_name = 'retinaface-tf'
url = '0.0.0.0:8000'
model_version = '1'


# Model ===============

cfg = load_yaml("./configs/retinaface_res50.yaml")

# define network
model = RetinaFaceModel(cfg, training=False, iou_th=0.4,
                        score_th=0.5)

# load checkpoint
checkpoint_dir = './checkpoints/' + cfg['sub_name']
checkpoint = tf.train.Checkpoint(model=model)
if tf.train.latest_checkpoint(checkpoint_dir):
    checkpoint.restore(tf.train.latest_checkpoint(checkpoint_dir))
    print("[*] load ckpt from {}.".format(
        tf.train.latest_checkpoint(checkpoint_dir)))
else:
    print("[*] Cannot find ckpt from {}.".format(checkpoint_dir))
    exit()


# Image ===============

#set_memory_growth()
img_raw = cv2.imread('./data/0_Parade_marchingband_1_149.jpg')
img_height_raw, img_width_raw, _ = img_raw.shape
img = np.float32(img_raw.copy())

img = cv2.resize(img, (640, 640), interpolation=cv2.INTER_LINEAR)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# pad input image to avoid unmatched shape problem
img, pad_params = pad_input_image(img, max_steps=max(cfg['steps']))

#img = img.reshape([3, 640, 640])
print(np.shape(img))
print(np.shape(img[np.newaxis, ...]))

# run model
image_numpy = img[np.newaxis, ...]

triton_client = tritonhttpclient.InferenceServerClient(url=url, verbose=VERBOSE)
model_metadata = triton_client.get_model_metadata(model_name=model_name, model_version=model_version)
model_config = triton_client.get_model_config(model_name=model_name, model_version=model_version)
print(model_config)



input0 = tritonhttpclient.InferInput(input_name, input_shape, input_dtype)
input0.set_data_from_numpy(image_numpy, binary_data=False)

# outputs = []
# for output_name in output_names:
#     outputs.append(tritonhttpclient.InferRequestedOutput(output_name, binary_data=True))

# response = triton_client.infer(model_name, model_version=model_version, 
#                                inputs=[input0], outputs=outputs)


output = tritonhttpclient.InferRequestedOutput(output_name, binary_data=False)
response = triton_client.infer(model_name, model_version=model_version, 
                               inputs=[input0], outputs=[output])
logits = response.as_numpy(output_name)
logits = np.asarray(logits, dtype=np.float32)
print(logits.shape)

print(response)
#logits = response.as_numpy(output_name)
#print(logits)
#print(np.shape(logits))

#logits = np.asarray(logits, dtype=np.float32)
#print(logits.shape)
#print(logits)


# recover padding effect
outputs = recover_pad_output(logits, pad_params)
print(outputs.shape)

# draw and save results
for prior_index in range(len(outputs)):
    draw_bbox_landm(img_raw, outputs[prior_index], img_height_raw,
                    img_width_raw)
cv2.imwrite("./outputs/image.jpg", img_raw)  # write once, after all boxes are drawn
print("[*] save result at ./outputs/image.jpg")

Here’s my model structure:

.
├── retinaface-tf
│   ├── 1
│   │   ├── config.pbtxt
│   │   └── model.savedmodel
│   │       ├── assets
│   │       ├── saved_model.pb
│   │       └── variables
│   │           ├── variables.data-00000-of-00001
│   │           └── variables.index
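
For comparison, the layout described in the Triton model repository documentation keeps config.pbtxt at the model level, next to the version directory, rather than inside it; a sketch of that structure:

.
├── retinaface-tf
│   ├── config.pbtxt
│   └── 1
│       └── model.savedmodel
│           ├── assets
│           ├── saved_model.pb
│           └── variables
│               ├── variables.data-00000-of-00001
│               └── variables.index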

Here’s the model link uploaded to google drive: retinaface-tf - Google Drive

When performing a health check on the model, here's the output:

yousef@yousef-Dell-G15-5510:~/Desktop/Triton-models/retinaface-tf$ curl -v 0.0.0.0:8000/v2/models/retinaface-tf
*   Trying 0.0.0.0:8000...
* TCP_NODELAY set
* Connected to 0.0.0.0 (127.0.0.1) port 8000 (#0)
> GET /v2/models/retinaface-tf HTTP/1.1
> Host: 0.0.0.0:8000
> User-Agent: curl/7.68.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Type: application/json
< Content-Length: 227
< 
* Connection #0 to host 0.0.0.0 left intact
{"name":"retinaface-tf","versions":["1"],"platform":"tensorflow_savedmodel","inputs":[{"name":"input_image","datatype":"FP32","shape":[-1,-1,-1,3]}],"outputs":[{"name":"tf_op_layer_GatherV2","datatype":"FP32","shape":[-1,16]}]}

Kindly advise, thank you.

1. Why is there "dims: [-1,-1,-1,3]"?
2. Please check your post-processing part; you can print the size of the outputs.

1. The model takes its input in NHWC format, which means a dynamic batch size, dynamic height and width, and 3 channels.

2. This print is already included in the provided inference script above:

output = tritonhttpclient.InferRequestedOutput(output_name, binary_data=False)
response = triton_client.infer(model_name, model_version=model_version, 
                               inputs=[input0], outputs=[output])
logits = response.as_numpy(output_name)
logits = np.asarray(logits, dtype=np.float32)
print(logits.shape)

Note that this print happens before any post-processing is applied to the model outputs.

Kindly check the entire script attached above again.

  1. Did you succeed in running your model with another inference tool? Please make sure your model can work.
  2. Do you mean the inference outputs one face before post-processing? If yes, you need to check the data fed to the model, for example: compare the model input data between your code and the other inference tool.

  1. Yes, inference using TensorFlow is provided above (image and tensor output size).
  2. The input image, pre-processing, and everything else is exactly the same (a quick way to verify this is sketched below).

Can you maybe try to reproduce? I’ve provided the model and the scripts above.
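
In the meantime, to follow the suggestion of comparing the model input data between the two paths, I can dump the tensor right before inference in each script and compare the dumps offline; a minimal sketch (the .npy file names are placeholders I made up):

import numpy as np

# In each script, right before running inference, dump the exact tensor
# being fed to the model, e.g.:
#   np.save("triton_input.npy", image_numpy)   # in the Triton client
#   np.save("tf_input.npy", image_numpy)       # in the plain TensorFlow script

# Then compare the two dumps.
a = np.load("triton_input.npy")
b = np.load("tf_input.npy")
print(a.shape, b.shape)
print(np.allclose(a, b))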

Any further update? Is this still an issue to support? Thanks.

Using the sample code with yours, from the tritonserver log there was only one detection; please check the model input data:
I0903 10:08:28.037014 19002 infer_response.cc:166] add response output: output: tf_op_layer_GatherV2, type: FP32, shape: [1,16].

There has been no update from you for a while, so we are assuming this is no longer an issue. Hence we are closing this topic. If you need further support, please open a new one.
Thanks

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.