Triton Inference Server TensorFlow model: configuration expects 2 inputs, model provides 1

• Hardware Platform (Jetson / GPU): NVIDIA GeForce GTX 1650
• DeepStream Version: N/A
• JetPack Version (valid for Jetson only): N/A
• TensorRT Version: TensorRT 8.2.5.1
• NVIDIA GPU Driver Version (valid for GPU only): NVIDIA driver 515
• Issue Type (questions, new requirements, bugs): Question
• How to reproduce the issue? (This is for bugs. Include which sample app is used, the configuration file contents, the command line used, and other details for reproducing.)
• Requirement details (This is for new requirements. Include the module name, i.e. which plugin or which sample application, and the function description.)

Hello there, I'm using Triton Inference Server version 22.07.
Currently I'm using a TensorFlow SavedModel that requires only 1 input.

Thus my configuration looks as follows:

name: "simple-tensorflow-model"
platform: "tensorflow_savedmodel"
backend: "tensorflow"
max_batch_size: 32
input {
    name: "input_0"
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [ 3, 640, 640 ]
  }
output [
  {
    name: "conf"
    data_type: TYPE_FP32
    dims: [ 16800, 2 ]
  },
  {
    name: "bbox"
    data_type: TYPE_FP32
    dims: [ 16800, 4 ]
  },
  {
    name: "landmarks"
    data_type: TYPE_FP32
    dims: [ 16800, 10 ]
  }
]

The Triton Inference Server outputs the following error:

E0816 22:49:44.267721 1 model_repository_manager.cc:1355] failed to load 'simple-tensorflow-model' version 1: Invalid argument: unable to load model 'simple-tensorflow-model', configuration expects 2 inputs, model provides 1

As stated, it says my configuration expects 2 inputs, which is not the case, as seen above.

Please provide the whole configuration file and the terminal logs. From the error, tritonserver found that the configuration expects 2 inputs.
config.pbtxt (407 Bytes)
Please refer to this config; the input should be declared like this:
input [
{

}
]
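
For example, a filled-in input stanza for a single-input SavedModel could look like the following (just a sketch; the tensor name, data type, and dims must match your model's serving signature):

input [
  {
    name: "input_image"
    data_type: TYPE_FP32
    dims: [ -1, -1, -1, 3 ]
  }
]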

Hello @fanzh,
Many thanks, that solved the configuration problem. But I'm facing a new problem now, and I'm not sure whether it's model related or a problem with the Triton TensorFlow backend.

The model is expected to output a [-1, 16] tensor; the -1 is dynamic and determined by the number of objects detected.

For example, one test image produces a tensor of size [25, 16], meaning the model detected 25 faces.

However, when using Triton, the model always outputs only 1 detection, [1, 16].
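
To double-check the expected behavior outside Triton, a minimal standalone check like the one below runs the exported SavedModel directly with TensorFlow and prints the output shape (a sketch; the SavedModel path, the input/output tensor names, and the pre-processing mirror my setup and should be adjusted as needed):

import cv2
import numpy as np
import tensorflow as tf

# Load the exported SavedModel and grab its default serving signature.
loaded = tf.saved_model.load("./retinaface-tf/1/model.savedmodel")
infer = loaded.signatures["serving_default"]

# Minimal pre-processing, mirroring the Triton client below:
# read BGR image, convert to float32 RGB, resize, add a batch dimension.
img = cv2.imread("./data/0_Parade_marchingband_1_149.jpg")
img = np.float32(img)
img = cv2.resize(img, (640, 640), interpolation=cv2.INTER_LINEAR)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
batch = img[np.newaxis, ...]

# The restored signature takes keyword arguments named after its inputs
# and returns a dict keyed by the output tensor names.
out = infer(input_image=tf.constant(batch))["tf_op_layer_GatherV2"]
print(out.shape)  # expected (N, 16), one row per detected face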

Here’s the configuration file I’m using now:

name: "retinaface-tf"
platform: "tensorflow_savedmodel"
backend: "tensorflow"
max_batch_size : 32
input [
  {
    name: "input_image"
    data_type: TYPE_FP32
#    format: FORMAT_NCHW
    dims: [-1,-1,-1,3]
  }
]
output [
  {
    name: "tf_op_layer_GatherV2"
    data_type: TYPE_FP32
    dims: [ -1 , 16 ]
  }
]
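
To verify that the tensor names, dtypes, and shapes in this config match what the SavedModel actually exposes, the serving signature can be inspected directly (a sketch; the path is an assumption based on my repository layout):

import tensorflow as tf

loaded = tf.saved_model.load("./retinaface-tf/1/model.savedmodel")
sig = loaded.signatures["serving_default"]

# Input names, dtypes, and shapes of the serving signature.
print(sig.structured_input_signature)
# Output names, dtypes, and shapes.
print(sig.structured_outputs)

One thing I'm not sure about: with max_batch_size greater than 0, the Triton model configuration docs describe dims as the per-sample shape without the batch dimension, so this input might normally be written as dims: [-1,-1,3].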

Here’s the script I’m using to run the client:

import tritonclient.http as tritonhttpclient
import numpy as np
from PIL import Image
from absl import app, flags, logging
from absl.flags import FLAGS
import cv2
import os
import tensorflow as tf
import time

from modules.models import RetinaFaceModel
from modules.utils import (set_memory_growth, load_yaml, draw_bbox_landm,
                           pad_input_image, recover_pad_output)

# Triton: ===============
VERBOSE = False
input_name = 'input_image'
input_shape = (1, 640, 640, 3) # (-1,-1,-1,3)   
input_dtype = 'FP32'
#output_names = ["conf","bbox", "landmarks"]
output_name = "tf_op_layer_GatherV2"
model_name = 'retinaface-tf'
url = '0.0.0.0:8000'
model_version = '1'


# Model ===============

cfg = load_yaml("./configs/retinaface_res50.yaml")

# define network
model = RetinaFaceModel(cfg, training=False, iou_th=0.4,
                        score_th=0.5)

# load checkpoint
checkpoint_dir = './checkpoints/' + cfg['sub_name']
checkpoint = tf.train.Checkpoint(model=model)
if tf.train.latest_checkpoint(checkpoint_dir):
    checkpoint.restore(tf.train.latest_checkpoint(checkpoint_dir))
    print("[*] load ckpt from {}.".format(
        tf.train.latest_checkpoint(checkpoint_dir)))
else:
    print("[*] Cannot find ckpt from {}.".format(checkpoint_dir))
    exit()


# Image ===============

#set_memory_growth()
img_raw = cv2.imread('./data/0_Parade_marchingband_1_149.jpg')
img_height_raw, img_width_raw, _ = img_raw.shape
img = np.float32(img_raw.copy())

img = cv2.resize(img, (640, 640), interpolation=cv2.INTER_LINEAR)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# pad input image to avoid unmatched shape problem
img, pad_params = pad_input_image(img, max_steps=max(cfg['steps']))

#img = img.reshape([3, 640, 640])
print(np.shape(img))
print(np.shape(img[np.newaxis, ...]))

# run model
image_numpy = img[np.newaxis, ...]

triton_client = tritonhttpclient.InferenceServerClient(url=url, verbose=VERBOSE)
model_metadata = triton_client.get_model_metadata(model_name=model_name, model_version=model_version)
model_config = triton_client.get_model_config(model_name=model_name, model_version=model_version)
print(model_config)



input0 = tritonhttpclient.InferInput(input_name, input_shape, input_dtype)
input0.set_data_from_numpy(image_numpy, binary_data=False)

# outputs = []
# for output_name in output_names:
#     outputs.append(tritonhttpclient.InferRequestedOutput(output_name, binary_data=True))

# response = triton_client.infer(model_name, model_version=model_version, 
#                                inputs=[input0], outputs=outputs)


output = tritonhttpclient.InferRequestedOutput(output_name, binary_data=False)
response = triton_client.infer(model_name, model_version=model_version, 
                               inputs=[input0], outputs=[output])
logits = response.as_numpy(output_name)
logits = np.asarray(logits, dtype=np.float32)
print(logits.shape)

print(response)
#logits = response.as_numpy(output_name)
#print(logits)
#print(np.shape(logits))

#logits = np.asarray(logits, dtype=np.float32)
#print(logits.shape)
#print(logits)


# recover padding effect
outputs = recover_pad_output(logits, pad_params)
print(outputs.shape)

# draw and save results
for prior_index in range(len(outputs)):
    draw_bbox_landm(img_raw, outputs[prior_index], img_height_raw,
                    img_width_raw)
cv2.imwrite("./outputs/image.jpg", img_raw)  # write once, after all boxes are drawn
print("[*] save result at ./outputs/image.jpg")

Here’s my model structure:

.
├── retinaface-tf
│   ├── 1
│   │   ├── config.pbtxt
│   │   └── model.savedmodel
│   │       ├── assets
│   │       ├── saved_model.pb
│   │       └── variables
│   │           ├── variables.data-00000-of-00001
│   │           └── variables.index
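
For comparison, the layout described in the Triton model repository documentation keeps config.pbtxt at the model level, next to the version directory, rather than inside it; a sketch of that structure:

.
├── retinaface-tf
│   ├── config.pbtxt
│   └── 1
│       └── model.savedmodel
│           ├── assets
│           ├── saved_model.pb
│           └── variables
│               ├── variables.data-00000-of-00001
│               └── variables.index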

Here’s the model link uploaded to google drive: retinaface-tf - Google Drive

When performing a health check on the model, here's the output:

yousef@yousef-Dell-G15-5510:~/Desktop/Triton-models/retinaface-tf$ curl -v 0.0.0.0:8000/v2/models/retinaface-tf
*   Trying 0.0.0.0:8000...
* TCP_NODELAY set
* Connected to 0.0.0.0 (127.0.0.1) port 8000 (#0)
> GET /v2/models/retinaface-tf HTTP/1.1
> Host: 0.0.0.0:8000
> User-Agent: curl/7.68.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Type: application/json
< Content-Length: 227
< 
* Connection #0 to host 0.0.0.0 left intact
{"name":"retinaface-tf","versions":["1"],"platform":"tensorflow_savedmodel","inputs":[{"name":"input_image","datatype":"FP32","shape":[-1,-1,-1,3]}],"outputs":[{"name":"tf_op_layer_GatherV2","datatype":"FP32","shape":[-1,16]}]}

Kindly advise, thank you.

1. Why is there "dims: [-1,-1,-1,3]"?
2. Please check your post-processing part; you can print the size of the outputs.

1. The model takes its input in NHWC format, which means a dynamic batch size, dynamic height and width, and 3 channels.

2. This print is already included in the provided inference script above:

output = tritonhttpclient.InferRequestedOutput(output_name, binary_data=False)
response = triton_client.infer(model_name, model_version=model_version, 
                               inputs=[input0], outputs=[output])
logits = response.as_numpy(output_name)
logits = np.asarray(logits, dtype=np.float32)
print(logits.shape)

Note that this print happens before any post-processing is applied to the model outputs.

Kindly check the entire script attached above again.

  1. Did you succeed in running your model with another inference tool? Please make sure your model can work.
  2. Do you mean the inference outputs one face before post-processing? If yes, you need to check the data fed to the model, for example: compare the model input data between your code and the other inference tool.

  1. Yes, inference using TensorFlow is provided above (image and tensor output size).
  2. The input image, pre-processing, and everything else is exactly the same (a quick way to verify this is sketched below).

Can you maybe try to reproduce? I’ve provided the model and the scripts above.
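
In the meantime, to follow the suggestion of comparing the model input data between the two paths, I can dump the tensor right before inference in each script and compare the dumps offline; a minimal sketch (the .npy file names are placeholders I made up):

import numpy as np

# In each script, right before running inference, dump the exact tensor
# being fed to the model, e.g.:
#   np.save("triton_input.npy", image_numpy)   # in the Triton client
#   np.save("tf_input.npy", image_numpy)       # in the plain TensorFlow script

# Then compare the two dumps.
a = np.load("triton_input.npy")
b = np.load("tf_input.npy")
print(a.shape, b.shape)
print(np.allclose(a, b))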

Any further update? Is this still an issue to support? Thanks.

Using the sample code with yours, from the tritonserver log there was only one detection; please check the model input data:
I0903 10:08:28.037014 19002 infer_response.cc:166] add response output: output: tf_op_layer_GatherV2, type: FP32, shape: [1,16].

There has been no update from you for a while, so we are assuming this is no longer an issue. Hence we are closing this topic. If you need further support, please open a new one.
Thanks

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.