TRTIS 19.02 Issue with variable output layers

I have a tensorflow saved model that I am attempting to host in the TensorRT inference server.

Several of the inputs and outputs are variable size. Since it is a TensorFlow saved model, I attempted to use the --strict-model-config=false flag to automatically generate the config. It generated the following, replacing the variable dimensions with -1 (note that I have changed the input/output names to obfuscate the purpose of the model):

[ec2-user@SC-DEV-UE1-PERMANENT_PDFStructureWorker_i-0e176a5891893917d ~]$ curl localhost:8000/api/status
id: "inference:0"
version: "0.11.0"
uptime_ns: 154084243188
model_status {
  key: "test"
  value {
    config {
      name: "test"
      platform: "tensorflow_savedmodel"
      version_policy {
        latest {
          num_versions: 1
        }
      }
      max_batch_size: 1
      input {
        name: "input1"
        data_type: TYPE_FP32
        dims: -1
      }
      input {
        name: "input2"
        data_type: TYPE_FP32
        dims: -1
      }
      input {
        name: "input3"
        data_type: TYPE_FP32
        dims: -1
      }
      input {
        name: "input4"
        data_type: TYPE_FP32
        dims: -1
      }
      input {
        name: "input5"
        data_type: TYPE_FP32
        dims: -1
      }
      input {
        name: "input6"
        data_type: TYPE_FP32
        dims: -1
      }
      output {
        name: "output1"
        data_type: TYPE_INT64
      }
      output {
        name: "output2"
        data_type: TYPE_FP32
        dims: 7
      }
      output {
        name: "output3"
        data_type: TYPE_INT32
      }
      output {
        name: "output4"
        data_type: TYPE_INT32
      }
      output {
        name: "output5"
        data_type: TYPE_FP32
      }
      output {
        name: "output6"
        data_type: TYPE_INT32
      }
      output {
        name: "output7"
        data_type: TYPE_FP32
      }
      output {
        name: "output8"
        data_type: TYPE_INT32
      }
      output {
        name: "output9"
        data_type: TYPE_FP32
        dims: 4
      }
      instance_group {
        name: "test"
        count: 1
        gpus: 0
        kind: KIND_GPU
      }
      default_model_filename: "test.savedmodel"
    }
    version_status {
      key: 44
      value {
        ready_state: MODEL_UNAVAILABLE
      }
    }
  }
}

As you can see from the model status above, it could not determine the dimensions for all of the input and output layers.

So I created a config.pbtxt with the dimensions of the input and output layers carefully specified.
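For illustration, a hand-written config.pbtxt along these lines might look as follows (the tensor names are the obfuscated ones from the generated config above; the concrete dims values are placeholders, not my real model's shapes):

```
name: "test"
platform: "tensorflow_savedmodel"
max_batch_size: 1
input [
  {
    name: "input1"
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
output [
  {
    name: "output2"
    data_type: TYPE_FP32
    dims: [ 7 ]
  },
  {
    name: "output9"
    data_type: TYPE_FP32
    dims: [ -1, 4 ]
  }
]
```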

I tried this with the TensorRT Inference Server and I am getting this error:

I0313 15:01:01.556557 1 loader.cc:298] SavedModel load for tags { serve }; Status: success. Took 273490 microseconds.
E0313 15:01:01.573928 1 retrier.cc:37] Loading servable: {name: test version: 44} failed: Invalid argument: unable to load model 'test', output 'output1' dims [-1] don't match configuration dims [-1]
I0313 15:01:01.573977 1 loader_harness.cc:154] Encountered an error for servable version {name: test version: 44}: Invalid argument: unable to load model 'test', output 'output1' dims [-1] don't match configuration dims [-1]

If you look at the error, for "output1" the config.pbtxt says the dimensions are [-1], and the model reports dims of [-1], yet supposedly they don't match. Huh? What is going on here?

We are triaging and will keep you updated.

Thanks. Just in case, I am coming to GTC next week and will bring my model with me to the ask-the-experts sessions (TensorRT and TensorRT Inference Server).

After reading the docs, I answered my first question: dims: [-1] is indeed the correct way to represent variable-dimension inputs and outputs:

https://docs.nvidia.com/deeplearning/sdk/tensorrt-inference-server-guide/docs/model_configuration.html

"For models that support input and output tensors with variable-size dimensions, those dimensions can be listed as -1 in the input and output configuration. For example, if a model requires a 2-dimensional input tensor where the first dimension must be size 4 but the second dimension can be any size, the model configuration for that input would include dims: [ 4, -1 ]. The inference server would then accept inference requests where that input tensor’s second dimension was any value >= 1. "
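As a concrete config.pbtxt fragment, the example from the docs would read like this (the tensor name here is made up for illustration):

```
input [
  {
    name: "input_matrix"
    data_type: TYPE_FP32
    dims: [ 4, -1 ]
  }
]
```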

I still need help with this TRTIS error:

E0313 15:01:01.573928 1 retrier.cc:37] Loading servable: {name: test version: 44} failed: Invalid argument: unable to load model 'test', output 'output1' dims [-1] don't match configuration dims [-1]
I0313 15:01:01.573977 1 loader_harness.cc:154] Encountered an error for servable version {name: test version: 44}: Invalid argument: unable to load model 'test', output 'output1' dims [-1] don't match configuration dims [-1]

OK, it really looks like TRTIS is having a very hard time with output layers with variable dimensions [-1].

I rearranged my config.pbtxt file several times, and as soon as the server encounters an output layer with variable dims, I get the same type of message:

I0315 18:45:45.386469 1 service.cc:161] XLA service 0x7f10f4361350 executing computations on platform Host. Devices:
I0315 18:45:45.386486 1 service.cc:168] StreamExecutor device (0): ,
I0315 18:45:45.437891 1 loader.cc:183] Restoring SavedModel bundle.
I0315 18:45:45.638201 1 loader.cc:133] Running initialization op on SavedModel bundle.
I0315 18:45:45.672171 1 loader.cc:298] SavedModel load for tags { serve }; Status: success. Took 1070302 microseconds.
E0315 18:45:45.689760 1 retrier.cc:37] Loading servable: {name: page_segmentation version: 44} failed: Invalid argument: unable to load model 'page_segmentation', output 'bboxes' dims [-1,4] don't match configuration dims [-1,4]
I0315 18:45:45.689807 1 loader_harness.cc:154] Encountered an error for servable version {name: page_segmentation version: 44}: Invalid argument: unable to load model 'page_segmentation', output 'bboxes' dims [-1,4] don't match configuration dims [-1,4]
E0315 18:45:45.689821 1 aspired_versions_manager.cc:358] Servable {name: page_segmentation version: 44} cannot be loaded: Invalid argument: unable to load model 'page_segmentation', output 'bboxes' dims [-1,4] don't match configuration dims [-1,4]

I0315 18:41:00.394930 1 loader.cc:183] Restoring SavedModel bundle.
I0315 18:41:00.566804 1 loader.cc:133] Running initialization op on SavedModel bundle.
I0315 18:41:00.601682 1 loader.cc:298] SavedModel load for tags { serve }; Status: success. Took 276111 microseconds.
E0315 18:41:00.619399 1 retrier.cc:37] Loading servable: {name: page_segmentation version: 44} failed: Invalid argument: unable to load model 'page_segmentation', output 'acp' dims [-1,7] don't match configuration dims [-1,7]
I0315 18:41:00.619448 1 loader_harness.cc:154] Encountered an error for servable version {name: page_segmentation version: 44}: Invalid argument: unable to load model 'page_segmentation', output 'acp' dims [-1,7] don't match configuration dims [-1,7]
E0315 18:41:00.619463 1 aspired_versions_manager.cc:358] Servable {name: page_segmentation version: 44} cannot be loaded: Invalid argument: unable to load model 'page_segmentation', output 'acp' dims [-1,7] don't match configuration dims [-1,7]

I am running the nvcr.io/nvidia/tensorrtserver:19.02-py3 docker image, which should include support for [-1] output dimensions according to this:

https://github.com/NVIDIA/tensorrt-inference-server/issues/8

Hello,

Per engineering: notice that in the configuration output1 has no dimensions. That means its shape is completely determined by the batch size. In TensorFlow terminology the shape for this tensor would be [ ? ], where the ? represents the batch size.

As of 19.03, TRTIS does not support this type of tensor (that is, tensors that have no shape other than the batch dimension). We are fixing this issue for a future TRTIS version.

When I rearranged the config.pbtxt to put another output first, TRTIS also complained:

output [
  {
    name: "bboxes"
    data_type: TYPE_FP32
    dims: [ -1, 4 ]
  },

I0315 19:06:40.585618 1 service.cc:161] XLA service 0x7fd4d3d88750 executing computations on platform CUDA. Devices:
I0315 19:06:40.585638 1 service.cc:168] StreamExecutor device (0): Tesla V100-SXM2-16GB, Compute Capability 7.0
I0315 19:06:40.605912 1 cpu_utils.cc:94] CPU Frequency: 2300040000 Hz
I0315 19:06:40.606534 1 service.cc:161] XLA service 0x7fd4d3de6dd0 executing computations on platform Host. Devices:
I0315 19:06:40.606557 1 service.cc:168] StreamExecutor device (0): ,
I0315 19:06:40.658493 1 loader.cc:183] Restoring SavedModel bundle.
I0315 19:06:40.857047 1 loader.cc:133] Running initialization op on SavedModel bundle.
I0315 19:06:40.891437 1 loader.cc:298] SavedModel load for tags { serve }; Status: success. Took 1072032 microseconds.
E0315 19:06:40.908884 1 retrier.cc:37] Loading servable: {name: page_segmentation version: 44} failed: Invalid argument: unable to load model 'page_segmentation', output 'bboxes' dims [-1,4] don't match configuration dims [-1,4]
I0315 19:06:40.908929 1 loader_harness.cc:154] Encountered an error for servable version {name: page_segmentation version: 44}: Invalid argument: unable to load model 'page_segmentation', output 'bboxes' dims [-1,4] don't match configuration dims [-1,4]
E0315 19:06:40.908939 1 aspired_versions_manager.cc:358] Servable {name: page_segmentation version: 44} cannot be loaded: Invalid argument: unable to load model 'page_segmentation', output 'bboxes' dims [-1,4] don't match configuration dims [-1,4]
^CI0315 19:06:59.537223 1 main.cc:50] Interrupt signal (2) received.

If TRTIS can’t support something like this, it seems it may not be ready for object detection networks?

(I will stop by the booth at noon today to talk to the experts at GTC.)

Sorry to necro, but I’m encountering a similar issue on TRTIS 19.03 (build 5810010).

I’m not sure if I’ve done something incorrectly wrt the variable dim syntax in the config (but I don’t think so?). Either way, the console error logging doesn’t elucidate much for me.

I0802 13:44:45.879639 1 loader.cc:183] Restoring SavedModel bundle.
I0802 13:44:46.245207 1 loader.cc:133] Running initialization op on SavedModel bundle.
I0802 13:44:46.288620 1 loader.cc:298] SavedModel load for tags { serve }; Status: success. Took 455975 microseconds.
E0802 13:44:46.306244 1 retrier.cc:37] Loading servable: {name: stuff_detection version: 1} failed: Invalid argument: unable to load model 'stuff_detection', output 'detection_boxes' dims [-1,-1,-1,5] don't match configuration dims [-1,-1,-1,5]
I0802 13:44:46.306273 1 loader_harness.cc:154] Encountered an error for servable version {name: stuff_detection version: 1}: Invalid argument: unable to load model 'stuff_detection', output 'detection_boxes' dims [-1,-1,-1,5] don't match configuration dims [-1,-1,-1,5]
E0802 13:44:46.306279 1 aspired_versions_manager.cc:358] Servable {name: stuff_detection version: 1} cannot be loaded: Invalid argument: unable to load model 'stuff_detection', output 'detection_boxes' dims [-1,-1,-1,5] don't match configuration dims [-1,-1,-1,5]

Any guidance as to mistakes made, incompatibilities, or potential workarounds would be hugely appreciated.

Edit: Is this related to the comment here https://github.com/NVIDIA/tensorrt-inference-server/blob/master/src/clients/python/grpc_image_client.py#L93?

Edit 2: I believe I resolved the issue; if so, it was user error. I didn't realize we aren't supposed to specify the output batch dimension in the config file. Removing the batch dims on output nodes seems to have worked.
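That resolution is consistent with the logs above. A toy sketch of the shape comparison as I understand it (the function and its matching rule are my own reconstruction for illustration, not TRTIS source code): when max_batch_size > 0, the dims in config.pbtxt are expected to exclude the batch dimension, while the model's tensor shape includes it, so repeating the batch -1 in the config adds one dimension too many.

```python
def dims_compatible(model_dims, config_dims, max_batch_size):
    """Sketch: does a config dims list match a model's tensor shape?

    With max_batch_size > 0 the model's first dimension is the
    (variable) batch dimension, which the config must NOT list.
    """
    model_non_batch = model_dims[1:] if max_batch_size > 0 else model_dims
    if len(model_non_batch) != len(config_dims):
        return False
    # -1 on either side means "any size", so it matches anything.
    return all(m == c or m == -1 or c == -1
               for m, c in zip(model_non_batch, config_dims))

# The failing case from the logs: the model output is [-1, 4]
# (batch, 4), and the config also listed [-1, 4].
print(dims_compatible([-1, 4], [-1, 4], max_batch_size=1))  # prints False

# Dropping the batch dim from the config makes them compatible.
print(dims_compatible([-1, 4], [4], max_batch_size=1))      # prints True
```

This would also explain why the error message looks self-contradictory: both sides print as [-1,4] even though one of those -1s is the batch dimension and the other is a genuine variable axis.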