TRTIS 19.02 Issue with variable output layers

I have a tensorflow saved model that I am attempting to host in the TensorRT inference server.

Several of the inputs and outputs are variable size. Since it is a TensorFlow saved model, I attempted to use the --strict-model-config=false flag to automatically generate the config. It generated the following, replacing the variable dimensions with -1 (note that I have changed the input/output names to obfuscate the purpose of the model):

[ec2-user@SC-DEV-UE1-PERMANENT_PDFStructureWorker_i-0e176a5891893917d ~]$ curl localhost:8000/api/status
id: "inference:0"
version: "0.11.0"
uptime_ns: 154084243188
model_status {
  key: "test"
  value {
    config {
      name: "test"
      platform: "tensorflow_savedmodel"
      version_policy {
        latest {
          num_versions: 1
        }
      }
      max_batch_size: 1
      input {
        name: "input1"
        data_type: TYPE_FP32
        dims: -1
      }
      input {
        name: "input2"
        data_type: TYPE_FP32
        dims: -1
      }
      input {
        name: "input3"
        data_type: TYPE_FP32
        dims: -1
      }
      input {
        name: "input4"
        data_type: TYPE_FP32
        dims: -1
      }
      input {
        name: "input5"
        data_type: TYPE_FP32
        dims: -1
      }
      input {
        name: "input6"
        data_type: TYPE_FP32
        dims: -1
      }
      output {
        name: "output1"
        data_type: TYPE_INT64
      }
      output {
        name: "output2"
        data_type: TYPE_FP32
        dims: 7
      }
      output {
        name: "output3"
        data_type: TYPE_INT32
      }
      output {
        name: "output4"
        data_type: TYPE_INT32
      }
      output {
        name: "output5"
        data_type: TYPE_FP32
      }
      output {
        name: "output6"
        data_type: TYPE_INT32
      }
      output {
        name: "output7"
        data_type: TYPE_FP32
      }
      output {
        name: "output8"
        data_type: TYPE_INT32
      }
      output {
        name: "output9"
        data_type: TYPE_FP32
        dims: 4
      }
      instance_group {
        name: "test"
        count: 1
        gpus: 0
        kind: KIND_GPU
      }
      default_model_filename: "test.savedmodel"
    }
    version_status {
      key: 44
      value {
        ready_state: MODEL_UNAVAILABLE
      }
    }
  }
}

As you can see from the model status above, it could not determine the dimensions for all of the input and output layers.

So I created a config.pbtxt with the dimensions of the input and output layers carefully specified.
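For illustration, a hand-written config.pbtxt along these lines might look as follows (the tensor names are the obfuscated ones from the generated config above; the concrete dims values are placeholders, not my real model's shapes):

```
name: "test"
platform: "tensorflow_savedmodel"
max_batch_size: 1
input [
  {
    name: "input1"
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
output [
  {
    name: "output2"
    data_type: TYPE_FP32
    dims: [ 7 ]
  },
  {
    name: "output9"
    data_type: TYPE_FP32
    dims: [ -1, 4 ]
  }
]
```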

I tried this with the TensorRT Inference Server and I am getting this error:

I0313 15:01:01.556557 1 loader.cc:298] SavedModel load for tags { serve }; Status: success. Took 273490 microseconds.
E0313 15:01:01.573928 1 retrier.cc:37] Loading servable: {name: test version: 44} failed: Invalid argument: unable to load model 'test', output 'output1' dims [-1] don't match configuration dims [-1]
I0313 15:01:01.573977 1 loader_harness.cc:154] Encountered an error for servable version {name: test version: 44}: Invalid argument: unable to load model 'test', output 'output1' dims [-1] don't match configuration dims [-1]

If you look at the error, for "output1" the config.pbtxt says the dimensions are [-1], and the model reports dims of [-1], yet supposedly they don't match. Huh? What is going on here?

We are triaging and will keep you updated.

Thanks. Just in case, I am coming to GTC next week and will bring my model with me to the ask-the-experts sessions (TensorRT and TensorRT Inference Server).

After reading the docs, I answered my first question: dims: [-1] is indeed the correct way to represent variable-dimension inputs and outputs:

https://docs.nvidia.com/deeplearning/sdk/tensorrt-inference-server-guide/docs/model_configuration.html

"For models that support input and output tensors with variable-size dimensions, those dimensions can be listed as -1 in the input and output configuration. For example, if a model requires a 2-dimensional input tensor where the first dimension must be size 4 but the second dimension can be any size, the model configuration for that input would include dims: [ 4, -1 ]. The inference server would then accept inference requests where that input tensor’s second dimension was any value >= 1. "
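As a concrete config.pbtxt fragment, the example from the docs would read like this (the tensor name here is made up for illustration):

```
input [
  {
    name: "input_matrix"
    data_type: TYPE_FP32
    dims: [ 4, -1 ]
  }
]
```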

I still need help with this TRTIS error:

E0313 15:01:01.573928 1 retrier.cc:37] Loading servable: {name: test version: 44} failed: Invalid argument: unable to load model 'test', output 'output1' dims [-1] don't match configuration dims [-1]
I0313 15:01:01.573977 1 loader_harness.cc:154] Encountered an error for servable version {name: test version: 44}: Invalid argument: unable to load model 'test', output 'output1' dims [-1] don't match configuration dims [-1]

OK, it really looks like TRTIS is having a very hard time with output layers with variable dimensions [-1].

I rearranged my config.pbtxt file several times, and as soon as the server encounters an output layer with variable dims, I get the same type of message:

I0315 18:45:45.386469 1 service.cc:161] XLA service 0x7f10f4361350 executing computations on platform Host. Devices:
I0315 18:45:45.386486 1 service.cc:168] StreamExecutor device (0): ,
I0315 18:45:45.437891 1 loader.cc:183] Restoring SavedModel bundle.
I0315 18:45:45.638201 1 loader.cc:133] Running initialization op on SavedModel bundle.
I0315 18:45:45.672171 1 loader.cc:298] SavedModel load for tags { serve }; Status: success. Took 1070302 microseconds.
E0315 18:45:45.689760 1 retrier.cc:37] Loading servable: {name: page_segmentation version: 44} failed: Invalid argument: unable to load model 'page_segmentation', output 'bboxes' dims [-1,4] don't match configuration dims [-1,4]
I0315 18:45:45.689807 1 loader_harness.cc:154] Encountered an error for servable version {name: page_segmentation version: 44}: Invalid argument: unable to load model 'page_segmentation', output 'bboxes' dims [-1,4] don't match configuration dims [-1,4]
E0315 18:45:45.689821 1 aspired_versions_manager.cc:358] Servable {name: page_segmentation version: 44} cannot be loaded: Invalid argument: unable to load model 'page_segmentation', output 'bboxes' dims [-1,4] don't match configuration dims [-1,4]

I0315 18:41:00.394930 1 loader.cc:183] Restoring SavedModel bundle.
I0315 18:41:00.566804 1 loader.cc:133] Running initialization op on SavedModel bundle.
I0315 18:41:00.601682 1 loader.cc:298] SavedModel load for tags { serve }; Status: success. Took 276111 microseconds.
E0315 18:41:00.619399 1 retrier.cc:37] Loading servable: {name: page_segmentation version: 44} failed: Invalid argument: unable to load model 'page_segmentation', output 'acp' dims [-1,7] don't match configuration dims [-1,7]
I0315 18:41:00.619448 1 loader_harness.cc:154] Encountered an error for servable version {name: page_segmentation version: 44}: Invalid argument: unable to load model 'page_segmentation', output 'acp' dims [-1,7] don't match configuration dims [-1,7]
E0315 18:41:00.619463 1 aspired_versions_manager.cc:358] Servable {name: page_segmentation version: 44} cannot be loaded: Invalid argument: unable to load model 'page_segmentation', output 'acp' dims [-1,7] don't match configuration dims [-1,7]

I am running the nvcr.io/nvidia/tensorrtserver:19.02-py3 docker image, which should include support for [-1] output dimensions according to this:

https://github.com/NVIDIA/tensorrt-inference-server/issues/8

Hello,

Per engineering: notice that in the configuration output1 has no dimensions. That means its shape is completely determined by the batch size. In TensorFlow terminology the shape for this tensor would be [ ? ], where the ? represents the batch size.

As of 19.03, TRTIS does not support this type of tensor (that is, tensors that have no shape other than the batch dimension). We are fixing this issue for a future TRTIS version.

When I rearranged the config.pbtxt to put another output first, TRTIS also complained:

output [
  {
    name: "bboxes"
    data_type: TYPE_FP32
    dims: [ -1, 4 ]
  },

I0315 19:06:40.585618 1 service.cc:161] XLA service 0x7fd4d3d88750 executing computations on platform CUDA. Devices:
I0315 19:06:40.585638 1 service.cc:168] StreamExecutor device (0): Tesla V100-SXM2-16GB, Compute Capability 7.0
I0315 19:06:40.605912 1 cpu_utils.cc:94] CPU Frequency: 2300040000 Hz
I0315 19:06:40.606534 1 service.cc:161] XLA service 0x7fd4d3de6dd0 executing computations on platform Host. Devices:
I0315 19:06:40.606557 1 service.cc:168] StreamExecutor device (0): ,
I0315 19:06:40.658493 1 loader.cc:183] Restoring SavedModel bundle.
I0315 19:06:40.857047 1 loader.cc:133] Running initialization op on SavedModel bundle.
I0315 19:06:40.891437 1 loader.cc:298] SavedModel load for tags { serve }; Status: success. Took 1072032 microseconds.
E0315 19:06:40.908884 1 retrier.cc:37] Loading servable: {name: page_segmentation version: 44} failed: Invalid argument: unable to load model 'page_segmentation', output 'bboxes' dims [-1,4] don't match configuration dims [-1,4]
I0315 19:06:40.908929 1 loader_harness.cc:154] Encountered an error for servable version {name: page_segmentation version: 44}: Invalid argument: unable to load model 'page_segmentation', output 'bboxes' dims [-1,4] don't match configuration dims [-1,4]
E0315 19:06:40.908939 1 aspired_versions_manager.cc:358] Servable {name: page_segmentation version: 44} cannot be loaded: Invalid argument: unable to load model 'page_segmentation', output 'bboxes' dims [-1,4] don't match configuration dims [-1,4]
^CI0315 19:06:59.537223 1 main.cc:50] Interrupt signal (2) received.

If TRTIS can’t support something like this, it seems it may not be ready for object detection networks?

(I will stop by the booth at noon today to talk to the experts at GTC.)

Sorry to necro, but I’m encountering a similar issue on TRTIS 19.03 (build 5810010).

I’m not sure if I’ve done something incorrectly wrt the variable dim syntax in the config (but I don’t think so?). Either way, the console error logging doesn’t elucidate much for me.

I0802 13:44:45.879639 1 loader.cc:183] Restoring SavedModel bundle.
I0802 13:44:46.245207 1 loader.cc:133] Running initialization op on SavedModel bundle.
I0802 13:44:46.288620 1 loader.cc:298] SavedModel load for tags { serve }; Status: success. Took 455975 microseconds.
E0802 13:44:46.306244 1 retrier.cc:37] Loading servable: {name: stuff_detection version: 1} failed: Invalid argument: unable to load model 'stuff_detection', output 'detection_boxes' dims [-1,-1,-1,5] don't match configuration dims [-1,-1,-1,5]
I0802 13:44:46.306273 1 loader_harness.cc:154] Encountered an error for servable version {name: stuff_detection version: 1}: Invalid argument: unable to load model 'stuff_detection', output 'detection_boxes' dims [-1,-1,-1,5] don't match configuration dims [-1,-1,-1,5]
E0802 13:44:46.306279 1 aspired_versions_manager.cc:358] Servable {name: stuff_detection version: 1} cannot be loaded: Invalid argument: unable to load model 'stuff_detection', output 'detection_boxes' dims [-1,-1,-1,5] don't match configuration dims [-1,-1,-1,5]

Any guidance as to mistakes made, incompatibilities, or potential workarounds would be hugely appreciated.

Edit: Is this related to the comment here https://github.com/NVIDIA/tensorrt-inference-server/blob/master/src/clients/python/grpc_image_client.py#L93?

Edit 2: I believe I resolved the issue; if so, it was user error. I didn't realize we aren't supposed to specify the output batch dimension in the config file. Removing the batch dims on output nodes seems to have worked.
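That resolution is consistent with the logs above. A toy sketch of the shape comparison as I understand it (the function and its matching rule are my own reconstruction for illustration, not TRTIS source code): when max_batch_size > 0, the dims in config.pbtxt are expected to exclude the batch dimension, while the model's tensor shape includes it, so repeating the batch -1 in the config adds one dimension too many.

```python
def dims_compatible(model_dims, config_dims, max_batch_size):
    """Sketch: does a config dims list match a model's tensor shape?

    With max_batch_size > 0 the model's first dimension is the
    (variable) batch dimension, which the config must NOT list.
    """
    model_non_batch = model_dims[1:] if max_batch_size > 0 else model_dims
    if len(model_non_batch) != len(config_dims):
        return False
    # -1 on either side means "any size", so it matches anything.
    return all(m == c or m == -1 or c == -1
               for m, c in zip(model_non_batch, config_dims))

# The failing case from the logs: the model output is [-1, 4]
# (batch, 4), and the config also listed [-1, 4].
print(dims_compatible([-1, 4], [-1, 4], max_batch_size=1))  # prints False

# Dropping the batch dim from the config makes them compatible.
print(dims_compatible([-1, 4], [4], max_batch_size=1))      # prints True
```

This would also explain why the error message looks self-contradictory: both sides print as [-1,4] even though one of those -1s is the batch dimension and the other is a genuine variable axis.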