TensorRT fails for custom ssd_inception model

varun365 · December 3, 2018, 6:54am

Linux version : Ubuntu 16.04 LTS
GPU type : GeForce GTX 1080
nvidia driver version : 410.72
CUDA version : 9.0
CUDNN version : 7.0.5
Python version [if using python] : 3.5.2
Tensorflow version : tensorflow-gpu 1.9
TensorRT version : 5.0.2.6

Actual problem:

I tried the example script under the samples/python/uff_ssd folder. The script downloads the SSD_inception model, creates a UFF parser, builds an engine, and performs inference on an image.

Now, instead of downloading a pre-trained model, I trained my own object_detection model using SSD_inception as the architecture, but I am getting the following errors:
[TensorRT] ERROR: Parameter check failed at: ../builder/Layers.h::setAxis::315, condition: axis>=0
[TensorRT] ERROR: Concatenate/concat: all concat input tensors must have the same dimensions except on the concatenation axis
[TensorRT] ERROR: UFFParser: Parser error: BoxPredictor_0/ClassPredictor/BiasAdd: The input to the Scale Layer is required to have a minimum of 3 dimensions.

I am using the same ssd_inception architecture but still get these errors. Could anyone help me with this issue?

NVES_K · January 18, 2019, 10:43pm

If you’re able to send the model I can take a look at it and see if we can work it out.

varun365 · January 21, 2019, 10:38am

Hi KevinSchlichter

Thanks for looking into the issue. Here is the link to the .pb file:

https://drive.google.com/open?id=1TwgzXmv9OZ_yKG98BTT-cXNo8L0B2Ymt

NVES_K · January 25, 2019, 12:51am

Can you give me a step-by-step of how you’re using the model? I’m getting different errors, so I’m doing something differently.

varun365 · January 25, 2019, 6:00am

Initially I ran uff_ssd/detect_objects.py. This:
1) Creates a workspace folder, then downloads and extracts the pre-trained SSD_INCEPTION model (ssd_inception_v2_coco_2017_11_17).
2) Converts frozen_inference_graph.pb to frozen_inference_graph.uff.
3) Builds a TensorRT engine from the UFF file.
4) Performs inference using the TensorRT engine; I got image_inferred.jpg as output.

My modifications:

I removed the TensorRT engine file that was created in the workspace folder.
I removed frozen_inference_graph.pb and frozen_inference_graph.uff from the models/ssd_inception_v2_coco_2017_11_17 folder inside the workspace.
I added my custom frozen_inference_graph.pb to workspace/models/ssd_inception_v2_coco_2017_11_17.
I modified coco.py inside uff_ssd/utils to match my number of classes.
I modified model.py inside uff_ssd/utils as follows:
--> line 91: changed numClasses=91 to numClasses=6
--> line 238: commented out the download call, #download_model(model_name, silent)
--> line 239: changed ssd_pb_path = PATHS.get_model_pb_path(model_name) to ssd_pb_path = <'path to the custom .pb file in uff_ssd/workspace/models/ssd_inception_v2_coco_2017_11_17'>
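
The manual cleanup described above can be scripted so that no generated file from the pre-trained model survives a model swap. This is a minimal sketch, assuming the workspace layout seen in this thread (workspace/models/<model_name>/ and workspace/engines/). The function name `clear_cached_artifacts` is hypothetical and is not part of the sample.

```python
import shutil
from pathlib import Path


def clear_cached_artifacts(workspace, model_name):
    """Delete generated files so the next run rebuilds from the .pb file.

    Hypothetical helper: removes the converted .uff/.pbtxt files and the
    cached engine directory, and leaves frozen_inference_graph.pb alone.
    """
    workspace = Path(workspace)
    model_dir = workspace / "models" / model_name
    removed = []
    # Converted graph files are regenerated from the .pb on the next run.
    for suffix in (".uff", ".pbtxt"):
        candidate = model_dir / ("frozen_inference_graph" + suffix)
        if candidate.exists():
            candidate.unlink()
            removed.append(candidate)
    # The cache path seen in this thread (engines/FLOAT/engine_bs_1.buf)
    # names only precision and batch size, so remove the whole directory.
    engines_dir = workspace / "engines"
    if engines_dir.exists():
        shutil.rmtree(engines_dir)
        removed.append(engines_dir)
    return removed
```

Calling `clear_cached_artifacts("workspace", "ssd_inception_v2_coco_2017_11_17")` before re-running detect_objects.py would force both the UFF conversion and the engine build to run again for the new .pb file.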

Also, could you tell me what errors you are getting? Were you able to create the TensorRT engine and run inference? Could you share the procedure you followed?

varun365 · January 29, 2019, 8:41am

Hi KevinSchlichter

If possible, could you please share the errors you’re getting?

NVES_K · January 29, 2019, 9:17pm

I’m not sure what I was doing last week, but it’s working for me now.

#Started a container
nvidia-docker run -v /home/nvesk/:/workspace/nvesk -ti --rm nvcr.io/nvidia/tensorrt:18.12-py3

history
1 /opt/tensorrt/python/python_setup.sh
2 cd tensorrt/samples/python/uff_ssd/
3 cat README.md
4 mkdir build
5 cd build/
6 cmake ..
7 make
8 cd ..
#I already downloaded this from last week, so I’m skipping the wget
9 cp /workspace/nvesk/VOCtest_06-Nov-2007.tar .
10 tar xf VOCtest_06-Nov-2007.tar
11 python detect_objects.py images/image2.jpg
12 rm workspace/models/ssd_inception_v2_coco_2017_11_17/frozen_inference_graph.pb workspace/models/ssd_inception_v2_coco_2017_11_17/frozen_inference_graph.pbtxt workspace/models/ssd_inception_v2_coco_2017_11_17/frozen_inference_graph.uff
13 cp /workspace/nvesk/frozen_inference_graph_custom.pb workspace/models/ssd_inception_v2_coco_2017_11_17/frozen_inference_graph.pb
14 python detect_objects.py images/image2.jpg
15 vi utils/model.py

I only commented out line 238: #download_model(model_name, silent)

16 python detect_objects.py images/image2.jpg
output:

WARNING:tensorflow:From /usr/lib/python3.5/dist-packages/graphsurgeon/StaticGraph.py:123: FastGFile.__init__ (from tensorflow.python.platform.gfile) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.gfile.GFile.
WARNING: To create TensorRT plugin nodes, please use the `create_plugin_node` function instead.
UFF Version 0.5.5
=== Automatically deduced input nodes ===
[name: "Input"
op: "Placeholder"
attr {
  key: "dtype"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "shape"
  value {
    shape {
      dim {
        size: 1
      }
      dim {
        size: 3
      }
      dim {
        size: 300
      }
      dim {
        size: 300
      }
    }
  }
}
]
=========================================

Using output node NMS
Converting to UFF graph
Warning: No conversion function registered for layer: NMS_TRT yet.
Converting NMS as custom op: NMS_TRT
Warning: No conversion function registered for layer: FlattenConcat_TRT yet.
Converting concat_box_conf as custom op: FlattenConcat_TRT
Warning: No conversion function registered for layer: GridAnchor_TRT yet.
Converting GridAnchor as custom op: GridAnchor_TRT
Warning: No conversion function registered for layer: FlattenConcat_TRT yet.
Converting concat_box_loc as custom op: FlattenConcat_TRT
No. nodes: 781
UFF Output written to /workspace/tensorrt/samples/python/uff_ssd/utils/../workspace/models/ssd_inception_v2_coco_2017_11_17/frozen_inference_graph.uff
UFF Text Output written to /workspace/tensorrt/samples/python/uff_ssd/utils/../workspace/models/ssd_inception_v2_coco_2017_11_17/frozen_inference_graph.pbtxt
TensorRT inference engine settings:
  * Inference precision - DataType.FLOAT
  * Max batch size - 1

Loading cached TensorRT engine from /workspace/tensorrt/samples/python/uff_ssd/utils/../workspace/engines/FLOAT/engine_bs_1.buf
TensorRT inference time: 6 ms
Detected kite with confidence 97%
Detected person with confidence 91%
Detected kite with confidence 89%
Detected person with confidence 89%
Detected kite with confidence 83%
Detected kite with confidence 82%
Detected person with confidence 76%
Detected kite with confidence 74%
Detected person with confidence 70%
Detected person with confidence 62%
Detected person with confidence 59%
Total time taken for one image: 188 ms

Saved output image to: /workspace/tensorrt/samples/python/uff_ssd/utils/../image_inferred.jpg

17 ls workspace/models/ssd_inception_v2_coco_2017_11_17/
18 mv image_inferred.jpg /workspace/nvesk/image2.jpg
#This time the inference is much faster, since it isn’t converting the uff
19 python detect_objects.py images/image1.jpg
output:

TensorRT inference engine settings:
  * Inference precision - DataType.FLOAT
  * Max batch size - 1

Loading cached TensorRT engine from /workspace/tensorrt/samples/python/uff_ssd/utils/../workspace/engines/FLOAT/engine_bs_1.buf
TensorRT inference time: 6 ms
Detected dog with confidence 98%
Detected dog with confidence 93%
Detected person with confidence 75%
Total time taken for one image: 69 ms

Saved output image to: /workspace/tensorrt/samples/python/uff_ssd/utils/../image_inferred.jpg

20 mv image_inferred.jpg /workspace/nvesk/image1.jpg

varun365 · January 31, 2019, 10:07am

Hi

Thank you very much for the help, we appreciate it a lot.

However, if you look at line 54 of the output in your code segment:

54. Loading cached TensorRT engine from /workspace/tensorrt/samples/python/uff_ssd/utils/../workspace/engines/FLOAT/engine_bs_1.buf

the script is still loading the engine created for the pre-trained model and using that cached engine to run inference.

We ran the same code with the custom-trained .pb file after deleting the engine (the /workspace/engines folder that was created for the pre-trained model).
The code should take custom.pb as input, convert it to UFF, build a new engine (rather than using the cached one), and then perform inference.
However, it is not able to create the TensorRT engine for the custom.pb file.
We are stuck at that point and get the same errors, quoted below:

[TensorRT] ERROR: Parameter check failed at: ../builder/Layers.h::setAxis::315, condition: axis>=0
[TensorRT] ERROR: Concatenate/concat: all concat input tensors must have the same dimensions except on the concatenation axis
[TensorRT] ERROR: UFFParser: Parser error: BoxPredictor_0/ClassPredictor/BiasAdd: The input to the Scale Layer is required to have a minimum of 3 dimensions.
Building TensorRT engine. This may take few minutes.
[TensorRT] ERROR: Network must have at least one output
Traceback (most recent call last):
  File "detect_objects.py", line 193, in <module>
    main()
  File "detect_objects.py", line 166, in main
    batch_size=parsed['max_batch_size'])
  File "/workspace/teai/TensorRT/TensorRT-5.0.2.6/targets/x86_64-linux-gnu/samples/python/uff_ssd/utils/inference.py", line 69, in __init__
    engine_utils.save_engine(self.trt_engine, trt_engine_path)
  File "/workspace/teai/TensorRT/TensorRT-5.0.2.6/targets/x86_64-linux-gnu/samples/python/uff_ssd/utils/engine.py", line 83, in save_engine
    buf = engine.serialize()
AttributeError: 'NoneType' object has no attribute 'serialize'

Could you try the inference on custom.pb after deleting the ‘workspace/engines’ folder?
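
The traceback above ends in an AttributeError because the engine passed to `save_engine` was `None`: the build did not produce an engine after the parser errors, and the sample then tried to serialize it. Below is a minimal sketch of a guard that stops with a clearer message. `save_engine_checked` is a hypothetical wrapper, and the only TensorRT call it assumes is `engine.serialize()`, which is the call shown in the traceback.

```python
def save_engine_checked(engine, engine_path):
    """Serialize a built engine to disk, refusing to continue on a failed build."""
    if engine is None:
        # A None engine means the build failed; the real cause is in the
        # [TensorRT] ERROR lines printed before this point.
        raise RuntimeError(
            "TensorRT engine build failed; check the parser errors above "
            "before attempting to serialize."
        )
    buf = engine.serialize()
    with open(engine_path, "wb") as f:
        f.write(buf)
```

This does not fix the parser errors. It only makes the failure point at the build step instead of at the later `serialize()` call.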

sandeep1995s · February 5, 2019, 12:02pm

Hi team,

We are trying to optimize the custom graph (ssd_inception) using TensorRT 5.0. Did you get a chance to look into the error mentioned above?
We are stuck at this point.

NVES_K · February 5, 2019, 5:31pm

I’m seeing the same errors now. The converter is trying to handle layers it doesn’t really know what to do with. That’s the series of warnings beginning with “WARNING: To create TensorRT plugin nodes, please use the create_plugin_node function instead.” The parameter check errors are probably a result of that. Try converting each of those layers to a custom layer as a first step. That should clear those warnings.
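
The suggestion above refers to graphsurgeon's `create_plugin_node`, the function named in the converter warning. Below is a minimal sketch of how an NMS plugin node might be declared with the class count as a parameter. The attribute names are those of the NMS_TRT plugin, but every value is an illustrative placeholder and should be checked against utils/model.py in the sample. `nms_plugin_attributes` and `build_nms_node` are hypothetical helpers.

```python
def nms_plugin_attributes(num_classes):
    """Return keyword arguments for an NMS_TRT plugin node.

    Values are placeholders for illustration; compare them with the
    sample's utils/model.py before relying on them.
    """
    return {
        "shareLocation": 1,
        "varianceEncodedInTarget": 0,
        "backgroundLabelId": 0,
        "confidenceThreshold": 1e-8,
        "nmsThreshold": 0.6,
        "topK": 100,
        "keepTopK": 100,
        "numClasses": num_classes,
        "inputOrder": [0, 2, 1],
        "confSigmoid": 1,
        "isNormalized": 1,
    }


def build_nms_node(num_classes):
    """Create the plugin node; requires the graphsurgeon package from TensorRT."""
    import graphsurgeon as gs  # imported lazily so the helper above works without it

    return gs.create_plugin_node(
        name="NMS", op="NMS_TRT", **nms_plugin_attributes(num_classes)
    )
```

The thread does not establish whether the count passed as `numClasses` should include a background class, so that remains something to verify against the training configuration.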

sandeep1995s · February 6, 2019, 5:23am

Hi NVES_K,

Thanks for the valuable suggestion; we will work on it. I am just curious: TensorRT worked fine for the pre-trained SSD inception model, and we didn’t make any alterations to the layers. We only used a different dataset (number of output classes = 6).
Instead of initialising the weights randomly, we initialised them from the pre-trained checkpoint. As far as I remember, no other layer changes were made.
So if it works for the pre-trained model, it should work for the custom-trained model as well.
We will explore the custom layer implementation in TensorRT further. Meanwhile, if you find any solution, please let us know.
Thanks in advance.

zcy · May 7, 2019, 8:47am

Linux version : Ubuntu 16.04 LTS
GPU type : GeForce GTX 1080
nvidia driver version : 410.93
CUDA version : 10.0
CUDNN version : 7.4.1
Python version [Anaconda] : 3.6.8
Tensorflow version : tensorflow-gpu 1.13.1
TensorRT version : 5.1.2.2

[libprotobuf FATAL /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/externals/protobuf/x86_64/10.0/include/google/protobuf/repeated_field.h:1408] CHECK failed: (index) < (current_size_): 
Traceback (most recent call last):
  File "detect_objects.py", line 245, in <module>
    main()
  File "detect_objects.py", line 219, in main
    batch_size=args.max_batch_size)
  File "/home/zcy/1_data_sets/TensorRT-5.1.2.2/targets/x86_64-linux-gnu/samples/python/uff_ssd/utils/inference.py", line 115, in __init__
    batch_size=batch_size)
  File "/home/zcy/1_data_sets/TensorRT-5.1.2.2/targets/x86_64-linux-gnu/samples/python/uff_ssd/utils/engine.py", line 75, in build_engine
    parser.parse(uff_model_path, network)
RuntimeError: CHECK failed: (index) < (current_size_):

I am using the uff_ssd example in TensorRT 5.1.2.2.
I trained my model with ‘ssd_inception_v2_coco.config’ in the TensorFlow Object Detection API. When I ran the detect_objects.py script in uff_ssd, the model was converted from ‘.pb’ to ‘.uff’ and a ‘.pbtxt’ file was also generated.
But when building the engine, I always get the same error, shown above.
When I use the default model ‘ssd_inception_v2_coco_2017_11_17’ with the uff_ssd script ‘detect_objects.py’, everything works fine.
Any help will be appreciated!

zcy · June 3, 2019, 9:58am

If your model has a different input shape, you can change the input_shape in utils/model.py.
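
To find the value to put there, the training configuration can be read directly. This is a minimal sketch, assuming the model was trained with a TensorFlow Object Detection API pipeline config that uses a `fixed_shape_resizer` block. `input_shape_from_config` is a hypothetical helper, and the channels-first order follows the 3 x 300 x 300 input reported in the converter log earlier in this thread.

```python
import re


def input_shape_from_config(config_text, channels=3):
    """Return (channels, height, width) from a fixed_shape_resizer block."""
    block = re.search(r"fixed_shape_resizer\s*\{([^}]*)\}", config_text)
    if block is None:
        raise ValueError("no fixed_shape_resizer block found in config")
    height = re.search(r"height\s*:\s*(\d+)", block.group(1))
    width = re.search(r"width\s*:\s*(\d+)", block.group(1))
    if height is None or width is None:
        raise ValueError("fixed_shape_resizer is missing height or width")
    return (channels, int(height.group(1)), int(width.group(1)))


# Example with an inline config fragment instead of a file on disk.
example = """
image_resizer {
  fixed_shape_resizer {
    height: 300
    width: 300
  }
}
"""
print(input_shape_from_config(example))  # (3, 300, 300)
```

If the config uses a different resizer, the helper raises a ValueError rather than guessing a shape.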

fuatka · June 18, 2019, 1:22pm

Hi varun365,

I am in exactly the same situation.
Do you have a solution to this problem?

ikuyasam18 · December 17, 2019, 5:33am

Hi NVES_K,

Do you have a solution to this problem?

yorkleesiat · December 28, 2019, 8:13am

I have the same problem. Can anyone help?

czksnk · February 28, 2020, 2:02pm

me too

yahya_qlue · May 14, 2020, 10:11am

I have the same problem after retraining with 1 class
