A number of detectors, such as RT-DETRv2 or D-FINE, have an extra input layer in addition to the image input layer - typically the original image sizes.
If such layers are present in the engine, nvinfer needs a custom initialization function, as described here:
But the example given there, the objectDetector_FasterRCNN sample application, is not linked and doesn't seem to exist anymore?
Can you provide a simple example, i.e. how to provide a custom library implementation that initializes an additional image_size layer in the model?
You can refer to the header file: /opt/nvidia/deepstream/deepstream/sources/includes/nvdsinfer_custom_impl.h
Please implement the layer value setting with the NvDsInferInitializeInputLayers interface and export it from a dynamic library. The dynamic library (*.so) can be configured via “custom-lib-path” in the nvinfer configuration file.
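A minimal sketch of such a library (the layer name “orig_target_sizes” and its INT64 [batch, 2] width/height layout are assumptions based on typical RT-DETR exports - verify them against your engine):

```cpp
/* custom_init.cpp - sketch of an NvDsInferInitializeInputLayers
 * implementation. Assumption: the extra input "orig_target_sizes" expects
 * one INT64 [width, height] pair per batch element; if your export uses
 * INT32 or a different order, adjust accordingly. */
#include <cstdint>
#include <string>
#include <vector>

#include "nvdsinfer_custom_impl.h"

extern "C" bool NvDsInferInitializeInputLayers(
    std::vector<NvDsInferLayerInfo> const &inputLayersInfo,
    NvDsInferNetworkInfo const &networkInfo,
    unsigned int maxBatchSize)
{
    for (auto const &layer : inputLayersInfo) {
        if (std::string(layer.layerName) != "orig_target_sizes")
            continue;
        /* layer.buffer is a host buffer provided by nvinfer; the values
         * written here are copied to the device and reused for inference. */
        int64_t *sizes = static_cast<int64_t *>(layer.buffer);
        for (unsigned int b = 0; b < maxBatchSize; ++b) {
            sizes[2 * b + 0] = networkInfo.width;
            sizes[2 * b + 1] = networkInfo.height;
        }
        return true;
    }
    return false; /* expected extra input layer not found */
}
```

Build it as a shared object, e.g. g++ -shared -fPIC custom_init.cpp -o libcustom_init.so -I /opt/nvidia/deepstream/deepstream/sources/includes (plus the CUDA/TensorRT include paths if needed), and point custom-lib-path in the [property] group of the nvinfer config file at the resulting .so.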
You may also consider using nvinferserver for such a model; there is a sample of setting extra input layers in /opt/nvidia/deepstream/deepstream/sources/TritonOnnxYolo
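(For the nvinferserver path, extra inputs are typically filled by a custom processor implementing the IInferCustomProcessor interface from sources/includes/nvdsinferserver/infer_custom_process.h, loaded from the custom lib configured in the nvinferserver config file - the sample above shows the exact wiring.)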
That sample specifies the custom parse function - but I don't see any option to specify an init function in that custom lib?
I see the declaration of NvDsInferInitializeInputLayers in the header, and I defined this function in my custom library - but it doesn't get called.
Can you please outline how to set this up so that DeepStream uses it at the first inference?
I also found the example here:
and my own setup looks somewhat similar, except that I am running DeepStream 7.1.
I confirmed that NvDsInferInitializeInputLayers is indeed present in the custom library. I know the custom library gets used, as I produce some output from there at each inference.
But the initialization function never gets called.
The “NvDsInferInitializeInputLayers” interface does not need to be set explicitly. If there is an implementation inside the custom lib, gst-nvinfer will use it automatically. The gst-nvinfer plugin and its library are open source, so you can debug the logic by investigating the code in /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer/nvdsinfer_context_impl.cpp.
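One thing worth verifying is that the symbol is exported unmangled: gst-nvinfer resolves it from the custom lib by name at runtime (via dlsym), so a definition compiled without extern "C" linkage will never be found. A quick standalone check (a hypothetical helper, not part of DeepStream):

```cpp
/* check_symbol.cpp - checks that a custom library exports
 * NvDsInferInitializeInputLayers under the exact unmangled name.
 * Build: g++ check_symbol.cpp -o check_symbol -ldl */
#include <cstdio>
#include <dlfcn.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        std::fprintf(stderr, "usage: %s <custom-lib.so>\n", argv[0]);
        return 1;
    }
    void *handle = dlopen(argv[1], RTLD_LAZY);
    if (!handle) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }
    void *sym = dlsym(handle, "NvDsInferInitializeInputLayers");
    std::printf("NvDsInferInitializeInputLayers: %s\n",
                sym ? "found" : "NOT found");
    dlclose(handle);
    return sym ? 0 : 1;
}
```

Equivalently, nm -D libcustom_init.so | grep NvDsInferInitializeInputLayers should print the plain, unmangled name.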
As you can see, TensorRT correctly reports 2 input layers, “images” and “orig_target_sizes”, but the init function only reports “orig_target_sizes”.
Why is the information the init function gets inconsistent with the output from TRT?
The same issue occurs with the output layers - I have configured the model as a dynamic model, and it won't recognize the output layers even when the input layer initialization isn't used at all - i.e. when I provide the same model without the extra input layer.
Here we see the correct dimensions reported for the input layers, i.e.:
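For anyone trying to reproduce this, here is a minimal standalone sketch that dumps the engine's IO tensors directly via the TensorRT API (TRT 10 on DeepStream 7.1; file names and build flags are assumptions for illustration), so the engine's own view can be compared against what nvinfer reports:

```cpp
/* inspect_engine.cpp - prints every IO tensor of a serialized TensorRT
 * engine. Build (adjust paths): g++ inspect_engine.cpp -o inspect_engine -lnvinfer */
#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

#include <NvInfer.h>

/* Minimal logger required by the TensorRT runtime. */
class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char *msg) noexcept override
    {
        if (severity <= Severity::kWARNING)
            std::cerr << msg << "\n";
    }
};

int main(int argc, char **argv)
{
    if (argc < 2) {
        std::cerr << "usage: " << argv[0] << " <model.engine>\n";
        return 1;
    }
    std::ifstream f(argv[1], std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(f)),
                           std::istreambuf_iterator<char>());

    Logger logger;
    auto *runtime = nvinfer1::createInferRuntime(logger);
    auto *engine = runtime->deserializeCudaEngine(blob.data(), blob.size());
    if (!engine) {
        std::cerr << "failed to deserialize engine\n";
        return 1;
    }
    for (int i = 0; i < engine->getNbIOTensors(); ++i) {
        const char *name = engine->getIOTensorName(i);
        auto mode = engine->getTensorIOMode(name);
        auto dims = engine->getTensorShape(name);
        std::cout << (mode == nvinfer1::TensorIOMode::kINPUT ? "input:  " : "output: ")
                  << name << " dims=[";
        for (int d = 0; d < dims.nbDims; ++d)
            std::cout << dims.d[d] << (d + 1 < dims.nbDims ? "," : "");
        std::cout << "]\n";
    }
    delete engine;
    delete runtime;
    return 0;
}
```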
There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks