How to build TRT engine with `NvDsInferCudaEngineGetFromTltModel`

Please provide complete information as applicable to your setup.

• GPU
• DeepStream 6.1.1
• TensorRT 8.4.1
• NVIDIA GPU Driver Version 515.65.01
• Question

I have reimplemented the DeepStream Gst-nvinfer plugin and am trying to build an engine from an .etlt file. The engine builds and is saved to disk successfully, but I cannot deserialize it afterwards. However, I am able to deserialize engines built by Gst-nvinfer itself.

The parameters I set in NvDsInferContextInitParams are the following (a sketch of how they are filled in follows the list):

gpuID: 0
useDLA: 0
dlaCore: 0
tltEncodedModelFilePath: /var/lib/cradle/models/peoplenet/resnet34_peoplenet_pruned_int8.etlt
tltModelKey: tlt_encode
inferInputDims: 3 544 960
networkMode: 0
int8CalibrationFilePath:
numOutputLayers: 2
outputLayerNames: output_bbox/BiasAdd output_cov/Sigmoid 
numOutputIoFormats: 0
outputIOFormats: 
numLayerDevicePrecisions: 0
layerDevicePrecisions: 
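For context, this is roughly how those fields get filled in on my side before the build call. This is a minimal sketch: the helper name is made up, and every field not listed above stays at its zero-initialized default.

```cpp
#include <cstring>
#include "nvdsinfer_context.h"

// Hypothetical helper: populate only the fields listed above; all other
// members of NvDsInferContextInitParams stay at their zeroed defaults.
static void fillInitParams(NvDsInferContextInitParams &p)
{
    std::memset(&p, 0, sizeof(p));

    p.gpuID = 0;
    p.useDLA = 0;
    p.dlaCore = 0;

    std::strncpy(p.tltEncodedModelFilePath,
                 "/var/lib/cradle/models/peoplenet/resnet34_peoplenet_pruned_int8.etlt",
                 sizeof(p.tltEncodedModelFilePath) - 1);
    std::strncpy(p.tltModelKey, "tlt_encode", sizeof(p.tltModelKey) - 1);

    p.inferInputDims = NvDsInferDimsCHW{3, 544, 960};   // 3 x 544 x 960
    p.networkMode = NvDsInferNetworkMode_FP32;          // networkMode: 0

    // gst-nvinfer allocates these strings dynamically; a static array is
    // enough for this sketch.
    static const char *outputs[] = {"output_bbox/BiasAdd", "output_cov/Sigmoid"};
    p.numOutputLayers = 2;
    p.outputLayerNames = const_cast<char **>(outputs);
}
```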

When deserializing the engine from disk I get this:

1: [stdArchiveReader.cpp::StdArchiveReader::30] Error Code 1: Serialization (Serialization assertion magicTagRead == kMAGIC_TAG failed.Magic tag does not match)
4: [runtime.cpp::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)
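For completeness, the deserialization on my side is the standard TensorRT path, roughly like this (a sketch with a throwaway logger; paths and error handling are simplified):

```cpp
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>
#include "NvInfer.h"

// Throwaway logger for the sketch.
class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char *msg) noexcept override {
        if (severity <= Severity::kWARNING)
            std::fprintf(stderr, "%s\n", msg);
    }
};

nvinfer1::ICudaEngine *loadEngine(const std::string &path)
{
    // Read the whole serialized engine in binary mode. A file opened or
    // written in text mode, or read only partially, produces exactly the
    // "magic tag does not match" error shown above.
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file)
        return nullptr;
    std::vector<char> blob(static_cast<size_t>(file.tellg()));
    file.seekg(0);
    file.read(blob.data(), blob.size());

    static Logger logger;
    nvinfer1::IRuntime *runtime = nvinfer1::createInferRuntime(logger);
    return runtime ? runtime->deserializeCudaEngine(blob.data(), blob.size())
                   : nullptr;
}
```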

Do I have to set any other parameters in the config data structure?

Did you generate the engine and deserialize it with the same GPU and the same TensorRT version?

Yes, I have only one GPU and only one version of TensorRT installed.
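(For what it's worth, that is easy to confirm from code as well; a tiny check, assuming nothing beyond the public TensorRT headers:)

```cpp
#include <cstdio>
#include "NvInfer.h"   // pulls in NvInferVersion.h

int main()
{
    // Header version used at compile time vs. the libnvinfer loaded at
    // runtime; both should report 8.4.1 (getInferLibVersion() -> 8401).
    std::printf("headers: %d.%d.%d, runtime lib: %d\n",
                NV_TENSORRT_MAJOR, NV_TENSORRT_MINOR, NV_TENSORRT_PATCH,
                getInferLibVersion());
    return 0;
}
```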

Gst-nvinfer is fully open source and is a complete sample of engine building and deserialization; you can debug it and compare it with your code.

I am doing that, and the only point of failure I can find right now is the set of parameters passed in NvDsInferContextInitParams. However, since the config parsing and the NvDsInferContextInitParams build process are so convoluted, I am not sure whether I am missing something. The source code is almost unreadable.

Since the NvDsInferCudaEngineGetFromTltModel method is opaque, I need the list of parameters it actually requires. The NvDsInferContextInitParams struct has a plethora of fields that are irrelevant to building the model.
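For reference, here is the call shape I am using. As far as I can tell the function is declared in nvdsinfer_custom_impl.h with the NvDsInferEngineCreateCustomFunc signature; treat the exact prototype as my reading of the header, not official documentation:

```cpp
#include "NvInfer.h"
#include "nvdsinfer_context.h"
#include "nvdsinfer_custom_impl.h"   // declares NvDsInferCudaEngineGetFromTltModel

// Sketch of the build call, mirroring what gst-nvinfer's model builder does.
bool buildTltEngine(const NvDsInferContextInitParams &initParams,
                    nvinfer1::ILogger &logger,
                    nvinfer1::ICudaEngine *&engine)
{
    nvinfer1::IBuilder *builder = nvinfer1::createInferBuilder(logger);
    nvinfer1::IBuilderConfig *config = builder->createBuilderConfig();

    // networkMode 0 -> FP32, hence kFLOAT here.
    bool ok = NvDsInferCudaEngineGetFromTltModel(
        builder, config, &initParams, nvinfer1::DataType::kFLOAT, engine);

    delete config;    // TRT 8: plain delete replaces the deprecated destroy()
    delete builder;
    return ok && engine != nullptr;
}
```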

You can compare the contents of NvDsInferContextInitParams in gst-nvinfer and in your implementation to look for differences.

Compare them how? This is not Python, so I cannot just call print(). Even if I print every field, not all of them are used by that function, so comparing the full contents is not meaningful.
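(That said, for the handful of TLT-related fields it is doable: a small dump helper called from both gst-nvinfer and my code, limited to the fields that plausibly matter for the build. The helper is illustrative; the field names are the ones from the list above:)

```cpp
#include <cstdio>
#include "nvdsinfer_context.h"

// Hypothetical helper: print only the init-params fields relevant to the TLT
// engine build so the two code paths can be diffed side by side.
static void dumpInitParams(const char *tag, const NvDsInferContextInitParams &p)
{
    std::printf("[%s] gpuID=%u useDLA=%d dlaCore=%d networkMode=%d\n",
                tag, p.gpuID, p.useDLA, p.dlaCore,
                static_cast<int>(p.networkMode));
    std::printf("[%s] etlt='%s' key='%s' dims=%ux%ux%u int8Cal='%s'\n",
                tag, p.tltEncodedModelFilePath, p.tltModelKey,
                p.inferInputDims.c, p.inferInputDims.h, p.inferInputDims.w,
                p.int8CalibrationFilePath);
    for (unsigned int i = 0; i < p.numOutputLayers; ++i)
        std::printf("[%s] outputLayerNames[%u]='%s'\n",
                    tag, i, p.outputLayerNames[i]);
}
```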

Should I understand that no one at NVIDIA knows how that plugin works anymore? If so, please make NvDsInferCudaEngineGetFromTltModel open source and I will come back with an answer myself.

Opaque APIs need good documentation. Telling people to read the source code to figure out how to use them is not acceptable.

The code is C/C++, not Python. It is proprietary and will not be open-sourced.

gst-nvinfer has already shown how to use the APIs.
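In outline, the part of gst-nvinfer that writes the built engine to disk boils down to something like this (a sketch, not the plugin's exact code; a truncated or text-mode write here would later produce the magic-tag error):

```cpp
#include <fstream>
#include <string>
#include "NvInfer.h"

// Serialize the freshly built engine and write it out in binary mode and in
// full; anything less makes a later deserializeCudaEngine() reject the file.
bool saveEngine(nvinfer1::ICudaEngine &engine, const std::string &path)
{
    nvinfer1::IHostMemory *blob = engine.serialize();
    if (!blob)
        return false;

    std::ofstream out(path, std::ios::binary);
    out.write(static_cast<const char *>(blob->data()), blob->size());
    bool ok = out.good();

    delete blob;   // TRT 8: replaces blob->destroy()
    return ok;
}
```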

Yes, you are completely right. It does show how to use the API, over thousands of lines of code that seem to have been written by an intern who has just discovered OOP.

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one.
Thanks

If you want a simple sample for building an engine, you can refer to the TensorRT samples: TensorRT/samples at main · NVIDIA/TensorRT · GitHub
