Hi, I would like to clarify something about the TensorRT APIs.
When converting a TensorFlow model, there can be two output files:
a UFF file from “uff.from_tensorflow()” and an engine file from “trt.utils.write_engine_to_file()” (see the code below).
Which one is used by the C++ API?
import uff
import tensorrt as trt
from tensorrt.parsers import uffparser

# Logger used when building the engine.
G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

# Convert the frozen TensorFlow graph to UFF and also write it to disk.
uff_model = uff.from_tensorflow(output_graph, output_names, output_filename='abc.uff')  # UFF file

parser = uffparser.create_uff_parser()
parser.register_input(...)
parser.register_output(...)

# Build a TensorRT engine from the UFF model (max batch size 10, 1 MiB workspace).
engine = trt.utils.uff_to_trt_engine(G_LOGGER, uff_model, parser, 10, 1 << 20)
parser.destroy()

# Serialize the engine to disk.
trt.utils.write_engine_to_file("abc.engine", engine.serialize())  # engine file
In “sampleUffMNIST.cpp”, there is this line:
ICudaEngine* engine = loadModelAndCreateEngine("lenet5.uff", maxBatchSize, parser);
Does this mean the C++ API uses UFF files, not engine files?
Is there a C++ API to load an engine file and use it for inference?
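For context, here is my understanding of how an engine file might be loaded with the C++ runtime API, based on my reading of the samples. This is only a sketch; the file name “abc.engine” just matches my Python snippet above, and please correct me if the calls are wrong:

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include "NvInfer.h"

// Sketch: read a serialized engine file from disk and deserialize it.
// No UFF parser should be needed at this point, if I understand correctly.
nvinfer1::ICudaEngine* loadEngine(const std::string& path,
                                  nvinfer1::ILogger& logger)
{
    // Read the whole serialized engine into memory.
    std::ifstream file(path, std::ios::binary);
    std::stringstream buffer;
    buffer << file.rdbuf();
    std::string blob = buffer.str();

    // Deserialize the engine with the runtime API.
    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
    return runtime->deserializeCudaEngine(blob.data(), blob.size(), nullptr);
}
```

Is this the intended way to skip the UFF parsing step on subsequent runs?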