TensorRT 3.0.2 API clarification

Hi, I would like some clarification on the TensorRT APIs.

When converting from a TensorFlow model, there can be two outputs:
a UFF file from “uff.from_tensorflow()” and an engine file from “trt.utils.write_engine_to_file()” (see the code below).

Which one is used by the C++ API?

import uff
import tensorrt as trt
from tensorrt.parsers import uffparser

G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)  # console logger for TensorRT messages
uff_model = uff.from_tensorflow(output_graph, output_names, output_filename='abc.uff')  # UFF
parser = uffparser.create_uff_parser()
parser.register_input(...)   # input tensor name and dims
parser.register_output(...)  # output tensor name
engine = trt.utils.uff_to_trt_engine(G_LOGGER, uff_model, parser, 10, 1 << 20)  # max batch size 10, 1 MiB workspace
parser.destroy()
trt.utils.write_engine_to_file("abc.engine", engine.serialize())  # engine

In “sampleUffMNIST.cpp”, it says,

ICudaEngine* engine = loadModelAndCreateEngine("lenet5.uff", maxBatchSize, parser);

Does this mean the C++ API uses UFF files, not engine files?
Is there a C++ API to load an engine file and use it for inference?

Hi,

A UFF file is a data format that describes an execution graph for a DNN.
A PLAN is the serialized data that the runtime engine uses to execute the network.
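
To make the distinction concrete: in the C++ API, serializing a built engine yields the PLAN bytes, which you can then write to disk. A minimal sketch (the file name here is arbitrary):

// Serialize a built engine into a PLAN and write it to disk (sketch).
#include <fstream>
#include "NvInfer.h"

void writePlan(nvinfer1::ICudaEngine& engine)
{
    nvinfer1::IHostMemory* plan = engine.serialize();  // PLAN bytes
    std::ofstream out("abc.engine", std::ios::binary);
    out.write(static_cast<const char*>(plan->data()), plan->size());
    plan->destroy();
}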

Check our documentation for more information:
Developer Guide :: NVIDIA Deep Learning TensorRT Documentation

The C++ API can launch TensorRT from both a UFF model and a PLAN.
But please note that a PLAN file is hardware dependent:
you can’t launch a PLAN built for a compute capability 6.1 GPU on a compute capability 6.2 device.
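
To load a saved PLAN with the C++ API, read the file back and deserialize it with IRuntime::deserializeCudaEngine(). A minimal sketch against the TensorRT 3.x C++ API (the file name matches the Python snippet above; error checking and device-buffer setup omitted):

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include "NvInfer.h"

using namespace nvinfer1;

// Minimal logger required by the TensorRT runtime.
class Logger : public ILogger
{
    void log(Severity severity, const char* msg) override
    {
        if (severity != Severity::kINFO)
            std::cout << msg << std::endl;
    }
} gLogger;

int main()
{
    // Read the serialized PLAN produced by write_engine_to_file().
    std::ifstream file("abc.engine", std::ios::binary);
    std::stringstream buffer;
    buffer << file.rdbuf();
    std::string plan = buffer.str();

    // Deserialize the PLAN and create an execution context for inference.
    IRuntime* runtime = createInferRuntime(gLogger);
    ICudaEngine* engine = runtime->deserializeCudaEngine(plan.data(), plan.size(), nullptr);
    IExecutionContext* context = engine->createExecutionContext();

    // ... copy inputs to device buffers, then call context->execute(batchSize, buffers) ...

    context->destroy();
    engine->destroy();
    runtime->destroy();
    return 0;
}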

Thanks.