Hi,
I want to run inference on my Xavier.
I did the following:
On my workstation:
- build a frozen graph (with tensorflow.tools.graph_transforms)
- generate a UFF file (with uff.from_tensorflow)
- transfer this UFF file from my workstation to the Xavier
- On the Xavier, convert the UFF file to a plan file: parse the UFF and build/serialize the engine with the following C++ code:
if (!parser->parse(uffFilename.c_str(), *network, dataType))
{
    cout << "Failed to parse UFF\n";
    builder->destroy();
    parser->destroy();
    network->destroy();
    return 1;
}

/* build engine */
if (dataType == DataType::kHALF)
    builder->setHalf2Mode(true);
builder->setMaxBatchSize(maxBatchSize);
builder->setMaxWorkspaceSize(maxWorkspaceSize);
ICudaEngine *engine = builder->buildCudaEngine(*network);
if (!engine)
{
    cout << "Failed to build engine\n";
    return 1;
}

/* serialize engine and write to file (binary mode, since the plan is raw bytes) */
IHostMemory *serializedEngine = engine->serialize();
ofstream planFile(planFilename, ios::out | ios::binary);
planFile.write(static_cast<char *>(serializedEngine->data()), serializedEngine->size());
planFile.close();

/* clean up */
builder->destroy();
parser->destroy();
network->destroy();
engine->destroy();
serializedEngine->destroy();
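For context, once I have an ICudaEngine on the Xavier I run inference on it roughly like this (a simplified sketch: inputDev and outputDev are placeholders for device buffers I allocate with cudaMalloc, and "input"/"output" stand in for my actual binding names):

/* sketch: run inference with a built engine (simplified, error checking omitted) */
IExecutionContext *context = engine->createExecutionContext();
void *bindings[2];
bindings[engine->getBindingIndex("input")] = inputDev;   /* placeholder binding name */
bindings[engine->getBindingIndex("output")] = outputDev; /* placeholder binding name */
context->execute(1, bindings); /* synchronous execution, batch size 1 */
context->destroy();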
I have understood that UFF will be deprecated.
My question is: how do I now convert the model with the TF-TRT library on my workstation, and then build the engine on the Xavier?
I have tried the following:
From my workstation:
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

# self.frozen is my frozen GraphDef; self.output_layers_name is the list of output node names
with tf.Graph().as_default() as tf_graph:
    with my_sess as tf_sess:  # my_sess is my existing tf.Session
        graph_size = len(self.frozen.SerializeToString())
        num_nodes = len(self.frozen.node)
        frozen_graph = trt.create_inference_graph(
            input_graph_def=self.frozen,
            outputs=self.output_layers_name,
            max_batch_size=1,
            max_workspace_size_bytes=0,
            precision_mode='FP32',
            minimum_segment_size=2,
            is_dynamic_op=True,
            maximum_cached_engines=100)

# write the serialized segment of each TRTEngineOp node to its own .plan file
for n in frozen_graph.node:
    if n.op == "TRTEngineOp":
        print("Node: %s, %s" % (n.op, n.name.replace("/", "_")))
        with tf.gfile.GFile("%s.plan" % (n.name.replace("/", "_")), 'wb') as f:
            f.write(n.attr["serialized_segment"].s)
    else:
        print("Exclude Node: %s, %s" % (n.op, n.name.replace("/", "_")))
Now that I have this .plan file, how do I build the TensorRT engine on the Xavier?
I have read the documentation:
“The data in the .plan file can then be provided to IRuntime::deserializeCudaEngine to use the engine in TensorRT.”
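Based on that, I assume the loading code on the Xavier would look roughly like this (an untested sketch: "model.plan" is a placeholder filename, and gLogger stands for my ILogger implementation):

#include <fstream>
#include <sstream>
#include "NvInfer.h"
using namespace std;
using namespace nvinfer1;

/* sketch: read the plan file and deserialize it into an engine (untested, error checking omitted) */
ifstream planFile("model.plan", ios::in | ios::binary); /* placeholder filename */
stringstream planBuffer;
planBuffer << planFile.rdbuf();
string plan = planBuffer.str();

IRuntime *runtime = createInferRuntime(gLogger); /* gLogger: my ILogger implementation */
ICudaEngine *engine = runtime->deserializeCudaEngine(
    (void *)plan.data(), plan.size(), nullptr);

Is this deserialization call enough for a plan extracted from a TRTEngineOp, or are more steps needed?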
Could you please help me with the steps needed to use the engine in TensorRT?
Thank you.