TensorFlow Iterator and TensorRT not optimizing correctly


I trained a neural network as follows:

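import tensorflow as tf

# `datain` (the input dataset), `labelfiles` and `record_bytes` are defined elsewhere; not shown in this post
labels = tf.data.FixedLengthRecordDataset(labelfiles, record_bytes)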
dataset = tf.data.Dataset.zip((datain, labels))
dataset = dataset.prefetch(128)
dataset = dataset.shuffle(16)
dataset = dataset.repeat(10)
iterator = dataset.make_initializable_iterator()

sess = tf.Session()
[batch_x, batch_y] = iterator.get_next()

# Define model function (let's not debate model except as relevant to question)
def model_fn(xin):
    x0 = tf.transpose(tf.reshape(xin, [...], name='input'))
    w = tf.Variable(tf.truncated_normal([...], stddev=0.1))
    x1 = tf.nn.conv2d(x0, w, strides=[...], padding='VALID')
    b = tf.Variable(tf.constant(0.0, shape=[...]))
    x2 = tf.nn.bias_add(x1, b)
    x3 = tf.nn.relu(x2, name='output')
    return x3

# Setup training environment
model = model_fn(batch_x)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=model, labels=batch_y))
optimizer = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(loss)

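# Train Model
sess.run(tf.global_variables_initializer())
sess.run(iterator.initializer)
while True:
    try:
        sess.run(optimizer)
    except tf.errors.OutOfRangeError:
        # The dataset is exhausted after the repeated epochs
        break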

# Save model
saver = tf.train.Saver(name='saver')
saver.save(sess, 'temp/path')

I then convert the saved model to a UFF:

from tensorflow.python.tools import optimize_for_inference_lib
import uff

# You can feed data to the IteratorGetNext node using feed_dict
input_node_name = 'iterator_scope_name/IteratorGetNext'
output_node_name = 'model_scope_name/output'

# Freeze model and create a UFF file:
graph_def = sess.graph.as_graph_def() # Get the graph as a GraphDef protobuf
frozen_graph_def = tf.graph_util.convert_variables_to_constants(sess,
    graph_def, [output_node_name])
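# The trailing arguments of both calls below were cut off in the original post.
# The placeholder dtype (float32) is an assumption.
opt_graph_def = optimize_for_inference_lib.optimize_for_inference(
    frozen_graph_def, [input_node_name], [output_node_name],
    tf.float32.as_datatype_enum)
uff.from_tensorflow(opt_graph_def, [output_node_name], quiet=False)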

The problem appears when I convert the UFF to a PLAN file: the shuffle operation from the iterator still seems to be present, and the first layer is not optimized. Here is the build output for that layer:

--------------- Timing reshape/Reshape + reshape/transpose + (Unnamed Layer* 2) [Shuffle] + (Unnamed Layer* 3) Shuffle
Tactic 0 is the only option, timing skipped

What am I doing wrong that lets the shuffle operations remain in the graph? Inference still works properly, but I would like every layer to be optimized if possible.


Can you please provide a sample UFF file that exhibits this issue? It would help greatly in debugging it.
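

One possible reading of the log, offered as a hedged note rather than a confirmed diagnosis: the fused layer is named reshape/Reshape + reshape/transpose, which suggests it comes from the tf.reshape and tf.transpose calls in model_fn rather than from dataset.shuffle. TensorRT uses the name Shuffle for the layer type that performs reshape and transpose operations. The sketch below is plain Python, not TensorFlow or TensorRT code, and the helper names reshape_then_transpose and flatten are hypothetical. It only illustrates that a reshape followed by a transpose moves elements around without computing anything:

```python
def reshape_then_transpose(flat, rows, cols):
    """Reshape a flat list into rows x cols, then transpose it to cols x rows."""
    if len(flat) != rows * cols:
        raise ValueError("flat has %d elements, expected %d" % (len(flat), rows * cols))
    # Reshape: cut the flat list into `rows` consecutive chunks of length `cols`.
    matrix = [flat[r * cols:(r + 1) * cols] for r in range(rows)]
    # Transpose: swap the row and column indices.
    return [[matrix[r][c] for r in range(rows)] for c in range(cols)]


def flatten(matrix):
    """Return the elements of a nested list in row-major order."""
    return [value for row in matrix for value in row]


data = list(range(6))
out = reshape_then_transpose(data, rows=2, cols=3)

# The values are unchanged; only their positions differ.
print(out)
print(sorted(flatten(out)) == data)
```

If that reading is right, the Shuffle entries in the log would reflect data layout changes inside the model itself, not leftover tf.data operations.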