Hi Nvidia,
There is a bug in TensorRT: when the output height or width of a conv2d layer somewhere in the model becomes odd (rather than even), the final dense layer causes TensorRT to crash at build time, i.e. when I run trt.utils.uff_to_trt_engine.
Here is the code:
config = {
    'x_width': 256,
    'x_height': 40,
    'num_channels': 1,
    'num_classes': 3,
}

def D(x, is_training=False, reuse=True):
    with tf.variable_scope('Disc', reuse=reuse) as scope:
        # layer1
        conv1 = tf.layers.conv2d(x, 32, [5, 5],
                                 strides=[2, 2],
                                 padding='same',
                                 data_format='channels_first')
        lrelu1 = tf.maximum(0.2 * conv1, conv1)  # <= WON'T CRASH if you use this as output when building TensorRT
        # layer2
        conv2 = tf.layers.conv2d(lrelu1, 64, [3, 3],
                                 strides=[2, 2],
                                 padding='same',
                                 data_format='channels_first')
        batch_norm2 = tf.layers.batch_normalization(conv2, training=is_training, axis=1)
        lrelu2 = tf.maximum(0.2 * batch_norm2, batch_norm2)  # <= WON'T CRASH if you use this as output when building TensorRT
        # layer3
        conv3 = tf.layers.conv2d(lrelu2, 128, [2, 2],
                                 strides=[2, 2],
                                 padding='same',
                                 data_format='channels_first')
        batch_norm3 = tf.layers.batch_normalization(conv3, training=is_training, axis=1)
        lrelu3 = tf.maximum(0.2 * batch_norm3, batch_norm3)  # <= WON'T CRASH if you use this as output when building TensorRT
        # layer4
        conv4 = tf.layers.conv2d(lrelu3, 256, [2, 2],
                                 strides=[2, 2],
                                 padding='same',
                                 data_format='channels_first')
        lrelu4 = tf.maximum(0.2 * conv4, conv4)  # <= WON'T CRASH if you use this as output when building TensorRT
        # layer5
        shape = lrelu4.get_shape().as_list()
        flatten_length = shape[1] * shape[2] * shape[3]
        flatten5 = tf.reshape(lrelu4, (-1, flatten_length))  # <= WON'T CRASH if you use this as output when building TensorRT
        fc5 = tf.layers.dense(flatten5, config['num_classes'])
        output = tf.nn.softmax(fc5)  # <= CRASHES if you use this as output when building TensorRT
        assert output.get_shape()[1:] == [config['num_classes']]
        return output
Please pay attention to the comments in the code.
If you change config['x_height'] to 48, it no longer crashes and the engine builds!
When config['x_height'] = 40, the output of lrelu3 becomes [?, 128, 5, 32] (its height is odd). This doesn't cause a problem if you use lrelu3 as the output when building the TensorRT model, but if you use the output of the last layer, it crashes.
When config['x_height'] = 48, the output heights of the first three conv layers are all even (24, 12, 6), so you can build the TensorRT model without any drama!
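For reference, the per-layer heights follow directly from the 'same'-padding arithmetic for a stride-2 conv, out = ceil(in / stride). A small sketch (the helper names are mine, not part of the model code) reproducing the heights quoted above:

```python
import math

def same_pad_out(size, stride=2):
    # TensorFlow 'same' padding with stride s gives out = ceil(in / s)
    return math.ceil(size / stride)

def layer_heights(h, num_layers=4):
    # Height after each of the four stride-2 convs in D()
    heights = []
    for _ in range(num_layers):
        h = same_pad_out(h)
        heights.append(h)
    return heights

print(layer_heights(40))  # [20, 10, 5, 3] -> odd height (5) already after conv3
print(layer_heights(48))  # [24, 12, 6, 3] -> even heights through conv3
```

With x_height = 40 the height is 5 after conv3, matching the [?, 128, 5, 32] shape of lrelu3; with x_height = 48 the first odd height only appears after conv4.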
I tested this with a few other architectures and got the same result…