kernel crashes after executing trt.utils.uff_to_trt_engine

Hi,

I have a bit of a problem. When I execute:

engine = trt.utils.uff_to_trt_engine(G_LOGGER, uff_model, parser, 19, 1 << 20)  # max batch size = 19, workspace = 1 << 20 bytes (1 MB)

in a Jupyter notebook, the kernel crashes. When I run the .py file, I get this error:

python3: Network.h:104: virtual nvinfer1::DimsHW nvinfer1::NetworkDefaultConvolutionFormula::compute(nvinfer1::DimsHW, nvinfer1::DimsHW, nvinfer1::DimsHW, nvinfer1::DimsHW, nvinfer1::DimsHW, const char*): Assertion `(input.w() + padding.w() * 2) >= dkw && "Image width with padding must always be at least the width of the dilated filter."' failed.
Aborted (core dumped)

What is dkw in the assertion `(input.w() + padding.w() * 2) >= dkw`?

My conv has the following format:

conv1 = tf.layers.conv2d(x_t, 32, [7, 3],
                         strides=[2, 2],
                         padding='same',
                         data_format='channels_first',
                         kernel_initializer=tf.initializers.variance_scaling(0.005))

I don't have any problems building and training the network, but it seems I have an issue with TensorRT! Any suggestions?! Is there something that isn't supported?!

Thanks…

Hi,

Could you share the size information of x_t?

From the log, TensorRT requires (img_width + padding*2) >= dilated_filter_width.
In your use case, please make sure the (padded) width of x_t is at least the filter width of 7 to avoid this error.
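
To illustrate, here is a rough sketch of the check in Python (my own approximation, not the actual TensorRT source), where dkw is the dilated kernel width:

# Approximate form of the failing assertion; dkw is the dilated kernel
# width, i.e. dilation * (kernel_w - 1) + 1.
def conv_width_ok(input_w, padding_w, kernel_w, dilation_w=1):
    dkw = dilation_w * (kernel_w - 1) + 1
    return (input_w + padding_w * 2) >= dkw

print(conv_width_ok(input_w=256, padding_w=0, kernel_w=7))  # True: 256 >= 7
print(conv_width_ok(input_w=4,   padding_w=1, kernel_w=7))  # False: 4 + 2 < 7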

Thanks.

Hi AastaLLL,

I don’t see the problem!

Here is what I have:

First layer:

conv1 = tf.layers.conv2d(x_t, 32, [3, 7],          # x_t is [?, 1, 40, 256]
                         strides=[2, 2],
                         padding='same',
                         data_format='channels_first',
                         kernel_initializer=tf.initializers.variance_scaling(config['kernel_init_scale']))

Second layer:

conv2 = tf.layers.conv2d(lrelu1, 64, [2, 5],       # lrelu1 is [?, 32, 20, 128]
                         strides=[2, 2],
                         padding='same',
                         data_format='channels_first',
                         kernel_initializer=tf.initializers.variance_scaling(config['kernel_init_scale']))

Third layer:

conv3 = tf.layers.conv2d(lrelu2, 128, [3, 3],      # lrelu2 is [?, 64, 10, 64]
                         strides=[2, 2],
                         padding='same',
                         data_format='channels_first',
                         kernel_initializer=tf.initializers.variance_scaling(config['kernel_init_scale']))

Fourth layer:

conv4 = tf.layers.conv2d(lrelu3, 256, [2, 2],      # lrelu3 is [?, 128, 5, 32]
                         strides=[2, 2],
                         padding='same',
                         data_format='channels_first',
                         kernel_initializer=tf.initializers.variance_scaling(config['kernel_init_scale']))

Fifth layer:

conv5 = tf.layers.conv2d(lrelu4, 256, [2, 2],      # lrelu4 is [?, 256, 3, 16]
                         strides=[2, 2],
                         padding='same',
                         data_format='channels_first',
                         kernel_initializer=tf.initializers.variance_scaling(config['kernel_init_scale']))

Sixth layer:

fc6 = tf.layers.dense(flatten6, (config['num_classes'] + 1),   # flatten6 is [?, 256*2*8]
                      kernel_initializer=tf.initializers.variance_scaling(config['kernel_init_scale']))
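
For what it's worth, the commented shapes check out: with padding='same' and stride 2, every spatial dimension just becomes ceil(dim / 2), regardless of kernel size. A quick sketch to verify:

import math

# Sanity check of the shapes in the comments above: with padding='same'
# and stride 2, each spatial dim becomes ceil(dim / 2).
h, w = 40, 256
for layer in range(1, 6):
    h, w = math.ceil(h / 2), math.ceil(w / 2)
    print('after conv%d: H=%d, W=%d' % (layer, h, w))
# after conv1: H=20, W=128
# after conv2: H=10, W=64
# after conv3: H=5,  W=32
# after conv4: H=3,  W=16
# after conv5: H=2,  W=8   -> flatten6 is [?, 256*2*8]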

Is this a bug in the TensorRT code?! Can someone please reply?! Thanks!

Hi,

We have tested a simple network with your first layer, and TensorRT can create the engine successfully without the error you mentioned.

Could you share your uffparser configuration with us?

Here is our testing code for your reference:

import tensorflow as tf
import uff
import tensorrt as trt
from tensorrt.parsers import uffparser


MAX_WORKSPACE = 1 << 20
MAX_BATCHSIZE = 1
G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.INFO)

# Build a minimal graph containing only the first convolution layer.
inputs = tf.placeholder(dtype=tf.float32, shape=[1, 1, 40, 256])
output = tf.layers.conv2d(inputs, 32, [3, 7],
                          strides=[2, 2],
                          padding='same',
                          data_format='channels_first')
output = tf.nn.relu(output, name='out')

# Freeze the graph and strip training-only nodes before UFF conversion.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    graphdef = tf.get_default_graph().as_graph_def()
    frozen_graph = tf.graph_util.convert_variables_to_constants(sess, graphdef, ['out'])
    tf_model = tf.graph_util.remove_training_nodes(frozen_graph)

uff_model = uff.from_tensorflow(tf_model, ['out'])

# Register the input (in CHW order) and output, then build the engine.
parser = uffparser.create_uff_parser()
parser.register_input("Placeholder", (1, 40, 256), 0)
parser.register_output("out")

engine = trt.utils.uff_to_trt_engine(G_LOGGER, uff_model, parser, MAX_BATCHSIZE, MAX_WORKSPACE, trt.infer.DataType.FLOAT)

Thanks.

Hi AastaLLL,

Thank you for the test code. I ran your code and it worked on my machine. Then I gradually changed your code into my code and accidentally figured out what was wrong. My model receives input in NHWC format, and I use tf.transpose to convert it to NCHW before passing it to the first convolution layer. Because of that, I was using:

parser.register_input('Placeholder', (config['x_height'],
                                      config['x_width'],
                                      config['num_channels']), 0)

and the code was crashing. Then I had a look at https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/topics/topics/pkg_ref/parsers.html

and figured out that register_input() should always be given the input dimensions in CHW format, even if the network input is in HWC format in the original framework! So, by changing that to:

parser.register_input('Placeholder', (config['num_channels'],
                                      config['x_height'],
                                      config['x_width']), 0)

now it works!
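
One related gotcha, if I understand the docs correctly: register_input() only declares the dimensions in CHW order, so the buffer you feed the engine at inference time also has to be laid out in CHW. A minimal sketch with made-up example data:

import numpy as np

# Hypothetical HWC input, e.g. (height=40, width=256, channels=1).
img_hwc = np.random.rand(40, 256, 1).astype(np.float32)
# Transpose to CHW and make it contiguous before copying it to the engine.
img_chw = np.ascontiguousarray(np.transpose(img_hwc, (2, 0, 1)))  # (1, 40, 256)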

However, I think it would be good if you added a note about this to https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/topics/topics/workflows/tf_to_tensorrt.html, since it's a very surprising assumption and that's probably the first page people visit.

Thanks again for your help. Much appreciated!

Thanks for the feedback.
Good to know it works now. :)