TensorRT Half2 Accuracy Issue

Hi,

I am trying to run a VGG16 model on a TX1 with TensorRT 1.0. Starting from the built-in GIE tutorial, I modified the sample for my own use. It works well with DataType::kFLOAT, but when I switch to DataType::kHALF to speed up inference, it fails an assertion that looks like a dimension mismatch.

Error message below:

caffe_parser: cudnnReformatLayer.cpp:31: virtual void nvinfer1::cudnn::ReformatLayer::execute(const nvinfer1::cudnn::CommonContext&): Assertion `in.getDimensions() == out.getDimensions()' failed.
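
For context, my build code follows the GIE sampleGoogleNet pattern. Below is a minimal sketch of the kHALF path, not my exact code; the logger, function name, and file paths are placeholders:

#include <iostream>
#include "NvInfer.h"
#include "NvCaffeParser.h"

using namespace nvinfer1;
using namespace nvcaffeparser1;

// Minimal logger, as in the GIE samples.
class Logger : public ILogger
{
    void log(Severity severity, const char* msg) override
    {
        if (severity != Severity::kINFO)
            std::cout << msg << std::endl;
    }
} gLogger;

ICudaEngine* buildEngine(const char* deployFile, const char* modelFile, bool useHalf2)
{
    IBuilder* builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();
    ICaffeParser* parser = createCaffeParser();

    // Parse the weights as fp16 when targeting Half2 mode.
    DataType modelDataType = useHalf2 ? DataType::kHALF : DataType::kFLOAT;
    const IBlobNameToTensor* blobNameToTensor =
        parser->parse(deployFile, modelFile, *network, modelDataType);

    // Mark both feature maps as network outputs.
    network->markOutput(*blobNameToTensor->find("Fea_P3"));
    network->markOutput(*blobNameToTensor->find("Fea_P4"));

    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(16 << 20);
    if (useHalf2)
        builder->setHalf2Mode(true);  // paired-fp16 execution

    ICudaEngine* engine = builder->buildCudaEngine(*network);
    network->destroy();
    parser->destroy();
    builder->destroy();
    return engine;
}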

Can someone help?

Thanks.
Leo

An interesting finding!

I built a smaller model that is a subset of my original one, and it runs in Half2 mode as long as only one output is marked; marking two outputs reproduces the assertion. Is this a bug?
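
A hypothetical helper to show the difference (same network and blobNameToTensor as in the build sketch in my first post): with markBoth == false the Half2 engine builds and runs; with markBoth == true the ReformatLayer assertion fires.

#include "NvInfer.h"
#include "NvCaffeParser.h"

// Sketch only: toggle between the working single-output build and the
// failing two-output build in Half2 mode.
void markOutputs(nvinfer1::INetworkDefinition& network,
                 const nvcaffeparser1::IBlobNameToTensor& blobs,
                 bool markBoth)
{
    network.markOutput(*blobs.find("Fea_P4"));      // one output: works
    if (markBoth)
        network.markOutput(*blobs.find("Fea_P3"));  // second output: assertion
}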

Regards.
Leo

Hi,

It looks like your model uses something that is not supported in FP16 mode.
Could you paste your prototxt file so we can debug it?

Thanks.

name: "VGG_ILSVRC_16_layer_1"
input: "data"
input_shape {
  dim: 1
  dim: 3
  dim: 321
  dim: 321
}
layer {
  bottom: "data"
  top: "conv1_1"
  name: "conv1_1"
  type: "Convolution"
  convolution_param {
    num_output: 21
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv1_1"
  top: "conv1_1"
  name: "relu1_1"
  type: "ReLU"
}
layer {
  bottom: "conv1_1"
  top: "conv1_2"
  name: "conv1_2"
  type: "Convolution"
  convolution_param {
    num_output: 26
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv1_2"
  top: "conv1_2"
  name: "relu1_2"
  type: "ReLU"
}
layer {
  bottom: "conv1_2"
  top: "pool1"
  name: "pool1"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool1"
  top: "conv2_1"
  name: "conv2_1"
  type: "Convolution"
  convolution_param {
    num_output: 22
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv2_1"
  top: "conv2_1"
  name: "relu2_1"
  type: "ReLU"
}
layer {
  bottom: "conv2_1"
  top: "conv2_2"
  name: "conv2_2"
  type: "Convolution"
  convolution_param {
    num_output: 28
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv2_2"
  top: "conv2_2"
  name: "relu2_2"
  type: "ReLU"
}
layer {
  bottom: "conv2_2"
  top: "pool2"
  name: "pool2"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool2"
  top: "conv3_1"
  name: "conv3_1"
  type: "Convolution"
  convolution_param {
    num_output: 24
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv3_1"
  top: "conv3_1"
  name: "relu3_1"
  type: "ReLU"
}
layer {
  bottom: "conv3_1"
  top: "conv3_2"
  name: "conv3_2"
  type: "Convolution"
  convolution_param {
    num_output: 17
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv3_2"
  top: "conv3_2"
  name: "relu3_2"
  type: "ReLU"
}
layer {
  bottom: "conv3_2"
  top: "conv3_3"
  name: "conv3_3"
  type: "Convolution"
  convolution_param {
    num_output: 13
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv3_3"
  top: "conv3_3"
  name: "relu3_3"
  type: "ReLU"
}
layer {
  bottom: "conv3_3"
  top: "pool3"
  name: "pool3"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool3"
  top: "Fea_P3"
  name: "Fea_P3"
  type: "Convolution"
  convolution_param {
    num_output: 18
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "Fea_P3"
  top: "Fea_P3"
  name: "relu_Fea_P3"
  type: "ReLU"
}
layer {
  bottom: "pool3"
  top: "conv4_1"
  name: "conv4_1"
  type: "Convolution"
  convolution_param {
    num_output: 33
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv4_1"
  top: "conv4_1"
  name: "relu4_1"
  type: "ReLU"
}
layer {
  bottom: "conv4_1"
  top: "conv4_2"
  name: "conv4_2"
  type: "Convolution"
  convolution_param {
    num_output: 20
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv4_2"
  top: "conv4_2"
  name: "relu4_2"
  type: "ReLU"
}
layer {
  bottom: "conv4_2"
  top: "conv4_3"
  name: "conv4_3"
  type: "Convolution"
  convolution_param {
    num_output: 18
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "conv4_3"
  top: "conv4_3"
  name: "relu4_3"
  type: "ReLU"
}
layer {
  bottom: "conv4_3"
  top: "pool4"
  name: "pool4"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool4"
  top: "Fea_P4"
  name: "Fea_P4"
  type: "Convolution"
  convolution_param {
    num_output: 28
    pad: 1
    kernel_size: 3
  }
}
layer {
  bottom: "Fea_P4"
  top: "Fea_P4"
  name: "relu_Fea_P4"
  type: "ReLU"
}

Here I mark both Fea_P3 and Fea_P4 as outputs.

Hi,

Sorry to keep you waiting.

This issue is fixed in our next release. Please watch for the release announcement and update when it is available.

Thanks.