TensorRT 4: Error converting ONNX to TRT

Hello,

I have converted a Caffe2 Inception model to ONNX, and am now trying to load the model and convert it to TRT:

// configure the parser with the model path
nvonnxparser::IOnnxConfig* config = nvonnxparser::createONNXConfig();
config->setModelFileName(modelONNXFilePath.c_str());

// create parser object
nvonnxparser::IONNXParser* parser = nvonnxparser::createONNXParser(*config);
gLogger = parser->getLogger();

// parse the ONNX file
if (!parser->parse(modelONNXFilePath.c_str(), nvinfer1::DataType::kFLOAT)) {
    std::string msg("failed to parse onnx file");
    gLogger->log(nvinfer1::ILogger::Severity::kERROR, msg.c_str());
    exit(EXIT_FAILURE);
}

// convert the parsed graph into a TensorRT network
if (!parser->convertToTRTNetwork()) {
    std::string msg("ERROR, failed to convert onnx network into TRT network");
    gLogger->log(nvinfer1::ILogger::Severity::kERROR, msg.c_str());
    exit(EXIT_FAILURE);
}
network = parser->getTRTNetwork();
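
For what it's worth, NvOnnxConfig.h also exposes some verbosity knobs on the config object that I have not explored yet; if they behave as their names suggest, adding something like this before parse() should print per-layer information while parsing (method names taken from the header, untested):

// Untested: raise parser verbosity before calling parse().
config->setPrintLayerInfo(true); // print each layer as it is parsed
config->addVerbosity();          // raise the verbosity level one step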

After running this on the ONNX model I get the error: “Attribute not found: shape”. I googled around for this error, and so far it seems like I am using some layer that is not supported in ONNX. I poked around, and at the end of the network we use a “Dropout” layer. From what I understand, this layer essentially becomes a no-op during inference, so I tried manually removing the op from my Caffe2 protobuf file and re-converting to ONNX, but I am still getting the ‘shape’ error.
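
For reference, the same removal could also be done directly on the ONNX file instead of the Caffe2 pbtxt, using the C++ classes generated from onnx.proto. This is only a sketch: it assumes ONNX was built with its protobuf bindings, and that each Dropout node has a single data input and output:

// Sketch: bypass inference-time no-op Dropout nodes in an ONNX model.
// Assumes onnx/onnx_pb.h is available (ONNX built with C++ protobuf bindings).
// Note: graph outputs that name a Dropout output would also need rewiring.
#include <fstream>
#include <string>
#include <unordered_map>
#include <onnx/onnx_pb.h>

int main(int argc, char** argv) {
    onnx::ModelProto model;
    std::ifstream in(argv[1], std::ios::binary);
    model.ParseFromIstream(&in);

    auto* graph = model.mutable_graph();

    // Map each Dropout output back to its input so consumers can be rewired.
    std::unordered_map<std::string, std::string> bypass;
    for (const auto& node : graph->node())
        if (node.op_type() == "Dropout")
            bypass[node.output(0)] = node.input(0);

    // Rebuild the node list without Dropout, rewiring consumer inputs.
    google::protobuf::RepeatedPtrField<onnx::NodeProto> kept;
    for (auto& node : *graph->mutable_node()) {
        if (node.op_type() == "Dropout") continue;
        for (auto& input : *node.mutable_input()) {
            auto it = bypass.find(input);
            if (it != bypass.end()) input = it->second;
        }
        *kept.Add() = node;
    }
    graph->mutable_node()->Swap(&kept);

    std::ofstream out(argv[2], std::ios::binary);
    model.SerializeToOstream(&out);
    return 0;
}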

So two questions:

  1. How can I improve the debug output so I can see what exactly the error is?

  2. Is there something I am missing for converting this model?

Here is the model, which contains the ‘dropout’ layer: https://drive.google.com/open?id=1GtIEDEl_aZvmnf4hSHRpOz9XryWjKuFr

If needed, I can provide the ONNX model for the version where I manually removed the dropout layer from the pbtxt file.

Thanks for any help!

Quick update: I ran a sample model with the dropout layer completely removed, but am still getting the “Attribute not found: shape” error.

The layers we are using from Caffe2 are:
Conv, Relu, MaxPool, Concat, AveragePool, FC, and Softmax.

Maybe it is the “FC” op that is causing the issue?

As far as I know, in ONNX this gets converted to the following (shown here in protobuf text format):

node {
  input: "pool11_1"
  input: "OC2_DUMMY_1"
  output: "OC2_DUMMY_0"
  op_type: "Reshape"
}
node {
  input: "OC2_DUMMY_0"
  input: "fc_w_0"
  input: "fc_b_0"
  output: "fc_1"
  op_type: "Gemm"
  attribute {
    name: "transB"
    i: 1
    type: INT
  }
  attribute {
    name: "broadcast"
    i: 1
    type: INT
  }
}

After some further digging, I found this issue from 15 days ago: Why is there an additional Reshape layer in LeNet's onnx pb file? · Issue #1025 · onnx/onnx · GitHub

Apparently ONNX creates this Reshape layer to complement the FC layer: Gemm expects a 2-D input, so the 4-D pooling output has to be flattened into a matrix first…
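
To make that concrete, the shape flow as I understand it looks like this (the (N, C, 1, 1) shape for pool11_1 is my assumption, based on it being the output of the final pooling layer):

pool11_1              : (N, C, 1, 1)    output of the last pool
Reshape(OC2_DUMMY_1)  : (N, C)          flatten to 2-D for Gemm
Gemm(fc_w_0, fc_b_0)  : Y = X * W^T + b  ->  (N, num_outputs)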

So most likely I will need to implement a custom layer using the plugin interface. But how can I do this before loading the model, given that convertToTRTNetwork() performs the ONNX conversion internally?
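
The plugin interface itself looks manageable; here is a minimal sketch of a flatten-style plugin against TensorRT 4's nvinfer1::IPlugin (untested, and the open question of hooking it into the ONNX parser remains). Since flattening (C, 1, 1) to (C) does not move any data, enqueue() is just a copy:

#include <cstring>
#include <cuda_runtime.h>
#include <NvInfer.h>

// Minimal sketch of a flatten plugin for TensorRT 4 (nvinfer1::IPlugin).
class FlattenPlugin : public nvinfer1::IPlugin
{
public:
    int getNbOutputs() const override { return 1; }

    nvinfer1::Dims getOutputDimensions(int index, const nvinfer1::Dims* inputs,
                                       int nbInputDims) override
    {
        // Collapse every input dimension into a single axis.
        int n = 1;
        for (int d = 0; d < inputs[0].nbDims; ++d)
            n *= inputs[0].d[d];
        nvinfer1::Dims out;
        out.nbDims = 1;
        out.d[0] = n;
        return out;
    }

    void configure(const nvinfer1::Dims* inputDims, int nbInputs,
                   const nvinfer1::Dims* outputDims, int nbOutputs,
                   int maxBatchSize) override
    {
        // Remember the per-sample byte count for the copy in enqueue().
        mCopySize = sizeof(float);
        for (int d = 0; d < inputDims[0].nbDims; ++d)
            mCopySize *= inputDims[0].d[d];
    }

    int initialize() override { return 0; }
    void terminate() override {}
    size_t getWorkspaceSize(int) const override { return 0; }

    int enqueue(int batchSize, const void* const* inputs, void** outputs,
                void* /*workspace*/, cudaStream_t stream) override
    {
        // Flatten is a no-op on memory layout: copy input to output.
        return cudaMemcpyAsync(outputs[0], inputs[0], batchSize * mCopySize,
                               cudaMemcpyDeviceToDevice, stream) != cudaSuccess;
    }

    size_t getSerializationSize() override { return sizeof(mCopySize); }
    void serialize(void* buffer) override
    {
        std::memcpy(buffer, &mCopySize, sizeof(mCopySize));
    }

private:
    size_t mCopySize = 0;
};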

Edit:

Another quick update: my colleague tried to remove the automatically generated Reshape layer by applying a Squeeze operation to flatten the tensor to match the input expected by the Gemm layer. TensorRT 4 lists Squeeze as supported, but I am getting the following output:

convert_tensor softmax_1
convert_tensor fc_1
convert_tensor fc_input_1
input: "pool11_1"
output: "fc_input_1"
name: ""
op_type: "Squeeze"
attribute {
  name: "axes"
  ints: 2
  ints: 3
  type: INTS
}
terminate called after throwing an instance of 'std::out_of_range'
  what():  No converter registered for op type: Squeeze
E0611 10:47:14.960438   693 main.cpp:62] *** Aborted at 1528728434 (unix time) try "date -d @1528728434" if you are using GNU date ***
E0611 10:47:14.961580   693 main.cpp:62] PC: @     0x7f1f927d0428 gsignal
E0611 10:47:14.961833   693 main.cpp:62] *** SIGABRT (@0x138a000002b5) received by PID 693 (TID 0x7f1fa7af6740) from PID 693; stack trace: ***
E0611 10:47:14.962733   693 main.cpp:62]     @     0x7f1f93417390 (unknown)
E0611 10:47:14.963565   693 main.cpp:62]     @     0x7f1f927d0428 gsignal
E0611 10:47:14.964376   693 main.cpp:62]     @     0x7f1f927d202a abort
E0611 10:47:14.965319   693 main.cpp:62]     @     0x7f1f92e0a84d __gnu_cxx::__verbose_terminate_handler()
E0611 10:47:14.966382   693 main.cpp:62]     @     0x7f1f92e086b6 (unknown)
E0611 10:47:14.967200   693 main.cpp:62]     @     0x7f1f92e08701 std::terminate()
E0611 10:47:14.968183   693 main.cpp:62]     @     0x7f1f92e08919 __cxa_throw
E0611 10:47:14.969714   693 main.cpp:62]     @     0x7f1f99509119 nvonnxparser::Converter::convert_node()
E0611 10:47:14.971324   693 main.cpp:62]     @     0x7f1f9950b1d1 nvonnxparser::Converter::convert_tensor_or_weights()
E0611 10:47:14.976142   693 main.cpp:62]     @     0x7f1f99500dd1 nvonnxparser::convert_Gemm()
E0611 10:47:14.977587   693 main.cpp:62]     @     0x7f1f99508a95 nvonnxparser::Converter::convert_node()
E0611 10:47:14.979106   693 main.cpp:62]     @     0x7f1f9950b1d1 nvonnxparser::Converter::convert_tensor_or_weights()
E0611 10:47:14.980360   693 main.cpp:62]     @     0x7f1f9950405d nvonnxparser::convert_Softmax()
E0611 10:47:14.981259   693 main.cpp:62]     @     0x7f1f99511752 nvonnxparser::parserONNX::convertToTRTNetwork()

Hello,

Any updates since then? I am running into similar issues after a PyTorch-to-ONNX conversion of a ResNet model.