Engine build crashes after parsing a simple "OneHot" model

I’m using JetPack 4.1.1 DVP on a Xavier board running TensorRT 5.0.0.

A TF model I’m trying to integrate into TensorRT requires the “_OneHot” operation, which TensorRT does not support natively. I have written a plugin to implement it, but I cannot get it to work: the engine build fails at the “Formats and tactics selection” step:

../builder/cudnnBuilder2.cpp:795: virtual std::vector<nvinfer1::query::RequirementsCombination> nvinfer1::builder::EngineTacticSupply::getSupportedFormats(const nvinfer1::builder::Node&, const nvinfer1::query::Ports<nvinfer1::query::AbstractTensor>&): Assertion `!formats.empty()' failed.

I created a simple TensorFlow model using a OneHot operation as follows:

import uff
import numpy as np
import tensorflow as tf

DEPTH = 10
ON_VALUE = 1.0
OFF_VALUE = -1.0

x = np.array([[11, 0, 3, 4, 7, 3, 8, 10, 5, 4, 6, 5]])

input = tf.placeholder(tf.int32, shape=x.shape, name='input')
onehot = tf.one_hot(input, DEPTH, on_value=ON_VALUE, off_value=OFF_VALUE, name='output')

with tf.Session() as sess:
    out = sess.run(onehot, feed_dict = {input: x})
    print(out)
    g = tf.get_default_graph()
    print(g.get_operations())

uff_model = uff.from_tensorflow(
    tf.get_default_graph(),
    output_nodes=['output'],
    output_filename='../data/model-onehot.uff',
    text=False,
)

It generates the model and writes the “model-onehot.uff” file properly. I then try to load it in a program using the UFF parser as follows:

static const int INPUT_LENGTH = 12;
static int32_t input[INPUT_LENGTH] = {
  9, 0, 3, 4, 7, 3, 8, 2, 1, 4, 6, 5
};

int main(...)
{
  IRuntime *runtime = nvinfer1::createInferRuntime(gLogger);
  IBuilder* builder = nvinfer1::createInferBuilder(gLogger);
  INetworkDefinition* network = builder->createNetwork();
  auto parser = nvuffparser::createUffParser();

  // parse UFF model
  parser->setPluginFactory(&pluginFactory);
  parser->registerInput("input", Dims2(1, INPUT_LENGTH), UffInputOrder::kNC);
  parser->registerOutput("output");
  parser->parse(MODEL_FILENAME, *network, nvinfer1::DataType::kFLOAT);

  // configure and build engine
  builder->setMaxBatchSize(1);
  builder->setMaxWorkspaceSize(1 << 30);
  ICudaEngine* engine = builder->buildCudaEngine(*network);

...

When I do so, the program aborts inside buildCudaEngine with the assertion error shown above. The OneHot plugin I wrote is mostly empty for now:

OneHotLayer::OneHotLayer() {
}

OneHotLayer::OneHotLayer(const void* buffer, size_t size) {
}

OneHotLayer::OneHotLayer(const Weights* weights, int nbWeights) {
  assert(weights[0].type == DataType::kINT32);
  assert(weights[0].count == 1);
  mDepth = *(reinterpret_cast<const int32_t*>(weights[0].values));

  assert(weights[1].type == DataType::kFLOAT);
  assert(weights[1].count == 1);
  mOnValue = *(reinterpret_cast<const float*>(weights[1].values));

  assert(weights[2].type == DataType::kFLOAT);
  assert(weights[2].count == 1);
  mOffValue = *(reinterpret_cast<const float*>(weights[2].values));
}

Dims OneHotLayer::getOutputDimensions(int index, const Dims* inputs, int nbInputDims) {
  assert((nbInputDims > 0) && (inputs[0].nbDims == 2));

  return Dims3(inputs[0].d[0], inputs[0].d[1], mDepth);
}

int OneHotLayer::initialize() {
  return 0;
}

int OneHotLayer::enqueue(int batchSize, const void* const *inputs, void** outputs, void *workspace,
    cudaStream_t stream) {
  return 0;
}

size_t OneHotLayer::getSerializationSize() {
  return 0;
}

void OneHotLayer::serialize(void* buffer) {
}

void OneHotLayer::configure(const Dims* inputs, int nbInputs, const Dims* outputs, int nbOutputs,
    int maxBatchSize) {
  assert((nbInputs > 0) || (nbOutputs > 0));

  mInputDim = Dims2(inputs[0].d[0], inputs[0].d[1]);
  mOutputDim = Dims3(1, inputs[0].d[1], mDepth);
}
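
For context, here is roughly what I intend enqueue() to do eventually (just a sketch of the plan, assuming the index tensor reaches the plugin as int32 on the device):

// Sketch of the intended one-hot kernel: one thread per input index, each
// thread filling a row of `depth` output values.
__global__ void oneHotKernel(const int32_t* indices, float* out, int count,
                             int depth, float onValue, float offValue) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i >= count) return;
  float* row = out + i * depth;
  for (int d = 0; d < depth; ++d) {
    row[d] = (d == indices[i]) ? onValue : offValue;
  }
}

// Inside OneHotLayer::enqueue(), roughly:
//   int count = batchSize * mInputDim.d[0] * mInputDim.d[1];
//   int threads = 128;
//   int blocks = (count + threads - 1) / threads;
//   oneHotKernel<<<blocks, threads, 0, stream>>>(
//       static_cast<const int32_t*>(inputs[0]),
//       static_cast<float*>(outputs[0]),
//       count, mDepth, mOnValue, mOffValue);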

Please help: any feedback on what this error actually means would be appreciated. Thanks.

Reviewing now. Will keep you updated on what we find.

FYI, I upgraded to TensorRT 5.0.3 but I am still getting the same issue:

UFFParser: parsing input
UFFParser: parsing output/depth
UFFParser: parsing output/on_value
UFFParser: parsing output/off_value
UFFParser: parsing output
UFFParser: Parse Plugin node output
UFFParser: PluginNode Input Descriptor output/depth
UFFParser: Operation of node Const
UFFParser: Attempting to convert Const Op to Weight output/depth
UFFParser: PluginNode Input Descriptor output/on_value
UFFParser: Operation of node Const
UFFParser: Attempting to convert Const Op to Weight output/on_value
UFFParser: PluginNode Input Descriptor output/off_value
UFFParser: Operation of node Const
UFFParser: Attempting to convert Const Op to Weight output/off_value
mDepth=10
mOnValue=1.00
mOffValue=-1.00
UFFParser: parsing MarkOutput_0
Model parsed
******************************
Layers running on DLA:

******************************

******************************
Layers running on GPU:
output/depth, output/on_value, output/off_value, _OneHot_output,
******************************
Original: 4 layers
After dead-layer removal: 4 layers
After scale fusion: 4 layers
After vertical fusions: 4 layers
After swap: 4 layers
After final dead-layer removal: 4 layers
After tensor merging: 4 layers
After concat removal: 4 layers
Graph construction and optimization completed in 0.000765955 seconds.
unit-tests: ../builder/cudnnBuilder2.cpp:834: virtual std::vector<nvinfer1::query::RequirementsCombination> nvinfer1::builder::EngineTacticSupply::getSupportedFormats(const nvinfer1::builder::Node&, const nvinfer1::query::Ports<nvinfer1::query::AbstractTensor>&): Assertion `!formats.empty()' failed.
Aborted

Hello,

Per engineering:
Which plugin interface is this using? IPluginExt or IPluginV2 is recommended, and I don’t see supportsFormat() and configureWithFormat() implemented in the code.
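
For reference, here is a minimal sketch of what those two overrides typically look like for a plugin that only handles FP32 tensors in linear (NCHW) layout; the class name is hypothetical:

// Hypothetical IPluginExt-style plugin; only the two format-related hooks are shown.
bool MyPlugin::supportsFormat(DataType type, PluginFormat format) const {
  // The builder calls this during format/tactic selection; advertise only the
  // combinations the plugin can actually execute.
  return type == DataType::kFLOAT && format == PluginFormat::kNCHW;
}

void MyPlugin::configureWithFormat(const Dims* inputDims, int nbInputs,
                                   const Dims* outputDims, int nbOutputs,
                                   DataType type, PluginFormat format,
                                   int maxBatchSize) {
  // The builder reports the chosen type/format here; record whatever enqueue() needs.
  assert(type == DataType::kFLOAT && format == PluginFormat::kNCHW);
}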

I implemented IPlugin, taking inspiration from the only example I could find: GitHub - AastaNV/Face-Recognition: Demonstrate Plugin API for TensorRT2.1.

I’m using the IUffParser interface, which does not seem to support IPluginV2, only IPlugin and IPluginExt. Should I then implement IPluginExt instead of IPlugin? If so, do you have any examples available?

Thanks a lot for your support.

Hello,

Per engineering, please take a look at TRT samples/samplePlugin, which uses IPluginExt.

regards,
NVES

I modified the plugin and test code to implement the IPluginExt interface, basing it on the sample code you mentioned, even though that sample targets the Caffe parser.

The plugin now looks like this:

class OneHotLayer : public IPluginExt {
 public:
  OneHotLayer(const void* buffer, size_t size);
  OneHotLayer(const Weights* weights, int nbWeights);

  bool supportsFormat(DataType type, PluginFormat format) const override;
  void configureWithFormat(const Dims *inputDims, int nbInputs, const Dims *outputDims, int nbOutputs, DataType type,
      PluginFormat format, int maxBatchSize) override;

  int getNbOutputs() const override;
  Dims getOutputDimensions(int index, const Dims* inputs, int nbInputDims) override;

  int initialize() override;
  inline void terminate() override {}

  inline size_t getWorkspaceSize(int) const override { return 0; }
  int enqueue(int batchSize, const void* const *inputs, void** outputs, void*, cudaStream_t stream) override;

  size_t getSerializationSize() override;
  void serialize(void* buffer) override;

  void configure(const Dims* inputs, int nbInputs, const Dims* outputs, int nbOutputs, int maxBatchSize) override;

 protected:
  Dims2 mInputDim;
  Dims3 mOutputDim;
  int32_t mDepth;
  float mOnValue;
  float mOffValue;
};

OneHotLayer::OneHotLayer(const void* buffer, size_t size) {
  std::cout << "OneHotLayer::OneHotLayer(const void* buffer, size_t size)" << std::endl;
}

OneHotLayer::OneHotLayer(const Weights* weights, int nbWeights) {
  assert(weights[0].type == DataType::kINT32);
  assert(weights[0].count == 1);
  mDepth = *(reinterpret_cast<const int32_t*>(weights[0].values));

  assert(weights[1].type == DataType::kFLOAT);
  assert(weights[1].count == 1);
  mOnValue = *(reinterpret_cast<const float*>(weights[1].values));

  assert(weights[2].type == DataType::kFLOAT);
  assert(weights[2].count == 1);
  mOffValue = *(reinterpret_cast<const float*>(weights[2].values));

  std::cout << "mDepth=" << mDepth << std::endl;
  std::cout << "mOnValue=" << mOnValue << std::endl;
  std::cout << "mOffValue=" << mOffValue << std::endl;
}

int OneHotLayer::getNbOutputs() const {
  return 1;
}

bool OneHotLayer::supportsFormat(DataType type, PluginFormat format) const {
  std::cout << "supportsFormat" << std::endl;
  return true;
}

void OneHotLayer::configureWithFormat(const Dims *inputDims, int nbInputs, const Dims *outputDims, int nbOutputs,
    DataType type, PluginFormat format, int maxBatchSize) {
  std::cout << "configureWithFormat" << std::endl;
}

Dims OneHotLayer::getOutputDimensions(int index, const Dims* inputs, int nbInputDims) {
  assert((nbInputDims > 0) && (inputs[0].nbDims == 2));

  std::cout << "d0=" << inputs[0].d[0] << " - d1=" << inputs[0].d[1] << std::endl;

  return Dims3(inputs[0].d[0], inputs[0].d[1], mDepth);
}

int OneHotLayer::initialize() {
  std::cout << "initialize" << std::endl;
  return 0;
}

int OneHotLayer::enqueue(int batchSize, const void* const *inputs, void** outputs, void *workspace,
    cudaStream_t stream) {

  std::cout << "enqueue" << std::endl;
  return 0;
}

size_t OneHotLayer::getSerializationSize() {
  std::cout << "getSerializationSize" << std::endl;
  return 0;
}

void OneHotLayer::serialize(void* buffer) {
  std::cout << "serialize" << std::endl;
}

void OneHotLayer::configure(const Dims* inputs, int nbInputs, const Dims* outputs, int nbOutputs,
    int maxBatchSize) {
  assert((nbInputs > 0) || (nbOutputs > 0));

  std::cout << "configure" << std::endl;

  mInputDim = Dims2(inputs[0].d[0], inputs[0].d[1]);
  mOutputDim = Dims3(1, inputs[0].d[1], mDepth);
}

I also changed the test program to register the new PluginFactory (via setPluginFactoryExt):

PluginFactory pluginFactory;

int main()
{
  IRuntime *runtime = nvinfer1::createInferRuntime(gLogger);
  IBuilder* builder = nvinfer1::createInferBuilder(gLogger);
  INetworkDefinition* network = builder->createNetwork();
  auto parser = nvuffparser::createUffParser();

  // parse UFF model
  parser->setPluginFactoryExt(&pluginFactory);
  parser->registerInput("input", Dims2(1, INPUT_LENGTH), UffInputOrder::kNC);
  parser->registerOutput("output");
  parser->parse(MODEL_FILENAME, *network, nvinfer1::DataType::kFLOAT);

  std::cout << "Model parsed" << std::endl;

...
}

When I run the test program, I get a segfault with the following backtrace:

UFFParser: parsing input
UFFParser: parsing output/depth
UFFParser: parsing output/on_value
UFFParser: parsing output/off_value
UFFParser: parsing output
UFFParser: Parse Plugin node output
UFFParser: PluginNode Input Descriptor output/depth
UFFParser: Operation of node Const
UFFParser: Attempting to convert Const Op to Weight output/depth
UFFParser: PluginNode Input Descriptor output/on_value
UFFParser: Operation of node Const
UFFParser: Attempting to convert Const Op to Weight output/on_value
UFFParser: PluginNode Input Descriptor output/off_value
UFFParser: Operation of node Const
UFFParser: Attempting to convert Const Op to Weight output/off_value
mDepth=10
mOnValue=1
mOffValue=0

Thread 1 "unit-tests" received signal SIGSEGV, Segmentation fault.
0x0000007fb05e2960 in nvinfer1::Network::addPluginExt(nvinfer1::ITensor* const*, int, nvinfer1::IPluginExt&) () from /usr/lib/aarch64-linux-gnu/libnvinfer.so.5

As you can see, the error seems to occur during parsing, since the following trace is never printed:

std::cout << "Model parsed" << std::endl;

Note: the TF model did not change; it is the same as the TF code posted above.

Let me know if you need any more debug info to understand what’s going on. Thanks for the support.

I don’t see any crashes anymore after I migrated my plugin to the IPluginV2/IPluginCreator interface. Thanks a lot for the support.
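
For anyone hitting the same problem, the migrated plugin roughly has the shape below. This is only a sketch: the actual kernel launch, the PluginFieldCollection handling in the creator, and the UFF-side wiring are omitted or simplified here.

#include <cassert>
#include <cstring>
#include <string>

#include "NvInfer.h"

using namespace nvinfer1;

class OneHotPluginV2 : public IPluginV2 {
 public:
  OneHotPluginV2(int32_t depth, float onValue, float offValue)
      : mDepth(depth), mOnValue(onValue), mOffValue(offValue) {}

  // Deserialization constructor: reads back the parameters written by serialize().
  OneHotPluginV2(const void* data, size_t length) {
    assert(length == sizeof(mDepth) + sizeof(mOnValue) + sizeof(mOffValue));
    const char* d = static_cast<const char*>(data);
    std::memcpy(&mDepth, d, sizeof(mDepth));     d += sizeof(mDepth);
    std::memcpy(&mOnValue, d, sizeof(mOnValue)); d += sizeof(mOnValue);
    std::memcpy(&mOffValue, d, sizeof(mOffValue));
  }

  const char* getPluginType() const override { return "OneHot"; }
  const char* getPluginVersion() const override { return "1"; }
  int getNbOutputs() const override { return 1; }
  Dims getOutputDimensions(int index, const Dims* inputs, int nbInputDims) override {
    return Dims3(inputs[0].d[0], inputs[0].d[1], mDepth);
  }
  bool supportsFormat(DataType type, PluginFormat format) const override {
    return type == DataType::kFLOAT && format == PluginFormat::kNCHW;
  }
  void configureWithFormat(const Dims*, int, const Dims*, int, DataType, PluginFormat,
                           int) override {}
  int initialize() override { return 0; }
  void terminate() override {}
  size_t getWorkspaceSize(int) const override { return 0; }
  int enqueue(int batchSize, const void* const* inputs, void** outputs, void* workspace,
              cudaStream_t stream) override { return 0; }  // kernel launch goes here
  size_t getSerializationSize() const override {
    return sizeof(mDepth) + sizeof(mOnValue) + sizeof(mOffValue);
  }
  void serialize(void* buffer) const override {
    char* d = static_cast<char*>(buffer);
    std::memcpy(d, &mDepth, sizeof(mDepth));     d += sizeof(mDepth);
    std::memcpy(d, &mOnValue, sizeof(mOnValue)); d += sizeof(mOnValue);
    std::memcpy(d, &mOffValue, sizeof(mOffValue));
  }
  void destroy() override { delete this; }
  IPluginV2* clone() const override { return new OneHotPluginV2(mDepth, mOnValue, mOffValue); }
  void setPluginNamespace(const char* ns) override { mNamespace = ns; }
  const char* getPluginNamespace() const override { return mNamespace.c_str(); }

 private:
  int32_t mDepth{0};
  float mOnValue{1.0f};
  float mOffValue{0.0f};
  std::string mNamespace;
};

class OneHotPluginCreator : public IPluginCreator {
 public:
  const char* getPluginName() const override { return "OneHot"; }
  const char* getPluginVersion() const override { return "1"; }
  const PluginFieldCollection* getFieldNames() override { return &mFC; }
  IPluginV2* createPlugin(const char* name, const PluginFieldCollection* fc) override {
    // Simplified: the real version pulls depth/on_value/off_value out of fc.
    return new OneHotPluginV2(10, 1.0f, -1.0f);
  }
  IPluginV2* deserializePlugin(const char* name, const void* serialData,
                               size_t serialLength) override {
    return new OneHotPluginV2(serialData, serialLength);
  }
  void setPluginNamespace(const char* ns) override { mNamespace = ns; }
  const char* getPluginNamespace() const override { return mNamespace.c_str(); }

 private:
  PluginFieldCollection mFC{};
  std::string mNamespace;
};

// Registers the creator with the global plugin registry.
REGISTER_TENSORRT_PLUGIN(OneHotPluginCreator);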