Why does converting ONNX to an engine fail in TensorRT 7.0.0.11?

Hi,
I’m trying to use TensorRT on Windows (TensorRT 7.0.0.11, Windows 10).
Since there’s no way to convert a .pb file to UFF on Windows (as far as I know), I chose to use NvOnnxParser instead.
I have model weights trained with a U-Net structure in Keras, saved as an .hdf5 file, so I converted them to ONNX format using keras2onnx
–> https://github.com/onnx/keras-onnx.

I’ve compared the converted ONNX file’s input and output with those of the example model, and here’s the difference.

----- my weight -----            ||  ----- example weight(mnist.onnx) -----
Input Name: input_2              ||  Input Name: Input3
Input Shape: ['N', 512, 960, 3]  ||  Input Shape: [1, 1, 28, 28]
Output Name: conv2d_38           ||  Output Name: Plus214_Output_0
Output Shape: ['N', 512, 960, 1] ||  Output Shape: [1, 10]

I found some examples on GitHub, and on my very first attempts at the conversion (ONNX to engine), it showed the error below…

----------------------------------------------------------------
Input filename:   B235_preR_v8.onnx
ONNX IR version:  0.0.6
Opset version:    11
Producer name:    keras2onnx
Producer version: 1.6.0
Domain:           onnx
Model version:    0
Doc string:
----------------------------------------------------------------
[E] [TRT] Network has dynamic or shape inputs, but no optimization profile has been defined.
[E] [TRT] Network validation failed.
Engine Build Failure
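
If I read the error correctly, TensorRT wants an optimization profile attached to the builder config for the network parsed from my ONNX file. Below is a minimal sketch of what I think it is asking for, using the builder, network, and config objects from my code further down (my assumption, not verified: the dynamic input is my model’s input_2 with shape ['N', 512, 960, 3], so only the batch dimension varies and I pin it to 1):

// Minimal sketch: attach an optimization profile for the parsed network's
// dynamic input to the builder config, then build with that config.
nvinfer1::IOptimizationProfile* profile = builder->createOptimizationProfile();
const char* inputName = network->getInput(0)->getName(); // "input_2" in my model
profile->setDimensions(inputName, nvinfer1::OptProfileSelector::kMIN, nvinfer1::Dims4{ 1, 512, 960, 3 });
profile->setDimensions(inputName, nvinfer1::OptProfileSelector::kOPT, nvinfer1::Dims4{ 1, 512, 960, 3 });
profile->setDimensions(inputName, nvinfer1::OptProfileSelector::kMAX, nvinfer1::Dims4{ 1, 512, 960, 3 });
config->addOptimizationProfile(profile);
nvinfer1::ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);

I’m not sure whether this alone would be enough, or whether the preprocessor approach from sampleDynamicReshape (which I tried next) is also needed.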

After some research online, I added some code related to the “optimization profile”, based on the “sampleDynamicReshape” example provided in the TensorRT zip file I downloaded.
But the conversion still failed, and here is the error it shows:

----------------------------------------------------------------
Input filename:   B235_preR_v8.onnx
ONNX IR version:  0.0.6
Opset version:    11
Producer name:    keras2onnx
Producer version: 1.6.0
Domain:           onnx
Model version:    0
Doc string:
----------------------------------------------------------------
[F] [TRT] Assertion failed: dims.d[i] >= 1
C:\source\builder\cudnnBuilderGraph.cpp:780
Aborting...

However, if I use the same code to convert mnist.onnx, it works fine(?)
(I didn’t notice that I hadn’t changed the weight path while revising my conversion code, so the optimization profile shapes below are for my model, not the example one…)

----------------------------------------------------------------
Input filename:   mnist.onnx
ONNX IR version:  0.0.3
Opset version:    8
Producer name:    CNTK
Producer version: 2.5.1
Domain:           ai.cntk
Model version:    1
Doc string:
----------------------------------------------------------------
[W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[W] [TRT] TensorRT was linked against cuDNN 7.6.3 but loaded cuDNN 7.4.2
[W] [TRT] TensorRT was linked against cuDNN 7.6.3 but loaded cuDNN 7.4.2
[I] Profile dimensions in preprocessor engine:
    Minimum = (1, 3, 1, 1)
    Optimum = (1, 3, 512, 960)
    Maximum = (1, 3, 512, 960)
FINISH SAVING ENGINE
serialize model ready

I’ve checked the TensorRT 7.0.0 documentation, but it seems to say this can’t be done with this kind of format (or did I misunderstand…??)

  1. The network definition must not have an implicit batch dimension
    –> https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#work_dynamic_shapes
    So far I have no idea where I went wrong…
    Any help or advice is appreciated!
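
For reference, this is roughly how I inspect which dimensions come out dynamic after parsing (a small sketch using the same network object as in my code below; as far as I understand, a value of -1 means that dimension is dynamic and needs an optimization profile):

// Print each network input and flag dynamic (-1) dimensions
for (int i = 0; i < network->getNbInputs(); ++i)
{
	nvinfer1::ITensor* in = network->getInput(i);
	nvinfer1::Dims dims = in->getDimensions();
	std::cout << "Input " << in->getName() << ": [";
	for (int d = 0; d < dims.nbDims; ++d)
	{
		// -1 here marks a dynamic dimension
		std::cout << dims.d[d] << (d + 1 < dims.nbDims ? ", " : "");
	}
	std::cout << "]" << std::endl;
}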

(Below is my code that converts ONNX to an engine.)

void onnxToTRTModel(const std::string& modelFile, // name of the onnx model
	unsigned int maxBatchSize,    // batch size - NB must be at least as large as the batch we want to run with
	IHostMemory*& trtModelStream, // output buffer for the TensorRT model
	DataType dataType,
	IInt8Calibrator* calibrator,
	std::string save_name)
{
	
	
	int verbosity = (int)nvinfer1::ILogger::Severity::kWARNING;
	// create the builder
	IBuilder* builder = createInferBuilder(gLogger.getTRTLogger());
	std::cout << "before config" << std::endl;
	IBuilderConfig* config = builder->createBuilderConfig();

	const auto explicitBatch = 1U << static_cast<uint32_t>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
	nvinfer1::INetworkDefinition* network = builder->createNetworkV2(explicitBatch);

	auto parser = nvonnxparser::createParser(*network, gLogger);

	if (!parser->parseFromFile(modelFile.c_str(), verbosity))
	{
		std::string msg("failed to parse onnx file");
		gLogger.log(nvinfer1::ILogger::Severity::kERROR, msg.c_str());
		exit(EXIT_FAILURE);
	}
	if ((dataType == DataType::kINT8 && !builder->platformHasFastInt8()))
		exit(EXIT_FAILURE);  // exit if INT8 was requested but the platform has no fast INT8 support

	builder->setMaxBatchSize(maxBatchSize);
	// must not exceed available GPU memory (e.g. an RTX 2080 Ti provides 11 GB; requesting more causes an error)
	builder->setMaxWorkspaceSize(1ULL << 32); // 4 GB
	builder->setInt8Mode(dataType == DataType::kINT8);
	builder->setInt8Calibrator(calibrator);
	
	//another trial from sample code
	auto mPredictionInputDims = network->getInput(0)->getDimensions();
	auto mPredictionOutputDims = network->getOutput(0)->getDimensions();

	// Create the preprocessor engine using a network that supports full dimensions (createNetworkV2).
	auto preprocessorNetwork = builder->createNetworkV2(1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH));

	// Reshape the dynamically shaped input to the size expected by the model.
	auto input = preprocessorNetwork->addInput("input_2", nvinfer1::DataType::kFLOAT, Dims4{ 1, 3, -1, -1 });
	std::cout << "after preprocessorNetwork->addInput" << std::endl;
	auto resizeLayer = preprocessorNetwork->addResize(*input);
	resizeLayer->setOutputDimensions(mPredictionInputDims);
	preprocessorNetwork->markOutput(*resizeLayer->getOutput(0));

	// Finally, configure and build the preprocessor engine.
	auto preprocessorConfig = builder->createBuilderConfig();

	// Create an optimization profile so that we can specify a range of input dimensions.
	auto profile = builder->createOptimizationProfile();

	// This profile is valid for all inputs whose size falls in the range [(1, 3, 1, 1), (1, 3, 512, 960)],
	// and TensorRT will optimize for (1, 3, 512, 960)
	profile->setDimensions(input->getName(), OptProfileSelector::kMIN, Dims4{ 1, 3, 1, 1});
	profile->setDimensions(input->getName(), OptProfileSelector::kOPT, Dims4{ 1, 3, 512, 960});
	profile->setDimensions(input->getName(), OptProfileSelector::kMAX, Dims4{ 1, 3, 512, 960});
	std::cout << "after setDimensions" << std::endl;
	preprocessorConfig->addOptimizationProfile(profile);
	std::cout << "after addOptimizationProfile" << std::endl;
	auto mPreprocessorEngine = builder->buildEngineWithConfig(*preprocessorNetwork, *preprocessorConfig);
	// it seems execution never gets past the line above (buildEngineWithConfig) >> the config or the network must be wrong
	gLogInfo << "Profile dimensions in preprocessor engine:\n";
	gLogInfo << "    Minimum = " << mPreprocessorEngine->getProfileDimensions(0, 0, OptProfileSelector::kMIN) << '\n';
	gLogInfo << "    Optimum = " << mPreprocessorEngine->getProfileDimensions(0, 0, OptProfileSelector::kOPT) << '\n';
	gLogInfo << "    Maximum = " << mPreprocessorEngine->getProfileDimensions(0, 0, OptProfileSelector::kMAX) << std::endl;

	// serialize the engine, write it to disk, then clean up
	trtModelStream = mPreprocessorEngine->serialize();
	std::stringstream gieModelStream; // buffer the serialized engine before writing it out
	gieModelStream.write((const char*)trtModelStream->data(), trtModelStream->size());
	std::ofstream SaveFile(save_name, std::ios::out | std::ios::binary);
	SaveFile.seekp(0, std::ios::beg);
	SaveFile << gieModelStream.rdbuf();
	gieModelStream.seekg(0, gieModelStream.beg);
	mPreprocessorEngine->destroy();
	std::cout << "FINISH SAVING ENGINE" << std::endl;
	
}
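
(For completeness, this is roughly how I load the saved engine back afterwards; a minimal sketch using the standard TensorRT 7 runtime calls, with the same gLogger and save_name as above:)

// Read the serialized engine from disk and deserialize it
std::ifstream engineFile(save_name, std::ios::binary);
std::vector<char> engineData((std::istreambuf_iterator<char>(engineFile)), std::istreambuf_iterator<char>());
nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(gLogger.getTRTLogger());
nvinfer1::ICudaEngine* engine = runtime->deserializeCudaEngine(engineData.data(), engineData.size(), nullptr);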

Hi,

Can you provide the model file and the following information so we can better help?
Please provide details on the platform you are using:
o GPU type
o Nvidia driver version
o CUDA version
o CUDNN version
o Python version [if using python]
o Tensorflow and PyTorch version
o TensorRT version

Thanks

Hi,
Thanks for the super quick reply!!!
Yes, of course! Here it is:

o GPU type - RTX2080ti
o Nvidia driver version - 432.00
o CUDA version 10.0
o CUDNN version 7.6.5
o Python version [if using python] - 3.6.7 (the conversion of hdf5 to onnx)
o onnx version - 1.6.0
o Tensorflow and PyTorch version - TensorFlow 1.13.1 (no PyTorch)
o TensorRT version - 7.0.0.11

I’ve uploaded my weights and code to https://drive.google.com/open?id=1kKIpZjXgBRRB9CJ-jXVkVcY0rnqXhBDB .
Thanks for your help and reply! I hope to hear from you soon.
Have a nice day :D

Hi,

Can you try upgrading the ONNX opset version to opset 9 or higher and rerunning the script?

Thanks

Hi,

Sorry, I didn’t notice there was a new reply!
I’ve been busy with another project recently; I’ll try that later and update with the result.
(I noticed that my opset version is already 11(?), based on what I wrote in the question above…)

Thanks again for your reply and help!