Failed to parse OpenCV's face detector Caffe model

Hi,
Since OpenCV 3.3, the dnn module has included a deep-learning-based face detector. I tested it and found its accuracy to be good, so I decided to migrate its Caffe model to TensorRT to get the best performance, but I failed: ICaffeParser::parse() reports the errors below:

ERROR: mbox_loc: all concat input tensors must have the same dimensions except on the concatenation axis
ERROR: mbox_conf: all concat input tensors must have the same dimensions except on the concatenation axis
Caffe Parser: Invalid axis in softmax layer - Cannot perform softmax along batch size dimension and expects NCHW input. Negative axis is not supported in TensorRT, please use positive axis indexing error parsing layer type Softmax index 109

Given my limited knowledge of deep learning theory, I can't solve these errors myself. Does anyone know how to fix them? Do I have to write a custom TensorRT plugin?

The Caffe prototxt and model file of the face detector can be found here: computer_vision/CAFFE_DNN at master · gopinath-balu/computer_vision · GitHub
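For reference, this is roughly how I tested the detector with OpenCV's dnn module before trying TensorRT (a minimal sketch; the file paths are placeholders, and the 300x300 input size and mean values (104, 177, 123) are the ones used by the OpenCV face detection sample):

#include <opencv2/dnn.hpp>
#include <opencv2/imgcodecs.hpp>
#include <iostream>

int main()
{
	// load the Caffe face detector (paths are placeholders)
	cv::dnn::Net net = cv::dnn::readNetFromCaffe("deploy.prototxt",
		"res10_300x300_ssd_iter_140000.caffemodel");

	cv::Mat img = cv::imread("face.jpg");

	// preprocessing as in the OpenCV face detection sample:
	// resize to 300x300 and subtract the per-channel mean
	cv::Mat blob = cv::dnn::blobFromImage(img, 1.0, cv::Size(300, 300),
		cv::Scalar(104.0, 177.0, 123.0));
	net.setInput(blob);

	// output has shape [1, 1, N, 7]:
	// image id, class id, confidence, x1, y1, x2, y2
	cv::Mat detections = net.forward();
	std::cout << "detections: " << detections.size[2] << std::endl;
	return 0;
}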

Hello

"ERROR: mbox_loc: all concat input tensors must have the same dimensions except on the concatenation axis
" The above message is a failure message, which indicates that the network cannot be built.
To help us debug, can you share a reproduction case that contains the model (computer_vision/CAFFE_DNN at master · gopinath-balu/computer_vision · GitHub )? and the tensorrt conversion code?

Hi,

The Python code from computer_vision/CAFFE_DNN at master · gopinath-balu/computer_vision · GitHub runs fine on my machine; it uses OpenCV's dnn module for inference. Below is my TensorRT conversion code:

BOOL CreateFromCaffeModel(
	const std::string& deployFile,                   // path to the Caffe prototxt
	const std::string& modelFile,                    // path to the caffemodel weights
	unsigned int maxBatchSize,                       // batch size - NB must be at least as large as the batch we want to run with
	BOOL isFP16,
	IHostMemory** trtModelStream,                    // output stream for the serialized TensorRT model
	ICudaEngine** engine)
{
	// create the builder and an empty network definition
	IBuilder* builder = createInferBuilder(gLogger);
	INetworkDefinition* network = builder->createNetwork();

	// parse the Caffe model to populate the network
	ICaffeParser* parser = createCaffeParser();
	std::cout << "start to parse model" << std::endl;
	const IBlobNameToTensor* blobToTensor = parser->parse(
		deployFile.c_str(),
		modelFile.c_str(),
		*network,
		isFP16 ? DataType::kHALF : DataType::kFLOAT);
	if (!blobToTensor)
	{
		parser->destroy();
		network->destroy();
		builder->destroy();
		return FALSE;
	}
	std::cout << "end parse model" << std::endl;

	// mark the output blob so the builder knows which tensor to keep;
	// "detection_out" is the DetectionOutput blob in this face detector's prototxt
	ITensor* output = blobToTensor->find("detection_out");
	assert(output);
	network->markOutput(*output);

	// configure the builder
	builder->setMaxBatchSize(maxBatchSize);
	builder->setMaxWorkspaceSize(10 << 20); // 10 MB of scratch space
	builder->setFp16Mode(isFP16 != FALSE);
	builder->setInt8Mode(false);

	// build the engine
	std::cout << "start to build engine" << std::endl;
	(*engine) = builder->buildCudaEngine(*network);
	if (!(*engine))
	{
		parser->destroy();
		network->destroy();
		builder->destroy();
		return FALSE;
	}
	std::cout << "end build engine" << std::endl;

	// serialize the engine so it can be cached and reloaded without rebuilding
	(*trtModelStream) = (*engine)->serialize();

	// we don't need the network or the parser any more
	network->destroy();
	parser->destroy();
	builder->destroy();
	shutdownProtobufLibrary();
	return TRUE;
}

On my machine, parser->parse() fails and returns null, printing the errors quoted above.
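For completeness, this is roughly how I invoke the function (a minimal sketch; the file paths are placeholders for wherever the prototxt and caffemodel live on my disk):

	IHostMemory* trtModelStream = nullptr;
	ICudaEngine* engine = nullptr;
	BOOL ok = CreateFromCaffeModel(
		"deploy.prototxt",                            // face detector prototxt (placeholder path)
		"res10_300x300_ssd_iter_140000.caffemodel",   // face detector weights (placeholder path)
		1,                                            // maxBatchSize
		FALSE,                                        // FP32 for now
		&trtModelStream,
		&engine);
	if (!ok)
		std::cerr << "failed to create TensorRT engine" << std::endl;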