Hi.
I’m trying to run inference with the VGG-16 caffemodel from the following model zoo.
TensorRT2 fails to load the VGG-16 model above.
On the other hand, TensorRT loads the following GoogLeNet model without problems.
The VGG-16 model can be loaded and run with NVCaffe.
So I suspect there’s something wrong with the TensorRT Caffe parser.
Can you give me any advice?
I’m using the inference code below.
The code itself should be correct, because inference with GoogLeNet succeeds.
// initialize the TensorRT network optimizer
IBuilder* builder = createInferBuilder(logger_);
CHECK(builder);

// parse the caffe model
INetworkDefinition* network = builder->createNetwork();
ICaffeParser* parser = createCaffeParser();
CHECK(network);
CHECK(parser);
const IBlobNameToTensor* blobNameToTensor = parser->parse(model_file.c_str(),
                                                          trained_file.c_str(),
                                                          *network,
                                                          DataType::kFLOAT);
CHECK(blobNameToTensor);

// mark the outputs of the network (the caffe model carries no output info)
for (auto& s : outputs) {
    std::cout << "[INFO] marking blob " << s << " as output." << std::endl;
    ITensor* tensor = blobNameToTensor->find(s.c_str());
    CHECK(tensor);
    network->markOutput(*tensor);
}

// build the TensorRT engine
builder->setMaxBatchSize(1);
builder->setMaxWorkspaceSize(16 << 20);  // 16 MiB
ICudaEngine* engine = builder->buildCudaEngine(*network);
CHECK(engine);

// destroy objects that are no longer needed
network->destroy();
parser->destroy();

// serialize the engine, then shut down TensorRT
IHostMemory* modelStream = engine->serialize();
engine->destroy();
builder->destroy();
shutdownProtobufLibrary();
With the VGG-16 model, blobNameToTensor->find(s.c_str()) returns a NULL pointer in the output-marking loop.
I also get a segmentation fault in builder->buildCudaEngine(*network).
I also tried giexec.
With VGG-16, giexec fails to find the output blob:
nvidia@tegra-ubuntu:~/oss/tensorrt_samples/bin$ ./giexec --deploy=$HOME/VGG_ILSVRC_16_layers_deploy.prototxt.txt --output=prob
deploy: /home/nvidia/VGG_ILSVRC_16_layers_deploy.prototxt.txt
output: prob
Input "data": 3x224x224
could not find output blob prob
Engine could not be created
Engine could not be created
giexec with GoogLeNet succeeds:
nvidia@tegra-ubuntu:~/oss/tensorrt_samples/bin$ ./giexec --deploy=/home/nvidia/oss/tensorrt_samples/bin/googlenet_org/googlenet.prototxt --output=prob
deploy: /home/nvidia/oss/tensorrt_samples/bin/googlenet_org/googlenet.prototxt
output: prob
Input "data": 3x224x224
Output "prob": 1000x1x1
name=data, bindingIndex=0, buffers.size()=2
name=prob, bindingIndex=1, buffers.size()=2
Average over 10 runs is 32.6127 ms.
Average over 10 runs is 16.0368 ms.
Average over 10 runs is 16.0486 ms.
Average over 10 runs is 16.0474 ms.
Average over 10 runs is 16.1648 ms.
Average over 10 runs is 16.1458 ms.
Average over 10 runs is 16.0479 ms.
Average over 10 runs is 16.1553 ms.
Average over 10 runs is 16.0746 ms.
Average over 10 runs is 16.0705 ms.
I examined the VGG-16 prototxt, but I couldn’t find anything wrong with it.
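The only structural difference I can spot between the two deploy files is the prototxt syntax: the model-zoo VGG-16 deploy file is written in the old Caffe format (a "layers" block with an upper-case enum type, and bare input_dim lines), while the GoogLeNet deploy file uses the newer "layer" format. I don't know whether this matters to the TensorRT Caffe parser, but for illustration:

```
# Old-format Caffe prototxt (as in VGG_ILSVRC_16_layers_deploy.prototxt):
input: "data"
input_dim: 10
input_dim: 3
input_dim: 224
input_dim: 224
layers {
  name: "conv1_1"
  type: CONVOLUTION        # layer type is an upper-case enum
  bottom: "data"
  top: "conv1_1"
}

# New-format equivalent (as in the GoogLeNet prototxt):
layer {
  name: "conv1_1"
  type: "Convolution"      # layer type is a quoted string
  bottom: "data"
  top: "conv1_1"
}
```

If the old format is the problem, Caffe's upgrade_net_proto_text tool should be able to convert the deploy file to the new format, but I haven't verified whether TensorRT accepts the result.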
Can you give me any advice?