Hi, I have installed everything successfully. My environment:
Ubuntu 18.04 LTS aarch64
CUDA 10.0.326
cuDNN 7.5.0.66
TensorRT 5.1.6.1
Xavier
I am running my Caffe Faster R-CNN model on the DLA with TensorRT 5.1, with GPU fallback enabled and FP16 set. But the following error occurs; how do I fix it?
[W] [TRT] Default DLA is enabled but layer proposal is not running on DLA, falling back to GPU.
...
[W] [TRT] Default DLA is enabled but layer cls_prob is not running on DLA, falling back to GPU.
[W] [TRT] DLA Node compilation Failed.
[E] [TRT] Internal error: could not find any implementation for node{conv1, relu1, conv2, relu2, ......, concat, convf, reluf, rpn_conv1}, try increasing the workspace size with IBuilder::setMaxWorkspaceSize()
[E] [TRT] ../builder/tacticOptimizer.cpp(1330) - OutOfMemory Error in computeCosts:0
...Assertion 'engine' failed.
Aborted (core dumped)
I have set batchSize = 1 and maxWorkspaceSize = 1000 << 21 (about 2 GB), and my model input is 160x160, but I still get the same error.
When I was on TensorRT 5.0, this code failed with the error described at the URL below:
../builder/cudnnBuilder2.cpp (728) - Misc Error in buildSingleLayer: 1 (Unable to process layer.)
../builder/cudnnBuilder2.cpp (728) - Misc Error in buildSingleLayer: 1 (Unable to process layer.)
That thread says the problem would be fixed in TensorRT 5.1, but now the error occurs as I described above. What causes it, and how do I fix it? Thanks.
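In case it helps with diagnosis, here is how I would query which layers the builder accepts on DLA. This is a rough sketch against the TensorRT 5.x IBuilder API (IBuilder::canRunOnDLA still existed there but was removed in later major versions); the function name printDlaSupport is mine, and builder/network are assumed to be set up as in my code below.

#include <iostream>
#include "NvInfer.h"

// Sketch: list which layers the builder accepts on DLA (TensorRT 5.x API;
// printDlaSupport is a hypothetical helper name, not from the samples).
void printDlaSupport(nvinfer1::IBuilder* builder, nvinfer1::INetworkDefinition* network)
{
    for (int i = 0; i < network->getNbLayers(); ++i)
    {
        nvinfer1::ILayer* layer = network->getLayer(i);
        std::cout << layer->getName() << ": "
                  << (builder->canRunOnDLA(layer) ? "DLA-capable" : "GPU only")
                  << std::endl;
    }
}

I would expect the layers it reports as GPU only to match the fallback warnings in the log above.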
My code:
// Headers this snippet needs (gLogger and samplesCommon::enableDLA come
// from the TensorRT samples' common.h):
#include "NvInfer.h"
#include "NvCaffeParser.h"
#include "common.h"
#include <cassert>
#include <iostream>
#include <string>
#include <vector>

using namespace nvinfer1;
using namespace nvcaffeparser1;

const int useFP = 16;      // 16 -> build in FP16, anything else -> FP32
static int gUseDLACore{0}; // DLA core index; < 0 disables DLA

// Parse a Caffe model and serialize it into a TensorRT engine stream.
void caffeToGIEModel(const std::string& deployFile,
                     const std::string& modelFile,
                     const std::vector<std::string>& outputs,
                     unsigned int maxBatchSize,
                     nvcaffeparser1::IPluginFactory* pluginFactory,
                     IHostMemory** gieModelStream)
{
    IBuilder* builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();
    ICaffeParser* parser = createCaffeParser();
    parser->setPluginFactory(pluginFactory);

    DataType dataType = DataType::kFLOAT; // default: FP32 weights
    if (useFP == 16)
    {
        if (builder->platformHasFastFp16())
        {
            dataType = DataType::kHALF;
        }
        else
        {
            // No fast FP16 on this platform: clean up and bail out.
            parser->destroy();
            network->destroy();
            builder->destroy();
            return;
        }
    }

    std::cout << "Begin parsing model..." << std::endl;
    const IBlobNameToTensor* blobNameToTensor = parser->parse(deployFile.c_str(),
                                                              modelFile.c_str(),
                                                              *network,
                                                              dataType);
    std::cout << "End parsing model..." << std::endl;

    for (auto& s : outputs)
        network->markOutput(*blobNameToTensor->find(s.c_str()));

    builder->setMaxBatchSize(maxBatchSize);
    builder->setMaxWorkspaceSize(1000 << 21); // 1000 * 2^21 bytes, about 2 GB

    if (gUseDLACore >= 0)
    {
        // Default device type DLA, core selection, GPU fallback (sketch below).
        samplesCommon::enableDLA(builder, gUseDLACore);
    }
    if (useFP == 16)
    {
        builder->setFp16Mode(true);
    }

    std::cout << "Begin building engine..." << std::endl;
    ICudaEngine* engine = builder->buildCudaEngine(*network);
    assert(engine); // this is the assertion that fails in the log above
    std::cout << "End building engine..." << std::endl;

    network->destroy();
    parser->destroy();

    // Serialize the engine so it can be deserialized again at inference time.
    (*gieModelStream) = engine->serialize();
    engine->destroy();
    builder->destroy();
    shutdownProtobufLibrary();
}
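For reference, samplesCommon::enableDLA is the helper from the samples' common.h. As far as I know it does roughly the following; this is a paraphrased sketch from memory, not the exact helper, so check common.h in your samples tree for the real code.

#include "NvInfer.h"

// Paraphrased sketch of samplesCommon::enableDLA from the TRT 5.x samples'
// common.h (from memory; the exact helper may differ).
inline void enableDLA(nvinfer1::IBuilder* builder, int dlaCore)
{
    if (dlaCore >= 0)
    {
        builder->allowGPUFallback(true);  // let unsupported layers run on the GPU
        builder->setFp16Mode(true);       // DLA requires FP16 (or INT8) precision
        builder->setDefaultDeviceType(nvinfer1::DeviceType::kDLA);
        builder->setDLACore(dlaCore);     // Xavier has two DLA cores: 0 and 1
    }
}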