I am trying to build a CUDA engine from a network in ONNX format and need some help. The network contains 3D convolutions, 2D convolutions, residual connections, dropout, and ELU activations. It is about 16 MB serialized. It runs in TensorFlow and exports to ONNX format without issue.
However, when I call buildEngineWithConfig(), I get an error. Graph construction and optimization complete successfully, but after a few minutes of autotuning the program crashes with this error:
terminate called after throwing an instance of 'pwgen::PwgenException'
  what(): Driver error:
There is nothing else in the logs except normal timing info showing the fastest tactics.
Here is the code I’m using to build the engine:
nvinfer1::IBuilder* nvbuilder = nvinfer1::createInferBuilder(logger);
nvinfer1::INetworkDefinition* nvnetwork = nvbuilder->createNetworkV2(1U << (int)nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
nvbuilder->setMaxBatchSize(2);
nvonnxparser::IParser* nvparser = nvonnxparser::createParser(*nvnetwork, logger);
nvparser->parseFromFile(onnxFilename.c_str(), 0);
nvinfer1::IBuilderConfig* config = nvbuilder->createBuilderConfig();
config->setFlags(1U << (int)nvinfer1::BuilderFlag::kFP16 | 1U << (int)nvinfer1::BuilderFlag::kDISABLE_TIMING_CACHE);
config->setMaxWorkspaceSize(1 << 30);
config->setProfilingVerbosity(nvinfer1::ProfilingVerbosity::kVERBOSE);
nvinfer1::ICudaEngine* nvengine = nvbuilder->buildEngineWithConfig(*nvnetwork, *config);
I have tried various workspace sizes, toggling FP16/FP32, and various network architectures. Some architectures build, but most fail with the above error.
Am I doing something wrong? Am I maybe missing a kernel module? How can I further debug the driver error?
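One thing I could try as a debugging aid: since the terminate message prints a what() string, the exception appears to derive from std::exception, so wrapping the build call in a try/catch might let me log it and continue to the next configuration instead of aborting. Below is a minimal sketch of that idea. The runGuarded helper is my own hypothetical name, not part of TensorRT, and I have not verified that the exception actually propagates to the calling thread; if it is thrown on an internal thread or across a noexcept boundary, this will not catch it.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of TensorRT): runs `step` and turns any
// exception that reaches this thread into a logged message.
// Returns true if `step` finished normally, false if an exception was caught.
// If errorOut is non-null, it receives the caught message.
bool runGuarded(const std::string& label,
                const std::function<void()>& step,
                std::string* errorOut = nullptr)
{
    try {
        step();
        return true;
    } catch (const std::exception& e) {
        // Covers anything derived from std::exception.
        std::cerr << label << " failed: " << e.what() << std::endl;
        if (errorOut) *errorOut = e.what();
    } catch (...) {
        // Covers exceptions that do not derive from std::exception.
        std::cerr << label << " failed with a non-standard exception" << std::endl;
        if (errorOut) *errorOut = "unknown exception";
    }
    return false;
}
```

In my code the build call would then become something like runGuarded("buildEngineWithConfig", [&] { nvengine = nvbuilder->buildEngineWithConfig(*nvnetwork, *config); }), with nvengine declared beforehand and checked for nullptr afterwards.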
Thanks for any help.
- Jetson Nano Developer Kit (945-13450-0000-100)
- JetPack 4.4
- TensorFlow 2.2.0
- tf2onnx 1.6.3 (opset 8)