When calling:
_engine = std::shared_ptr<nvinfer1::ICudaEngine>(builder->buildEngineWithConfig(*network, *config), InferDeleter());
It always takes quite some time to optimize the network (around 3 minutes).
Is there some config option to skip the optimization during development, or at least speed it up at the cost of some runtime performance, for when I just want to try things out? Currently it's quite annoying to test anything, because after every code change I have to wait 3 minutes to run it.
I already tried to add:
config->setAvgTimingIterations(1);
config->setMinTimingIterations(0);
But it does not seem to change anything.
Using TensorRT v7.0.0
Turns out, it is expected that engine creation takes a significant amount of time. That is why the engine->serialize() method exists: the engine can be saved in memory or to disk, so it only has to be built once per model and machine.
I ended up checking if the engine file exists on disk, if not → create it and save it to disk, if yes → load engine from disk. Here is some code:
std::ifstream file("/path/to/model.engine", std::ios::binary); // binary mode: the engine is a binary blob
if (file.good()) {
    std::cout << "** Found Engine on Disk, loading... **" << std::endl;
    // Determine the file size, then read the whole blob into memory.
    file.seekg(0, file.end);
    auto size = static_cast<size_t>(file.tellg());
    file.seekg(0, file.beg);
    std::vector<char> engineStream(size);
    file.read(engineStream.data(), size);
    file.close();
    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(getLogger());
    auto engine = runtime->deserializeCudaEngine(engineStream.data(), size);
    runtime->destroy();
}
else {
    std::cout << "** No Engine found on Disk, creating... **" << std::endl;
    // create engine here...
    nvinfer1::IHostMemory* serializedModel = engine->serialize();
    std::ofstream p("/path/to/model.engine", std::ios::binary); // same path as the read above
    p.write(static_cast<const char*>(serializedModel->data()), serializedModel->size());
    p.close();
    serializedModel->destroy();
}