Python API TensorRT equivalent calls

m.bingham · September 3, 2020, 7:24pm

When working with Python, I cannot find some of the calls and functionality that are available in the C++ API.
For example, in “/usr/tensorrt/samples/common/sampleEngines.cpp” I see

TrtUniquePtr<IRuntime> runtime{createInferRuntime(gLogger.getTRTLogger())};
runtime->setDLACore(DLACore);

In the Python runtime class, tensorrt.Runtime,
I don't see any call equivalent to setDLACore. Only in IBuilderConfig do I see how to put a model on a DLA, but correct me if I am wrong: isn't this class for building a TensorRT model from another framework, like ONNX? How would I load an existing TensorRT model and run it on the DLA like the above C++ code does?

Similarly, in the C++ API I see IBuilderConfig::setMaxWorkspaceSize().
Setting builder.max_workspace_size in Python, as in the following, seems to have no effect:
with trt.Builder(TRT_LOGGER) as builder:
    builder.max_workspace_size = 1 << 25  # This doesn't seem to do anything
    builder_cfg = builder.create_builder_config()
    engine = builder.build_engine(network, builder_cfg)

To elaborate, building with builder.build_cuda_engine(network) works fine, but building with the builder config results in [TensorRT] ERROR: Try increasing the workspace …

Thank you.

AakankshaS · September 4, 2020, 7:15pm

Hi @m.bingham

Please refer to the link below for the Python APIs that support DLA:
https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/Core/NetworkConfig.html#tensorrt.IBuilderConfig
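
As a rough illustration of the builder-config route, here is a minimal sketch, assuming the TensorRT 7.x Python API used in this thread. As far as I understand, when a config object is passed to build_engine, the workspace size is taken from the config (config.max_workspace_size) rather than from builder.max_workspace_size, which would explain the "Try increasing the workspace" error above. The function name build_engine_with_config and its parameters are hypothetical, not part of TensorRT, and the DLA branch assumes a device that actually has DLA cores.

```python
def build_engine_with_config(builder, network, workspace_bytes=1 << 25, dla_core=None):
    """Build an engine through IBuilderConfig (hypothetical helper).

    builder and network are expected to be a tensorrt.Builder and a
    populated tensorrt.INetworkDefinition (TensorRT 7.x assumed).
    """
    config = builder.create_builder_config()

    # Set the workspace size on the config, not on the builder.
    config.max_workspace_size = workspace_bytes

    if dla_core is not None:
        # Imported lazily so the workspace-only path has no hard dependency.
        import tensorrt as trt

        # Place layers on the DLA by default and select the core.
        config.default_device_type = trt.DeviceType.DLA
        config.DLA_core = dla_core
        # DLA needs reduced precision; FP16 is assumed here.
        config.set_flag(trt.BuilderFlag.FP16)
        # Let unsupported layers fall back to the GPU.
        config.set_flag(trt.BuilderFlag.GPU_FALLBACK)

    return builder.build_engine(network, config)


# Hypothetical usage (requires TensorRT and a parsed network):
#   import tensorrt as trt
#   TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
#   with trt.Builder(TRT_LOGGER) as builder:
#       ...  # create and populate `network`, e.g. with the ONNX parser
#       engine = build_engine_with_config(builder, network, 1 << 25)
```

Passing dla_core=None leaves the engine on the GPU, so the same helper covers the plain workspace-size case from the question.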
Alternatively, you can use the trtexec command to set the DLA flags and workspace size:
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec
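
For loading an already serialized engine from Python, which is what the C++ snippet in the question does, the sketch below may help. It assumes that tensorrt.Runtime exposes a DLA_core attribute as the counterpart of IRuntime::setDLACore, and that the engine file was originally built for DLA. The helper name load_engine_on_dla and the engine path are hypothetical.

```python
def load_engine_on_dla(runtime, engine_path, dla_core=0):
    """Deserialize a saved engine after selecting a DLA core (hypothetical helper).

    runtime is expected to be a tensorrt.Runtime. The core must be
    selected before the engine is deserialized.
    """
    # Assumed Python counterpart of runtime->setDLACore(DLACore) in C++.
    runtime.DLA_core = dla_core

    # Read the serialized engine and hand the raw bytes to the runtime.
    with open(engine_path, "rb") as f:
        serialized_engine = f.read()
    return runtime.deserialize_cuda_engine(serialized_engine)


# Hypothetical usage (requires TensorRT and a DLA-capable device):
#   import tensorrt as trt
#   TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
#   with trt.Runtime(TRT_LOGGER) as runtime:
#       engine = load_engine_on_dla(runtime, "model.engine", dla_core=0)
```

The runtime is passed in rather than created inside the helper so that its lifetime stays under the caller's control while the engine is in use.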

Thanks!