Description
Is there no method in the TensorRT Python API for setting a particular DLA core for inference?
Environment
TensorRT Version: 7.1.3.4
GPU Type: Jetson Xavier NX
Nvidia Driver Version: JetPack 4.4
CUDA Version: 10.2
CUDNN Version: 8.0
Python Version (if applicable): 3.6
Baremetal or Container (if container which image + tag): baremetal
Steps To Reproduce
According to the official documentation, there are TensorRT C++ API functions for checking whether DLA cores are available, as well as for setting a particular DLA core for inference. However, there seem to be no such functions in the Python API.
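For reference, I do see DLA options at build time in the Python API, on the builder and builder config. A rough sketch of what I mean (the network/parser setup is omitted and the variable names are just placeholders):

import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# Build-time DLA settings exposed by the Python API
print(builder.num_DLA_cores)                    # number of DLA cores on the device
config.default_device_type = trt.DeviceType.DLA # place supported layers on DLA
config.DLA_core = 0                             # which DLA core to target at build time
config.set_flag(trt.BuilderFlag.GPU_FALLBACK)   # fall back to GPU for unsupported layers
config.set_flag(trt.BuilderFlag.FP16)           # DLA requires FP16 or INT8
config.max_workspace_size = 1 << 28

# network definition / ONNX parsing omitted
# engine = builder.build_engine(network, config)

My question is about the equivalent at deserialization/inference time.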
I tried the following with python3 on Jetson Xavier NX (TensorRT 7.1.3.4):
>>> import tensorrt as trt
>>> logger = trt.Logger(trt.Logger.VERBOSE)
>>> runtime = trt.Runtime(logger)
>>> dir(runtime)
['__class__', '__del__', '__delattr__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'deserialize_cuda_engine', 'gpu_allocator']
>>> with open('yolov3-dla0-608.trt', 'rb') as f:
... engine = runtime.deserialize_cuda_engine(f.read())
...
[TensorRT] VERBOSE: Deserialize required 900134 microseconds.
>>> dir(engine)
['__class__', '__del__', '__delattr__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__len__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__',
'__subclasshook__', 'binding_is_input', 'create_execution_context', 'create_execution_context_without_device_memory', 'device_memory_size', 'get_binding_bytes_per_component', 'get_binding_components_per_element', 'get_binding_dtype', 'get_binding_format', 'get_binding_format_desc', 'get_binding_index', 'get_binding_name', 'get_binding_shape', 'get_binding_vectorized_dim', 'get_location', 'get_profile_shape', 'get_profile_shape_input', 'has_implicit_batch_dimension', 'is_execution_binding', 'is_shape_binding', 'max_batch_size', 'max_workspace_size', 'name', 'num_bindings', 'num_layers', 'num_optimization_profiles', 'refittable', 'serialize']
>>> context = engine.create_execution_context()
>>> dir(context)
['__class__', '__del__', '__delattr__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'active_optimization_profile', 'all_binding_shapes_specified', 'all_shape_inputs_specified', 'debug_sync', 'device_memory', 'engine', 'execute', 'execute_async', 'execute_async_v2', 'execute_v2', 'get_binding_shape', 'get_shape', 'get_strides', 'name', 'profiler', 'set_binding_shape', 'set_shape_input']
>>>
So none of the “tensorrt.Runtime”, “tensorrt.ICudaEngine”, or “tensorrt.IExecutionContext” classes seems to provide any API for selecting a DLA core for inference with a deserialized TensorRT engine. How do I make sure the deserialized engine is running on a DLA core, and how do I choose DLA core #1 vs. #0?
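In other words, I was expecting something analogous to the C++ IRuntime::setDLACore(), roughly like the following (hypothetical; as the dir(runtime) output above shows, no such attribute exists in this version):

runtime = trt.Runtime(logger)
runtime.DLA_core = 1   # hypothetical: select DLA core #1 before deserializing the engine
with open('yolov3-dla0-608.trt', 'rb') as f:
    engine = runtime.deserialize_cuda_engine(f.read())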