Accessing Jetson's DLA from python

m.bingham · August 26, 2020, 8:35pm

Description

I am running an ONNX network from Python on a Jetson NX. I have things running on the GPU, but I would like to try running them on the DLA. I see C++ examples on the subject but none in Python. Could someone provide some guidance? I had hoped that tensorrt.Builder.canRunOnDLA() would be available.

TensorRT Version: 7.1.0.16
Python Version (if applicable): 3.6.

AakankshaS · August 27, 2020, 6:00am

Hi @m.bingham,
I am afraid we do not have a publicly available example,
but you can check the link below:
https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/Core/NetworkConfig.html
All these flags are available in IBuilderConfig.

Thanks!
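
As an illustration of the IBuilderConfig flags mentioned above, here is a minimal sketch of how they might be applied when building an engine from ONNX. It assumes the TensorRT 7.x Python API named in the question (max_workspace_size and build_engine were changed in later releases). The helper names configure_for_dla and build_dla_engine, the workspace size, and the onnx_path argument are hypothetical, not taken from NVIDIA's samples.

```python
def configure_for_dla(config, dla_core=0, gpu_fallback=True):
    """Apply DLA settings to an IBuilderConfig instance before building."""
    import tensorrt as trt  # imported lazily so this helper loads without TensorRT

    config.set_flag(trt.BuilderFlag.FP16)  # the DLA runs FP16 or INT8 only
    config.default_device_type = trt.DeviceType.DLA
    config.DLA_core = dla_core
    if gpu_fallback:
        # Let layers the DLA cannot run fall back to the GPU.
        config.set_flag(trt.BuilderFlag.GPU_FALLBACK)
    return config


def build_dla_engine(onnx_path, dla_core=0):
    """Parse an ONNX file and build an engine targeting the DLA (TensorRT 7.x API)."""
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    explicit_batch = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(logger) as builder, \
            builder.create_network(explicit_batch) as network, \
            trt.OnnxParser(network, logger) as parser:
        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
                raise RuntimeError("ONNX parse failed: " + "; ".join(errors))
        config = builder.create_builder_config()
        config.max_workspace_size = 1 << 28  # assumed value, tune for your model
        configure_for_dla(config, dla_core)
        return builder.build_engine(network, config)
```

The settings are applied to the config object returned by builder.create_builder_config(), so the device placement is decided when the engine is built rather than when it is loaded.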

m.bingham · August 31, 2020, 8:43pm

I am struggling to get things running on the DLA with Python. Do I need to export the TensorRT model with something specific for the DLA, other than 16-bit floating point? There are no complaints from the Python interpreter, but I still see GPU use, and /sys/devices/platform/host1x/15880000.nvdla0/power/runtime_status shows the DLA as inactive. Here is an excerpt of what I am doing.

import pycuda.autoinit  # creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

def allocate_buffers(engine, batch_size, data_type):
    # Allocate page-locked host memory for the input and the output.
    h_input_1 = cuda.pagelocked_empty(batch_size * trt.volume(engine.get_binding_shape(0)), dtype=trt.nptype(data_type))
    h_output = cuda.pagelocked_empty(batch_size * trt.volume(engine.get_binding_shape(1)), dtype=trt.nptype(data_type))
    # Allocate device memory for inputs and outputs.
    d_input_1 = cuda.mem_alloc(h_input_1.nbytes)
    d_output = cuda.mem_alloc(h_output.nbytes)
    # Create a stream in which to copy inputs/outputs and run inference.
    stream = cuda.Stream()
    return h_input_1, d_input_1, h_output, d_output, stream

def do_inference(engine, pics_1, h_input_1, d_input_1, h_output, d_output, stream, batch_size, height, width):

   load_images_to_buffer(pics_1, h_input_1)

   with engine.create_execution_context() as context:
       # Transfer input data to the GPU.
       cuda.memcpy_htod_async(d_input_1, h_input_1, stream)

       # Run inference.
       context.profiler = trt.Profiler()
       context.execute(batch_size=1, bindings=[int(d_input_1), int(d_output)])

       # Transfer predictions back from the GPU.
       cuda.memcpy_dtoh_async(h_output, d_output, stream)
       # Synchronize the stream
       stream.synchronize()
       # Return the host output.
       out = h_output.reshape((batch_size,-1, height, width))
       return out



TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
trt_runtime = trt.Runtime(TRT_LOGGER)

#Move to the DLA
trt.BuilderFlag.GPU_FALLBACK = False
trt.IBuilderConfig.default_device_type = trt.DeviceType.DLA
trt.IBuilderConfig.DLA_core = 0

with open(plan_path, 'rb') as f:
    engine_data = f.read()

engine = trt_runtime.deserialize_cuda_engine(engine_data)

h_input, d_input, h_output, d_output, stream = allocate_buffers(engine, 1, trt.float32)

raw_result = do_inference(engine, pil_img, h_input, d_input, h_output, d_output, stream, 1, 1000, 1)
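
The runtime_status check described in this post can also be scripted, which makes it easier to watch the DLA while inference runs in a loop. This is only a sketch: the sysfs path is the one quoted in the post, the function names are hypothetical, and the state strings ('active', 'suspended', and so on) are the standard Linux runtime power management values.

```python
import time
from collections import Counter
from pathlib import Path

# Path quoted in the post above; it may differ on other boards or releases.
DLA0_STATUS = "/sys/devices/platform/host1x/15880000.nvdla0/power/runtime_status"


def read_runtime_status(path=DLA0_STATUS):
    """Return the runtime PM state string, e.g. 'active' or 'suspended'."""
    return Path(path).read_text().strip()


def sample_runtime_status(path=DLA0_STATUS, samples=50, interval_s=0.1):
    """Poll the status file and count how often each state was seen."""
    counts = Counter()
    for _ in range(samples):
        counts[read_runtime_status(path)] += 1
        time.sleep(interval_s)
    return dict(counts)
```

Calling sample_runtime_status() from a second process while the inference script is running should show whether the DLA ever leaves the suspended state.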

AakankshaS · December 1, 2020, 6:39am

Hi @m.bingham,
Please make sure every layer satisfies the conditions for running on the DLA, and try setting GPU fallback to True.

Thanks!
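
One way to check the layer conditions from Python is to ask the builder config about each layer before building. The sketch below assumes the network and config objects are the INetworkDefinition and IBuilderConfig instances used to build the engine, with the DLA device type and FP16 (or INT8) flag already set on the config. The function name report_dla_support is hypothetical, and the result can vary between TensorRT versions.

```python
def report_dla_support(network, config):
    """Return (supported, unsupported) lists of layer names for the DLA."""
    supported, unsupported = [], []
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        # can_run_on_DLA is a method of the IBuilderConfig instance.
        if config.can_run_on_DLA(layer):
            supported.append(layer.name)
        else:
            unsupported.append(layer.name)
    return supported, unsupported
```

Note that in the excerpt above, assigning to trt.BuilderFlag.GPU_FALLBACK, trt.IBuilderConfig.default_device_type and trt.IBuilderConfig.DLA_core only sets attributes on the Python classes and does not change an engine that has already been serialized. Layer placement is decided at build time. When deserializing an engine that was built for the DLA, the core can, as far as I know, be chosen through the DLA_core attribute of the trt.Runtime instance.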
