We want to use GPU+DLA. How do I use DLA when converting an ONNX model to a TensorRT engine? Is there a Python sample?

Hi,

Please find an example in the topic below:

Thanks.

This is my code to convert the ONNX model to a TensorRT engine. Is the DLA part of the code configured correctly? Do I need to configure DLA-related settings for the subsequent inference work? How do I know that DLA is working?

import tensorrt as trt

# TRT_LOGGER, onnx_file_path, and engine_file_path are defined elsewhere in the script.
def build_engine():
    """Takes an ONNX file and creates a TensorRT engine to run inference with"""
    EXPLICIT_BATCH = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(TRT_LOGGER) as builder, \
            builder.create_network(EXPLICIT_BATCH) as network, \
            trt.OnnxParser(network, TRT_LOGGER) as parser, \
            builder.create_builder_config() as config:
        # Offload the network to DLA; layers DLA cannot run fall back to the GPU.
        # The deprecated builder.max_workspace_size, builder.max_batch_size, and
        # builder.fp16_mode are replaced by the builder-config settings below
        # (max_batch_size is ignored for explicit-batch networks anyway).
        config.max_workspace_size = 1 << 28
        config.set_flag(trt.BuilderFlag.GPU_FALLBACK)
        config.set_flag(trt.BuilderFlag.FP16)  # DLA requires FP16 (or INT8) precision
        config.default_device_type = trt.DeviceType.DLA
        config.DLA_core = 0
        print('Loading ONNX file from path {}...'.format(onnx_file_path))
        with open(onnx_file_path, 'rb') as model:
            print('Beginning ONNX file parsing')
            if not parser.parse(model.read()):
                print("errors:")
                for error in range(parser.num_errors):
                    print(parser.get_error(error))
                return None
        print('Completed parsing of ONNX file')
        # build_cuda_engine(network) ignores the builder config, so the DLA
        # settings above would never take effect; pass the config explicitly.
        engine = builder.build_engine(network, config)
        print("Completed creating Engine")
        with open(engine_file_path, "wb") as f:
            f.write(engine.serialize())
        return engine
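
On the inference side, no extra DLA configuration should be needed beyond choosing which DLA core the engine is deserialized onto, since the DLA/GPU layer placement is baked into the serialized engine at build time. Here is a minimal sketch of loading it, assuming the TensorRT 8.x Python API and the engine file written above (load_engine is a hypothetical helper name):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)

def load_engine(engine_file_path, dla_core=0):
    """Deserialize a DLA-enabled engine and pin it to a specific DLA core."""
    with open(engine_file_path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        # The layer placement was fixed when the engine was built; at load
        # time we only select which DLA core executes the DLA layers.
        runtime.DLA_core = dla_core
        return runtime.deserialize_cuda_engine(f.read())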

Hi,

You can check this through the TensorRT build log: layer placement is printed when the logger verbosity is INFO or higher.

For example:

TRT_LOGGER = trt.Logger(trt.Logger.INFO)

def build_engine():
    """Takes an ONNX file and creates a TensorRT engine to run inference with"""
    EXPLICIT_BATCH = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(TRT_LOGGER) as builder, \
    ...

You should then see detailed layer-placement information like the following:

[07/12/2021-13:54:22] [I] [TRT] --------------- Layers running on DLA:
[07/12/2021-13:54:22] [I] [TRT] {Convolution28}, {ReLU32,Pooling66,Convolution110}, {ReLU114,Pooling160}, {Plus214},
[07/12/2021-13:54:22] [I] [TRT] --------------- Layers running on GPU:
[07/12/2021-13:54:22] [I] [TRT] (Unnamed Layer* 0) [Constant] + Times212_reshape1, (Unnamed Layer* 16) [Constant] + shuffle_(Unnamed Layer* 16) [Constant]_output, (Unnamed Layer* 3) [Constant] + (Unnamed Layer* 4) [Shuffle] + Plus30, (Unnamed Layer* 9) [Constant] + (Unnamed Layer* 10) [Shuffle] + Plus112, Times212_reshape0, Times212, shuffle_Times212_Output_0,

Thanks.
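
The placement can also be checked programmatically before building: IBuilderConfig exposes can_run_on_DLA() in the Python API. Here is a minimal sketch, assuming the network and config objects from the build function above (report_dla_placement is a hypothetical helper name):

import tensorrt as trt

def report_dla_placement(network, config):
    """Print, for each layer, whether the current config can place it on DLA."""
    # Assumes config.default_device_type is trt.DeviceType.DLA and the FP16
    # (or INT8) flag has already been set, as in build_engine() above.
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        target = "DLA" if config.can_run_on_DLA(layer) else "GPU (fallback)"
        print("{:<40s} -> {}".format(layer.name, target))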
