DLA_STANDALONE error in forceToUseNvmIO

Jhappy · October 19, 2022, 3:16am

Hi， I‘m trying to generate a DLA loadable alone for Resnet18.I get this error when I set config.engine_capability = trt.EngineCapability.DLA_STANDALONE:

[TRT] [E] 2: [nvmRegionOptimizer.cpp::forceToUseNvmIO::175] Error Code 2: Internal Error (Assertion std::all_of(a->consumers.begin(), a->consumers.end(), (Node* n) { return isDLA(n->backend); }) failed. )

Version：
Xavier NX
Jetpack 5.0.4

kayccc · October 19, 2022, 3:35am

It should be JetPack 5.0.2, right?

Jhappy · October 19, 2022, 5:14am

Yes,JetPack 5.0.2. I get it wrong

SivaRamaKrishnaNV · October 19, 2022, 10:58am

Dear @Jhappy,
Have you tried using trtexec to get DLA engine?

Jhappy · October 20, 2022, 1:09am

Yes, I sucess to get a DLA trt plan with trtexec.
I alse tried polygraphy tool :

success : polygraphy convert -v resnet18_cut_sim.onnx --int8 -o temp.engine --use-dla
failed : polygraphy convert -v resnet18_cut_sim.onnx --int8 -o temp.engine --use-dla --engine-capability DLA_STANDALONE

Jhappy · October 24, 2022, 2:21am

Dear @SivaRamaKrishnaNV .
Do you get any idea of this error log?

AastaLLL · November 17, 2022, 6:30am

Hi

We don’t have --engine-capability DLA_STANDALONE configure for v8.4.1 polygraphy.
Have you used the configuration on other platforms before?

$ polygraphy convert -h
usage: polygraphy convert [-h] [-v] [-q] [--silent] [--log-format {timestamp,line-info,no-colors} [{timestamp,line-info,no-colors} ...]] [--log-file LOG_FILE]
                          [--model-type {frozen,keras,ckpt,onnx,engine,uff,trt-network-script,caffe}] [--input-shapes INPUT_SHAPES [INPUT_SHAPES ...]] [--ckpt CKPT]
                          [--tf-outputs TF_OUTPUTS [TF_OUTPUTS ...]] [--freeze-graph] [--opset OPSET] [--no-const-folding] [--shape-inference] [--external-data-dir EXTERNAL_DATA_DIR]
                          [--onnx-outputs ONNX_OUTPUTS [ONNX_OUTPUTS ...]] [--onnx-exclude-outputs ONNX_EXCLUDE_OUTPUTS [ONNX_EXCLUDE_OUTPUTS ...]] [--save-external-data [EXTERNAL_DATA_PATH]]
                          [--external-data-size-threshold EXTERNAL_DATA_SIZE_THRESHOLD] [--no-save-all-tensors-to-one-file] [--seed SEED] [--val-range VAL_RANGE [VAL_RANGE ...]] [--int-min INT_MIN]
                          [--int-max INT_MAX] [--float-min FLOAT_MIN] [--float-max FLOAT_MAX] [--iterations NUM] [--load-inputs LOAD_INPUTS_PATHS [LOAD_INPUTS_PATHS ...] | --data-loader-script
                          DATA_LOADER_SCRIPT] [--data-loader-func-name DATA_LOADER_FUNC_NAME] [--trt-min-shapes TRT_MIN_SHAPES [TRT_MIN_SHAPES ...]] [--trt-opt-shapes TRT_OPT_SHAPES [TRT_OPT_SHAPES ...]]
                          [--trt-max-shapes TRT_MAX_SHAPES [TRT_MAX_SHAPES ...]] [--tf32] [--fp16] [--int8] [--precision-constraints {prefer,obey,none} | --obey-precision-constraints | --strict-types]
                          [--sparse-weights] [--workspace BYTES] [--calibration-cache CALIBRATION_CACHE] [--calib-base-cls CALIBRATION_BASE_CLASS] [--quantile QUANTILE]
                          [--regression-cutoff REGRESSION_CUTOFF] [--timing-cache TIMING_CACHE] [--load-timing-cache LOAD_TIMING_CACHE] [--save-tactics SAVE_TACTICS | --load-tactics LOAD_TACTICS]
                          [--tactic-sources [TACTIC_SOURCES [TACTIC_SOURCES ...]]] [--trt-config-script TRT_CONFIG_SCRIPT] [--trt-config-func-name TRT_CONFIG_FUNC_NAME] [--trt-safety-restricted]
                          [--use-dla] [--allow-gpu-fallback] [--pool-limit [MEMORY_POOL_LIMIT [MEMORY_POOL_LIMIT ...]]] [--plugins PLUGINS [PLUGINS ...]] [--explicit-precision]
                          [--trt-outputs TRT_OUTPUTS [TRT_OUTPUTS ...]] [--trt-exclude-outputs TRT_EXCLUDE_OUTPUTS [TRT_EXCLUDE_OUTPUTS ...]] [--trt-network-func-name TRT_NETWORK_FUNC_NAME]
                          [--save-timing-cache SAVE_TIMING_CACHE] -o OUTPUT [--convert-to {onnx,trt,onnx-like-trt-network}] [--fp-to-fp16]
                          model_file

Convert models to other formats.

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Path to save the converted model
  --convert-to {onnx,trt,onnx-like-trt-network}
                        The format to attempt to convert the model to.'onnx-like-trt-network' is EXPERIMETNAL and converts a TensorRT network to a format usable for visualization. See
                        'OnnxLikeFromNetwork' for details.

Logging:
  Options related to logging and debug output

  -v, --verbose         Increase logging verbosity. Specify multiple times for higher verbosity
  -q, --quiet           Decrease logging verbosity. Specify multiple times for lower verbosity
  --silent              Disable all output
  --log-format {timestamp,line-info,no-colors} [{timestamp,line-info,no-colors} ...]
                        Format for log messages: {{'timestamp': Include timestamp, 'line-info': Include file and line number, 'no-colors': Disable colors}}
  --log-file LOG_FILE   Path to a file where Polygraphy logging output should be written. This will not include logging output from dependencies, like TensorRT or ONNX-Runtime.

Model:
  Options related to the model

  model_file            Path to the model
  --model-type {frozen,keras,ckpt,onnx,engine,uff,trt-network-script,caffe}
                        The type of the input model: {{'frozen': TensorFlow frozen graph, 'keras': Keras model, 'ckpt': TensorFlow checkpoint directory, 'onnx': ONNX model, 'engine': TensorRT engine,
                        'trt-network-script': A Python script that defines a `load_network` function that takes no arguments and returns a TensorRT Builder, Network, and optionally Parser, 'uff': UFF
                        file [deprecated], 'caffe': Caffe prototxt [deprecated]}}
  --input-shapes INPUT_SHAPES [INPUT_SHAPES ...], --inputs INPUT_SHAPES [INPUT_SHAPES ...]
                        Model input(s) and their shape(s). Used to determine shapes to use while generating input data for inference. Format: --input-shapes-shapes <name>:<shape>. For example: --input-
                        shapes-shapes image:[1,3,224,224] other_input:[10]

TensorFlow Model Loading:
  Options related to loading TensorFlow models.

  --ckpt CKPT           [EXPERIMENTAL] Name of the checkpoint to load. Required if the `checkpoint` file is missing. Should not include file extension (e.g. to load `model.meta` use `--ckpt=model`)
  --tf-outputs TF_OUTPUTS [TF_OUTPUTS ...]
                        Name(s) of TensorFlow output(s). Using '--tf-outputs mark all' indicates that all tensors should be used as outputs
  --freeze-graph        [EXPERIMENTAL] Attempt to freeze the graph

TensorFlow-ONNX Model Conversion:
  Options related to converting TensorFlow models to ONNX.

  --opset OPSET         Opset to use when converting to ONNX
  --no-const-folding    [DEPRECATED] Do not fold constants in the TensorFlow graph prior to conversion

ONNX Shape Inference:
  Options related to ONNX shape inference.

  --shape-inference     Enable ONNX shape inference when loading the model

ONNX Model Loading:
  Options related to loading ONNX models.

  --external-data-dir EXTERNAL_DATA_DIR, --load-external-data EXTERNAL_DATA_DIR, --ext EXTERNAL_DATA_DIR
                        Path to a directory containing external data for the model. Generally, this is only required if the external data is not stored in the model directory.
  --onnx-outputs ONNX_OUTPUTS [ONNX_OUTPUTS ...]
                        Name(s) of ONNX tensor(s) to mark as output(s). Using the special value 'mark all' indicates that all tensors should be used as outputs
  --onnx-exclude-outputs ONNX_EXCLUDE_OUTPUTS [ONNX_EXCLUDE_OUTPUTS ...]
                        [EXPERIMENTAL] Name(s) of ONNX output(s) to unmark as outputs.
  --fp-to-fp16          Convert all floating point tensors in an ONNX model to 16-bit precision. This is *not* needed in order to use TensorRT's fp16 precision, but may be useful for other backends.
                        Requires onnxmltools.

ONNX Model Saving:
  Options related to saving ONNX models.

  --save-external-data [EXTERNAL_DATA_PATH], --external-data-path [EXTERNAL_DATA_PATH]
                        Whether to save weight data in external file(s). To use a non-default path, supply the desired path as an argument. This is always a relative path; external data is always written
                        to the same directory as the model.
  --external-data-size-threshold EXTERNAL_DATA_SIZE_THRESHOLD
                        The size threshold, in bytes, above which tensor data will be stored in the external file. Tensors smaller that this threshold will remain in the ONNX file. Optionally, use a `K`,
                        `M`, or `G` suffix to indicate KiB, MiB, or GiB respectively. For example, `--external-data-size-threshold=16M` is equivalent to `--external-data-size-threshold=16777216`. Has no
                        effect if `--save-external-data` is not set.
  --no-save-all-tensors-to-one-file
                        Do not save all tensors to a single file when saving external data. Has no effect if `--save-external-data` is not set

Data Loader:
  Options related to loading or generating input data for inference.

  --seed SEED           Seed to use for random inputs
  --val-range VAL_RANGE [VAL_RANGE ...]
                        Range of values to generate in the data loader. To specify per-input ranges, use the format: --val-range <input_name>:[min,max]. If no input name is provided, the range is used
                        for any inputs not explicitly specified. For example: --val-range [0,1] inp0:[2,50] inp1:[3.0,4.6]
  --int-min INT_MIN     [DEPRECATED: Use --val-range] Minimum integer value for random integer inputs
  --int-max INT_MAX     [DEPRECATED: Use --val-range] Maximum integer value for random integer inputs
  --float-min FLOAT_MIN
                        [DEPRECATED: Use --val-range] Minimum float value for random float inputs
  --float-max FLOAT_MAX
                        [DEPRECATED: Use --val-range] Maximum float value for random float inputs
  --iterations NUM, --iters NUM
                        Number of inference iterations for which the default data loader should supply data
  --load-inputs LOAD_INPUTS_PATHS [LOAD_INPUTS_PATHS ...], --load-input-data LOAD_INPUTS_PATHS [LOAD_INPUTS_PATHS ...]
                        [EXPERIMENTAL] Path(s) to load inputs. The file(s) should be a JSON-ified List[Dict[str, numpy.ndarray]], i.e. a list where each element is the feed_dict for a single iteration.
                        When this option is used, all other data loader arguments are ignored.
  --data-loader-script DATA_LOADER_SCRIPT
                        Path to a Python script that defines a function that loads input data. The function should take no arguments and return a generator or iterable that yields input data (Dict[str,
                        np.ndarray]). When this option is used, all other data loader arguments are ignored.
  --data-loader-func-name DATA_LOADER_FUNC_NAME
                        When using a data-loader-script, this specifies the name of the function that loads data. Defaults to `load_data`.

TensorRT Builder Configuration:
  Options related to creating the TensorRT BuilderConfig.

  --trt-min-shapes TRT_MIN_SHAPES [TRT_MIN_SHAPES ...]
                        The minimum shapes the optimization profile(s) will support. Specify this option once for each profile. If not provided, inference-time input shapes are used. Format: --trt-min-
                        shapes <input0>:[D0,D1,..,DN] .. <inputN>:[D0,D1,..,DN]
  --trt-opt-shapes TRT_OPT_SHAPES [TRT_OPT_SHAPES ...]
                        The shapes for which the optimization profile(s) will be most performant. Specify this option once for each profile. If not provided, inference-time input shapes are used. Format:
                        --trt-opt-shapes <input0>:[D0,D1,..,DN] .. <inputN>:[D0,D1,..,DN]
  --trt-max-shapes TRT_MAX_SHAPES [TRT_MAX_SHAPES ...]
                        The maximum shapes the optimization profile(s) will support. Specify this option once for each profile. If not provided, inference-time input shapes are used. Format: --trt-max-
                        shapes <input0>:[D0,D1,..,DN] .. <inputN>:[D0,D1,..,DN]
  --tf32                Enable tf32 precision in TensorRT
  --fp16                Enable fp16 precision in TensorRT
  --int8                Enable int8 precision in TensorRT. If calibration is required but no calibration cache is provided, this option will cause TensorRT to run int8 calibration using the Polygraphy
                        data loader to provide calibration data.
  --precision-constraints {prefer,obey,none}
                        If set to `prefer`, TensorRT will restrict available tactics to layer precisions specified in the network unless no implementation exists with the preferred layer constraints, in
                        which case it will issue a warning and use the fastest available implementation. If set to `obey`, TensorRT will instead fail to build the network if no implementation exists with
                        the preferred layer constraints. Defaults to `none`
  --obey-precision-constraints
                        [DEPRECATED - use --precision-constraints] Enable enforcing precision constraints in TensorRT, forcing it to use tactics based on the layer precision set, even if another
                        precision is faster. Build fails if such an engine cannot be built.
  --strict-types        [DEPRECATED - use --precision-constraints] Enable preference for precision constraints and avoidance of I/O reformatting in TensorRT, and fall back to ignoring the request if such
                        an engine cannot be built.
  --sparse-weights      Enable optimizations for sparse weights in TensorRT
  --workspace BYTES     [DEPRECATED - use --pool-limit] Amount of memory, in bytes, to allocate for the TensorRT builder's workspace. Optionally, use a `K`, `M`, or `G` suffix to indicate KiB, MiB, or
                        GiB respectively. For example, `--workspace=16M` is equivalent to `--workspace=16777216`.
  --calibration-cache CALIBRATION_CACHE
                        Path to load/save a calibration cache. Used to store calibration scales to speed up the process of int8 calibration. If the provided path does not yet exist, int8 calibration
                        scales will be calculated and written to it during engine building. If the provided path does exist, it will be read and int8 calibration will be skipped during engine building.
  --calib-base-cls CALIBRATION_BASE_CLASS, --calibration-base-class CALIBRATION_BASE_CLASS
                        The name of the calibration base class to use. For example, 'IInt8MinMaxCalibrator'.
  --quantile QUANTILE   The quantile to use for IInt8LegacyCalibrator. Has no effect for other calibrator types.
  --regression-cutoff REGRESSION_CUTOFF
                        The regression cutoff to use for IInt8LegacyCalibrator. Has no effect for other calibrator types.
  --timing-cache TIMING_CACHE
                        [DEPRECATED - use --load-timing-cache/--save-timing-cache] Path to load/save tactic timing cache. Used to cache tactic timing information to speed up the engine building process.
                        Existing caches will be appended to with any new timing information gathered.
  --load-timing-cache LOAD_TIMING_CACHE
                        Path to load tactic timing cache. Used to cache tactic timing information to speed up the engine building process.
  --save-tactics SAVE_TACTICS
                        Path to save a Polygraphy tactic replay file. Details about tactics selected by TensorRT will be recorded and stored at this location as a JSON file.
  --load-tactics LOAD_TACTICS
                        Path to load a Polygraphy tactic replay file, such as one created by --save-tactics. The tactics specified in the file will be used to override TensorRT's default selections.
  --tactic-sources [TACTIC_SOURCES [TACTIC_SOURCES ...]]
                        Tactic sources to enable. This controls which libraries (e.g. cudnn, cublas, etc.) TensorRT is allowed to load tactics from. Values come from the names of the values in the
                        trt.TacticSource enum and are case-insensitive. If no arguments are provided, e.g. '--tactic-sources', then all tactic sources are disabled.
  --trt-config-script TRT_CONFIG_SCRIPT
                        Path to a Python script that defines a function that creates a TensorRT IBuilderConfig. The function should take a builder and network as parameters and return a TensorRT builder
                        configuration. When this option is specified, all other config arguments are ignored.
  --trt-config-func-name TRT_CONFIG_FUNC_NAME
                        When using a trt-config-script, this specifies the name of the function that creates the config. Defaults to `load_config`.
  --trt-safety-restricted
                        Enable safety scope checking in TensorRT
  --use-dla             [EXPERIMENTAL] Use DLA as the default device type
  --allow-gpu-fallback  [EXPERIMENTAL] Allow layers unsupported on the DLA to fall back to GPU. Has no effect if --dla is not set.
  --pool-limit [MEMORY_POOL_LIMIT [MEMORY_POOL_LIMIT ...]], --memory-pool-limit [MEMORY_POOL_LIMIT [MEMORY_POOL_LIMIT ...]]
                        Set memory pool limits. Memory pool names come from the names of values in the trt.MemoryPoolType enum and are case-insensitiveFormat: `--pool-limit <pool_name>:<pool_limit> ...`.
                        For example, `--pool-limit dla_local_dram:1e9 workspace:16777216`. Optionally, use a `K`, `M`, or `G` suffix to indicate KiB, MiB, or GiB respectively. For example, `--pool-limit
                        workspace:16M` is equivalent to `--pool-limit workspace:16777216`.

TensorRT Plugin Loading:
  Options related to loading TensorRT plugins.

  --plugins PLUGINS [PLUGINS ...]
                        Path(s) of plugin libraries to load

TensorRT Network Loading:
  Options related to loading TensorRT networks.

  --explicit-precision  [DEPRECATED] Enable explicit precision mode
  --trt-outputs TRT_OUTPUTS [TRT_OUTPUTS ...]
                        Name(s) of TensorRT output(s). Using '--trt-outputs mark all' indicates that all tensors should be used as outputs
  --trt-exclude-outputs TRT_EXCLUDE_OUTPUTS [TRT_EXCLUDE_OUTPUTS ...]
                        [EXPERIMENTAL] Name(s) of TensorRT output(s) to unmark as outputs.
  --trt-network-func-name TRT_NETWORK_FUNC_NAME
                        When using a trt-network-script instead of other model types, this specifies the name of the function that loads the network. Defaults to `load_network`.

TensorRT Engine:
  Options related to loading TensorRT engines.

  --save-timing-cache SAVE_TIMING_CACHE
                        Path to save tactic timing cache if building an engine. Existing caches will be appended to with any new timing information gathered.

TensorRT Engine Saving:
  Options related to saving TensorRT engines.

Thanks.

Jhappy · November 17, 2022, 9:10am

Thanks for reply.
This is my polygraphy output. There is –engine-capability:

usage: polygraphy convert [-h] [-v] [-q] [--verbosity VERBOSITY [VERBOSITY ...]] [--silent]
                          [--log-format {timestamp,line-info,no-colors} [{timestamp,line-info,no-colors} ...]] [--log-file LOG_FILE]
                          [--model-type {frozen,keras,ckpt,onnx,engine,uff,trt-network-script,caffe}] [--input-shapes INPUT_SHAPES [INPUT_SHAPES ...]]
                          [--ckpt CKPT] [--tf-outputs TF_OUTPUTS [TF_OUTPUTS ...]] [--freeze-graph] [--opset OPSET] [--shape-inference]
                          [--no-onnxruntime-shape-inference] [--external-data-dir EXTERNAL_DATA_DIR] [--ignore-external-data]
                          [--onnx-outputs ONNX_OUTPUTS [ONNX_OUTPUTS ...]] [--onnx-exclude-outputs ONNX_EXCLUDE_OUTPUTS [ONNX_EXCLUDE_OUTPUTS ...]]
                          [--save-external-data [EXTERNAL_DATA_PATH]] [--external-data-size-threshold EXTERNAL_DATA_SIZE_THRESHOLD]
                          [--no-save-all-tensors-to-one-file] [--seed SEED] [--val-range VAL_RANGE [VAL_RANGE ...]] [--int-min INT_MIN] [--int-max INT_MAX]
                          [--float-min FLOAT_MIN] [--float-max FLOAT_MAX] [--iterations NUM] [--load-inputs LOAD_INPUTS_PATHS [LOAD_INPUTS_PATHS ...] |
                          --data-loader-script DATA_LOADER_SCRIPT] [--data-loader-func-name DATA_LOADER_FUNC_NAME]
                          [--trt-min-shapes TRT_MIN_SHAPES [TRT_MIN_SHAPES ...]] [--trt-opt-shapes TRT_OPT_SHAPES [TRT_OPT_SHAPES ...]]
                          [--trt-max-shapes TRT_MAX_SHAPES [TRT_MAX_SHAPES ...]] [--tf32] [--fp16] [--int8]
                          [--precision-constraints {prefer,obey,none} | --strict-types] [--sparse-weights] [--workspace BYTES]
                          [--calibration-cache CALIBRATION_CACHE] [--calib-base-cls CALIBRATION_BASE_CLASS] [--quantile QUANTILE]
                          [--regression-cutoff REGRESSION_CUTOFF] [--timing-cache TIMING_CACHE] [--load-timing-cache LOAD_TIMING_CACHE]
                          [--save-tactics SAVE_TACTICS | --load-tactics LOAD_TACTICS] [--tactic-sources [TACTIC_SOURCES [TACTIC_SOURCES ...]]]
                          [--trt-config-script TRT_CONFIG_SCRIPT] [--trt-config-func-name TRT_CONFIG_FUNC_NAME] [--trt-safety-restricted] [--refittable]
                          [--use-dla] [--allow-gpu-fallback] [--pool-limit MEMORY_POOL_LIMIT [MEMORY_POOL_LIMIT ...]]
                          [--preview-features PREVIEW_FEATURES [PREVIEW_FEATURES ...]] [--engine-capability ENGINE_CAPABILITY] [--direct-io]
                          [--plugins PLUGINS [PLUGINS ...]] [--trt-outputs TRT_OUTPUTS [TRT_OUTPUTS ...]]
                          [--trt-exclude-outputs TRT_EXCLUDE_OUTPUTS [TRT_EXCLUDE_OUTPUTS ...]] [--layer-precisions LAYER_PRECISIONS [LAYER_PRECISIONS ...]]
                          [--tensor-dtypes TENSOR_DTYPES [TENSOR_DTYPES ...]] [--tensor-formats TENSOR_FORMATS [TENSOR_FORMATS ...]]
                          [--trt-network-func-name TRT_NETWORK_FUNC_NAME] [--save-timing-cache SAVE_TIMING_CACHE] -o OUTPUT
                          [--convert-to {onnx,trt,onnx-like-trt-network}] [--fp-to-fp16]
                          model_file

The output version is:

$ polygraphy --version
Polygraphy | Version: 0.42.2

Do you mean tensorrt v8.4.1 have no --engine-capability DLA_STANDALONE` configure ?

AastaLLL · November 18, 2022, 2:58am

Hi,

Which TensorRT OSS branch do you use?
We test the 8.4.1 for compatibility of JetPack 5.0.2 TensorRT.

Thanks.

Jhappy · November 18, 2022, 5:19am

Hi~
I clone TensorRT OSS from master branch at commit 7d2a70a18bb0e5…
Thanks~

AastaLLL · November 21, 2022, 5:40am

Hi,

Since JetPack includes TensorRT 8.4, please use the same branches instead.
For a newer version(ex. v8.5), please wait for our future release.

Thanks.

Jhappy · November 22, 2022, 5:23am

Thanks .I will try that.
In the other hand ,I try to use python api to convert DLA_STANDALONE by

config.engine_capability = trt.EngineCapability.DLA_STANDALONE

I meet the same error.
Do you have any example file for that?I cannot find in the document.

SivaRamaKrishnaNV · November 24, 2022, 4:53pm

It seems you want to generate only DLA compatible model(no layer should be offloaded to GPU). The Error indicate, your model has non DLA compatible layers. Could you share the used trtexec command and trtexec output log for confirmation?

Jhappy · December 1, 2022, 9:55am

Sorry for my late reply. These days I cannot access the jetson at company.So I can not upload the log. In my model I cut all layer in resnet18 but the 1st convolution.

Yes I want to generate only DLA compatible model(no layer should be offloaded to GPU).I succeed use trtexec to get DLA trt plan(with allow fallback to gpu).
But I failed using polygraphy and PYTHON script togenerate only DLA compatible model. As @AastaLLL said maybe it’s caused by my version mismatch.I will try that after I can access the jetson box.

system · December 28, 2022, 2:03am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.

ramc · February 9, 2023, 4:24pm

Also check out the DLA github page for samples and resources or to report issues: Recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications.

We have a FAQ page that addresses some common questions that we see developers run into: Deep-Learning-Accelerator-SW/FAQ

Topic		Replies	Views
DLA performance DeepStream SDK	17	152	September 23, 2024
Process killed during tensorrt conversion on Jetson orin NX (8 GB) Jetson Orin NX tensorrt	15	741	April 30, 2024
Xavier NX does not support adaptative average pooling on DLA? Jetson Xavier NX tensorrt	27	1138	October 11, 2023
Engine creation fails when using DLA with GPU fallback Jetson AGX Xavier tensorrt , dla	11	1994	March 22, 2022
Cannot build a TensorRT engine for DLA from a large ONNX file Jetson Xavier NX tensorrt , nvbugs , dla	12	2625	July 21, 2021
Tensorrt Python API has a bug in DLA usage Jetson AGX Xavier tensorrt	11	634	August 17, 2022
TensorRT model inference fully on DLA is slow due to abnormally slow cudaEventSynchronize time Jetson AGX Orin tensorrt , cuda , dla	10	1546	January 17, 2024
Using dla on orin nx meet an error Jetson AGX Xavier dla	9	143	September 8, 2024
Model onnx trt engine generation process report different results compared between PC and jetson XAVIER NX Jetson Xavier NX tensorrt	19	1023	September 28, 2022
How can I customize matrix multiplication on DLA Jetson AGX Orin dla	12	203	September 25, 2024

DLA_STANDALONE error in forceToUseNvmIO

Related topics