
**Software Version**

DRIVE OS Linux 5.1.6

**Target Operating System**

Custom Debian Linux

**Hardware Platform**

NVIDIA DRIVE™ AGX Xavier DevKit (E3550)

**SDK Manager Version**

1.8.0.10363

## Description:

Running the sample works without utilizing the DLAs. Passing `--useDLACore=0` does not work and ends with the following error:

```
NVMEDIA_DLA : 528, ERROR: load from memory failed.
[E] [TRT] dla/dlaUtils.cpp (171) - DLA Error in deserialize: 7 (Failure to load program.)
[E] [TRT] dla/dlaUtils.cpp (171) - DLA Error in deserialize: 7 (Failure to load program.)
```
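For context, the sample enables the DLA path through the `samplesCommon::enableDLA()` helper shipped in the samples' `common.h`. A minimal sketch of the equivalent TensorRT 5.x builder calls (assuming `builder` is the sample's `nvinfer1::IBuilder*` and `dlaCore` holds the value parsed from `--useDLACore`) looks like this:

```cpp
#include "NvInfer.h"

// Minimal sketch of how the TensorRT 5.x samples route a network to a DLA
// core (mirrors samplesCommon::enableDLA() from the samples' common.h).
void enableDlaSketch(nvinfer1::IBuilder* builder, int dlaCore)
{
    if (dlaCore >= 0)
    {
        // Let layers that DLA cannot run fall back to the GPU
        // (the fallback warnings in the log below come from this).
        builder->allowGPUFallback(true);

        // DLA only supports INT8 or FP16; with --int8 the sample keeps
        // INT8 mode, otherwise it would force FP16.
        if (!builder->getInt8Mode())
            builder->setFp16Mode(true);

        builder->setDefaultDeviceType(nvinfer1::DeviceType::kDLA);
        builder->setDLACore(dlaCore); // 0 in the run below (--useDLACore=0)
        builder->setStrictTypeConstraints(true);
    }
}
```

Judging by the log, this configuration is accepted (layer placement and tactic timing complete), and the failure only surfaces at the end of the build when the compiled loadable is handed to NvMedia DLA.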

## Environment

**TensorRT Version**: 5.1.5

**NVIDIA GPU**: NVIDIA Volta™-class integrated GPU

**CUDA Version**: 10.1

**CUDNN Version**: 7.5.1

## Relevant Files

The samples come from NVIDIA's official TensorRT 5.1 tarball (https://developer.nvidia.com/nvidia-tensorrt-5x-download), together with the relevant data files.

## Steps To Reproduce

```
./sample_onnx_mnist --datadir=/root/tensorrt_sample/mnist_data --int8 --useDLACore=0
```

Full output:

```
&&&& RUNNING TensorRT.sample_onnx_mnist # ./sample_onnx_mnist --datadir=/root/tensorrt_sample/mnist_data --int8 --useDLACore=0

[I] Building and running a GPU inference engine for Onnx MNIST

Input filename: /root/tensorrt_sample/mnist_data/mnist.onnx

ONNX IR version: 0.0.3

Opset version: 1

Producer name: CNTK

Producer version: 2.4

Domain:

Model version: 1

Doc string:

[I] [TRT] Parameter193:Constant → (16, 4, 4, 10)

[I] [TRT] Parameter193_reshape1:Reshape → (256, 10)

[I] [TRT] Parameter6:Constant → (8)

[I] [TRT] Parameter5:Constant → (8, 1, 5, 5)

[I] [TRT] Convolution28_Output_0:Conv → (8, 28, 28)

[I] [TRT] Plus30_Output_0:Add → (8, 28, 28)

[I] [TRT] ReLU32_Output_0:Relu → (8, 28, 28)

[I] [TRT] Pooling66_Output_0:MaxPool → (8, 14, 14)

[I] [TRT] Parameter87:Constant → (16, 8, 5, 5)

[I] [TRT] Convolution110_Output_0:Conv → (16, 14, 14)

[I] [TRT] Parameter88:Constant → (16)

[I] [TRT] Plus112_Output_0:Add → (16, 14, 14)

[I] [TRT] ReLU114_Output_0:Relu → (16, 14, 14)

[I] [TRT] Pooling160_Output_0:MaxPool → (16, 4, 4)

[I] [TRT] Pooling160_Output_0_reshape0:Reshape → (256)

[I] [TRT] Times212_Output_0:MatMul → (10)

[I] [TRT] Parameter194:Constant → (1, 10)

[I] [TRT] Plus214_Output_0:Add → (10)

----- Parsing of ONNX model /root/tensorrt_sample/mnist_data/mnist.onnx is Done ----

[I] [TRT] Setting dynamic range for Input3 to [-127,127]

[I] [TRT] Setting dynamic range for Convolution28_Output_0 to [-127,127]

[I] [TRT] Setting dynamic range for (Unnamed Layer* 1) [Constant]_output to [-127,127]

[I] [TRT] Setting dynamic range for Plus30_Output_0 to [-127,127]

[I] [TRT] Setting dynamic range for ReLU32_Output_0 to [-127,127]

[I] [TRT] Setting dynamic range for Pooling66_Output_0 to [-127,127]

[I] [TRT] Setting dynamic range for Convolution110_Output_0 to [-127,127]

[I] [TRT] Setting dynamic range for (Unnamed Layer* 6) [Constant]_output to [-127,127]

[I] [TRT] Setting dynamic range for Plus112_Output_0 to [-127,127]

[I] [TRT] Setting dynamic range for ReLU114_Output_0 to [-127,127]

[I] [TRT] Setting dynamic range for Pooling160_Output_0 to [-127,127]

[I] [TRT] Setting dynamic range for Pooling160_Output_0_reshape0 to [-127,127]

[I] [TRT] Setting dynamic range for (Unnamed Layer* 11) [Constant]_output to [-127,127]

[I] [TRT] Setting dynamic range for Times212_Output_0 to [-127,127]

[I] [TRT] Setting dynamic range for (Unnamed Layer* 13) [Constant]_output to [-127,127]

[I] [TRT] Setting dynamic range for Plus214_Output_0 to [-127,127]

[W] [TRT] Default DLA is enabled but layer (Unnamed Layer* 1) [Constant] is not running on DLA, falling back to GPU.

[W] [TRT] Default DLA is enabled but layer (Unnamed Layer* 6) [Constant] is not running on DLA, falling back to GPU.

[W] [TRT] Default DLA is enabled but layer (Unnamed Layer* 10) [Shuffle] is not running on DLA, falling back to GPU.

[W] [TRT] Default DLA is enabled but layer (Unnamed Layer* 11) [Constant] is not running on DLA, falling back to GPU.

[W] [TRT] Default DLA is enabled but layer (Unnamed Layer* 12) [Matrix Multiply] is not running on DLA, falling back to GPU.

[W] [TRT] Default DLA is enabled but layer (Unnamed Layer* 13) [Constant] is not running on DLA, falling back to GPU.

[I] [TRT]

[I] [TRT] --------------- Layers running on DLA:

[I] [TRT] (Unnamed Layer* 0) [Convolution], (Unnamed Layer* 2) [ElementWise], (Unnamed Layer* 3) [Activation], (Unnamed Layer* 4) [Pooling], (Unnamed Layer* 5) [Convolution], (Unnamed Layer* 7) [ElementWise], (Unnamed Layer* 8) [Activation], (Unnamed Layer* 9) [Pooling], (Unnamed Layer* 14) [ElementWise],

[I] [TRT] --------------- Layers running on GPU:

[I] [TRT] (Unnamed Layer* 1) [Constant], (Unnamed Layer* 6) [Constant], (Unnamed Layer* 10) [Shuffle], (Unnamed Layer* 11) [Constant], (Unnamed Layer* 12) [Matrix Multiply], (Unnamed Layer* 13) [Constant],

[W] [TRT] Calibrator is not being used. Users must provide dynamic range for all tensors that are not Int32.

[I] [TRT] [INT8 Quantization] User overriding Scales: Input3 [1]

[I] [TRT] [INT8 Quantization] User overriding Scales: Convolution28_Output_0 [1]

[I] [TRT] [INT8 Quantization] User overriding Scales: (Unnamed Layer* 1) [Constant]_output [1]

[I] [TRT] [INT8 Quantization] User overriding Scales: Plus30_Output_0 [1]

[I] [TRT] [INT8 Quantization] User overriding Scales: ReLU32_Output_0 [1]

[I] [TRT] [INT8 Quantization] User overriding Scales: Pooling66_Output_0 [1]

[I] [TRT] [INT8 Quantization] User overriding Scales: Convolution110_Output_0 [1]

[I] [TRT] [INT8 Quantization] User overriding Scales: (Unnamed Layer* 6) [Constant]_output [1]

[I] [TRT] [INT8 Quantization] User overriding Scales: Plus112_Output_0 [1]

[I] [TRT] [INT8 Quantization] User overriding Scales: ReLU114_Output_0 [1]

[I] [TRT] [INT8 Quantization] User overriding Scales: Pooling160_Output_0 [1]

[I] [TRT] [INT8 Quantization] User overriding Scales: Pooling160_Output_0_reshape0 [1]

[I] [TRT] [INT8 Quantization] User overriding Scales: (Unnamed Layer* 11) [Constant]_output [1]

[I] [TRT] [INT8 Quantization] User overriding Scales: Times212_Output_0 [1]

[I] [TRT] [INT8 Quantization] User overriding Scales: (Unnamed Layer* 13) [Constant]_output [1]

[I] [TRT] [INT8 Quantization] User overriding Scales: Plus214_Output_0 [1]

[I] [TRT] [INT8 Quantization] INT8 Inference Tensor Scales: Input3 [1]

[I] [TRT] [INT8 Quantization] INT8 Inference Tensor Scales: Convolution28_Output_0 [1]

[I] [TRT] [INT8 Quantization] INT8 Inference Tensor Scales: (Unnamed Layer* 1) [Constant]_output [1]

[I] [TRT] [INT8 Quantization] INT8 Inference Tensor Scales: Plus30_Output_0 [1]

[I] [TRT] [INT8 Quantization] INT8 Inference Tensor Scales: ReLU32_Output_0 [1]

[I] [TRT] [INT8 Quantization] INT8 Inference Tensor Scales: Pooling66_Output_0 [1]

[I] [TRT] [INT8 Quantization] INT8 Inference Tensor Scales: Convolution110_Output_0 [1]

[I] [TRT] [INT8 Quantization] INT8 Inference Tensor Scales: (Unnamed Layer* 6) [Constant]_output [1]

[I] [TRT] [INT8 Quantization] INT8 Inference Tensor Scales: Plus112_Output_0 [1]

[I] [TRT] [INT8 Quantization] INT8 Inference Tensor Scales: ReLU114_Output_0 [1]

[I] [TRT] [INT8 Quantization] INT8 Inference Tensor Scales: Pooling160_Output_0 [1]

[I] [TRT] [INT8 Quantization] INT8 Inference Tensor Scales: Pooling160_Output_0_reshape0 [1]

[I] [TRT] [INT8 Quantization] INT8 Inference Tensor Scales: (Unnamed Layer* 11) [Constant]_output [1]

[I] [TRT] [INT8 Quantization] INT8 Inference Tensor Scales: Times212_Output_0 [1]

[I] [TRT] [INT8 Quantization] INT8 Inference Tensor Scales: (Unnamed Layer* 13) [Constant]_output [1]

[I] [TRT] [INT8 Quantization] INT8 Inference Tensor Scales: Plus214_Output_0 [1]

[I] [TRT] Original: 15 layers

[I] [TRT] After dead-layer removal: 15 layers

[I] [TRT] After DLA optimization: 13 layers

[I] [TRT] After scale fusion: 13 layers

[I] [TRT] After vertical fusions: 13 layers

[I] [TRT] After swap: 13 layers

[I] [TRT] After final dead-layer removal: 13 layers

[I] [TRT] After tensor merging: 13 layers

[I] [TRT] After concat removal: 13 layers

[I] [TRT] Configuring builder for Int8 Mode completed in 0.0084503 seconds.

[I] [TRT] Graph construction and optimization completed in 0.00888536 seconds.

[W] [TRT] Warning: no implementation of (Unnamed Layer* 1) [Constant] obeys the requested constraints, using a higher precision type

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.006912

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.006976

[W] [TRT] Warning: no implementation of (Unnamed Layer* 6) [Constant] obeys the requested constraints, using a higher precision type

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.00544

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.00736

[W] [TRT] Warning: no implementation of (Unnamed Layer* 11) [Constant] obeys the requested constraints, using a higher precision type

[W] [TRT] Warning: no implementation of (Unnamed Layer* 13) [Constant] obeys the requested constraints, using a higher precision type

[I] [TRT]

[I] [TRT] --------------- Timing Input3 to nvm(9)

[I] [TRT] Tactic 0 time 0.006944

[I] [TRT]

[I] [TRT] --------------- Timing {(Unnamed Layer* 0) [Convolution]}(31)

[I] [TRT] Tactic 548859524883 is the only option, timing skipped

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.008832

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.00736

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.007168

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.011136

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.00688

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.007232

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.008832

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.007136

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.007232

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.007136

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.005248

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.00512

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.008832

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.007136

[I] [TRT]

[I] [TRT] --------------- Timing (Unnamed Layer* 2) [ElementWise]

[I] [TRT] Tactic 1 time 0.009536

[I] [TRT] Tactic 2 time 0.01232

[I] [TRT]

[I] [TRT] --------------- Timing (Unnamed Layer* 2) [ElementWise]

[I] [TRT] Tactic 1 time 0.008896

[I] [TRT]

[I] [TRT] --------------- Timing (Unnamed Layer* 2) [ElementWise]

[I] [TRT] Tactic 1 time 0.009152

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.007008

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.0104

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.007936

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.006976

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.008768

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.006944

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.0112

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.007008

[I] [TRT]

[I] [TRT] --------------- Timing {(Unnamed Layer* 3) [Activation],(Unnamed Layer* 4) [Pooling],(Unnamed Layer* 5) [Convolution]}(31)

[I] [TRT] Tactic 548859524883 is the only option, timing skipped

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.009344

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.006944

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.0072

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.00864

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.00688

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.005216

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.008608

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.007136

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.006944

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.007488

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.005344

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.005408

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.009088

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.005248

[I] [TRT]

[I] [TRT] --------------- Timing (Unnamed Layer* 7) [ElementWise]

[I] [TRT] Tactic 1 time 0.009216

[I] [TRT] Tactic 2 time 0.009184

[I] [TRT]

[I] [TRT] --------------- Timing (Unnamed Layer* 7) [ElementWise]

[I] [TRT] Tactic 1 time 0.008608

[I] [TRT]

[I] [TRT] --------------- Timing (Unnamed Layer* 7) [ElementWise]

[I] [TRT] Tactic 1 time 0.00864

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.006944

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.009344

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.007264

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.007232

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.009216

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.005344

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.008896

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.006976

[I] [TRT]

[I] [TRT] --------------- Timing {(Unnamed Layer* 8) [Activation],(Unnamed Layer* 9) [Pooling]}(31)

[I] [TRT] Tactic 548859524883 is the only option, timing skipped

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.009152

[I] [TRT]

[I] [TRT] --------------- Timing (9)

[I] [TRT] Tactic 0 time 0.0088

[I] [TRT]

[I] [TRT] --------------- Timing (Unnamed Layer* 10) [Shuffle]

[I] [TRT] Tactic 0 is the only option, timing skipped

[W] [TRT] Warning: no implementation of (Unnamed Layer* 10) [Shuffle] obeys the requested constraints, using a higher precision type

[I] [TRT]

[I] [TRT] --------------- Timing (Unnamed Layer* 12) [Matrix Multiply]

[I] [TRT] Tactic 0 is the only option, timing skipped

[W] [TRT] Warning: no implementation of (Unnamed Layer* 12) [Matrix Multiply] obeys the requested constraints, using a higher precision type

[I] [TRT]

[I] [TRT] --------------- Timing (Unnamed Layer* 14) [ElementWise]

[I] [TRT] Tactic 1 is the only option, timing skipped

[W] [TRT] Warning: no implementation of (Unnamed Layer* 14) [ElementWise] obeys the requested constraints, using a higher precision type

[I] [TRT] Adding reformat layer: (Unnamed Layer* 1) [Constant] output to be reformatted 0 ((Unnamed Layer* 1) [Constant]_output) from Int8(1,1,1:32,1) to Float(1,1,1,8)

[I] [TRT] Adding reformat layer: (Unnamed Layer* 6) [Constant] output to be reformatted 0 ((Unnamed Layer* 6) [Constant]_output) from Int8(1,1,1:32,1) to Float(1,1,1,16)

[I] [TRT] Adding reformat layer: (Unnamed Layer* 10) [Shuffle] reformatted input 0 (Pooling160_Output_0) from Int8(1,4,16:32,16) to Float(1,4,16,256)

[I] [TRT] Formats and tactics selection completed in 3.93059 seconds.

[I] [TRT] After reformat layers: 22 layers

[I] [TRT] Block size 16777216

[I] [TRT] Block size 25088

[I] [TRT] Block size 1024

[I] [TRT] Block size 512

[I] [TRT] Total Activation Memory: 16803840

[I] [TRT] Detected 1 input and 1 output network tensors.

NVMEDIA_DLA : 528, ERROR: load from memory failed.

[E] [TRT] dla/dlaUtils.cpp (171) - DLA Error in deserialize: 7 (Failure to load program.)

[E] [TRT] dla/dlaUtils.cpp (171) - DLA Error in deserialize: 7 (Failure to load program.)
```