TensorRT 5 on Xavier: core dumped when running on DLA (Solved)

Hi, I tried to run ResNet-50 with TensorRT on the DLA using the command below, but it crashes with a core dump.
Is it because some layers are not supported? Thanks.

./giexec --deploy=…/data/resnet/ResNet-50-deploy.prototxt --output=prob --fp16 --engine=resnet50.tensorcache --useDLA=1 --allowGPUFallback

I got this error output:
deploy: …/data/resnet/ResNet-50-deploy.prototxt
output: prob
fp16
engine: resnet50.tensorcache
useDLA: 1
allowGPUFallback
Input “data”: 3x224x224
Output “prob”: 1000x1x1
Default DLA is enabled but layer prob is not running on DLA, falling back to GPU.
name=data, bindingIndex=0, buffers.size()=2
name=prob, bindingIndex=1, buffers.size()=2
dla/eglUtils.cpp (497) - Cuda Error in cudaGfxRes2EngineTensor: 4
dla/eglUtils.cpp (253) - Cuda Error in releaseFrame: 4
terminate called after throwing an instance of ‘nvinfer1::CudaError’
what(): std::exception
./giexec: line 20: 6471 Aborted (core dumped) TRTEXEC_FPATH "@"

Also, when I switch back to the GPU with useDLA=0, I still get errors like:

deploy: …/data/resnet/ResNet-50-deploy.prototxt
output: prob
fp16
engine: resnet50.tensorcache
useDLA: 0
allowGPUFallback
Input “data”: 3x224x224
Output “prob”: 1000x1x1
name=data, bindingIndex=0, buffers.size()=2
name=prob, bindingIndex=1, buffers.size()=2
cuda/cudaSoftMaxLayer.cpp (110) - Cudnn Error in execute: 8
cuda/cudaSoftMaxLayer.cpp (110) - Cudnn Error in execute: 8
CUDA cask failure at execution for trt_volta_scudnn_128x64_relu_medium_nn_v1.
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
CUDA cask failure at execution for trt_volta_scudnn_128x64_relu_medium_nn_v1.
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
CUDA cask failure at execution for trt_volta_scudnn_128x64_relu_medium_nn_v1.
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
CUDA cask failure at execution for trt_volta_scudnn_128x64_relu_medium_nn_v1.
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
CUDA cask failure at execution for trt_volta_scudnn_128x64_relu_medium_nn_v1.
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
CUDA cask failure at execution for trt_volta_scudnn_128x64_relu_medium_nn_v1.
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
CUDA cask failure at execution for trt_volta_scudnn_128x64_relu_medium_nn_v1.
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
CUDA cask failure at execution for trt_volta_scudnn_128x64_relu_medium_nn_v1.
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
Average over 10 runs is 2.70531 ms (host walltime is 3.23131 ms, 99% percentile time is 27.0531).
CUDA cask failure at execution for trt_volta_scudnn_128x64_relu_medium_nn_v1.
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
CUDA cask failure at execution for trt_volta_scudnn_128x64_relu_medium_nn_v1.
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
CUDA cask failure at execution for trt_volta_scudnn_128x64_relu_medium_nn_v1.
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
CUDA cask failure at execution for trt_volta_scudnn_128x64_relu_medium_nn_v1.
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4
cuda/caskConvolutionLayer.cpp (241) - Cuda Error in execute: 4

Hi,

Thanks for your post.
We are checking this issue and will update you later.

We assume you are using this prototxt file. Is that correct?
https://github.com/KaimingHe/deep-residual-networks/blob/master/prototxt/ResNet-50-deploy.prototxt

Thanks.

Yes, I’m using that one. Thanks.

Hi,

We can successfully run ResNet-50-deploy.prototxt with DLA on JetPack 4.1.
Could you flash your device with the new JetPack and try again?
https://developer.nvidia.com/embedded/jetpack-notes

By the way, with the allowGPUFallback flag, TensorRT automatically uses the GPU to run inference for any layer that the DLA does not support.
Thanks.
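
To check which layers actually fell back to the GPU, the giexec output can be scanned for the fallback message that appears in the first log. The following is only a minimal sketch: it assumes the message keeps the form "layer <name> is not running on DLA, falling back to GPU", and the function name find_gpu_fallback_layers is hypothetical, not part of TensorRT.

```python
import re

# Pattern modelled on the fallback line in the giexec log above.
# The exact wording is an assumption and may differ between TensorRT versions.
FALLBACK_RE = re.compile(r"layer (\S+) is not running on DLA, falling back to GPU")


def find_gpu_fallback_layers(log_text):
    """Return the names of layers reported as falling back to the GPU, in log order."""
    return FALLBACK_RE.findall(log_text)


# A few lines copied from the first log in this thread.
sample_log = """Output "prob": 1000x1x1
Default DLA is enabled but layer prob is not running on DLA, falling back to GPU.
name=data, bindingIndex=0, buffers.size()=2
"""

print(find_gpu_fallback_layers(sample_log))
```

For the first log in this thread, the only layer reported this way is prob.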