When I try to create the engine file for DLA using the command below:
trtexec --onnx=./test_mul.onnx --explicitBatch --workspace=1024 --saveEngine=./test_mul_fp16.trt --verbose --fp16 --useDLACore=0 --allowGPUFallback
the multiplication layer in the ONNX model falls back to the GPU instead of running on DLA, with this warning: "DLA allows only same dimensions inputs to Elementwise".
How can we make the multiplication layer in this model run on DLA?
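One possible workaround, sketched below under the assumption that the model is exported from PyTorch: give both inputs of the Mul identical dimensions so no broadcast is needed. Here the scale is stored at the full input shape instead of a broadcastable shape like (1, C, 1, 1). The module name `MulSameShape` and the fixed input shape are illustrative assumptions, not part of the original model.

```python
import torch
import torch.nn as nn

class MulSameShape(nn.Module):
    """Hypothetical sketch: elementwise multiply without broadcasting.

    The scale parameter is created at the full input shape, so the
    exported ONNX Mul node sees two inputs of identical dimensions,
    which is what DLA's Elementwise layer requires.
    """
    def __init__(self, shape):
        super().__init__()
        # full-shape scale (e.g. (1, 3, 32, 32)) instead of (1, 3, 1, 1)
        self.scale = nn.Parameter(torch.ones(*shape))

    def forward(self, x):
        # no implicit broadcast: both operands already have the same shape
        return x * self.scale

if __name__ == "__main__":
    shape = (1, 3, 32, 32)  # assumed input shape for illustration
    model = MulSameShape(shape).eval()
    dummy = torch.randn(*shape)
    # export with a fixed shape so TensorRT sees same-dimension Mul inputs
    torch.onnx.export(model, dummy, "test_mul_same_shape.onnx",
                      opset_version=11)
```

The trade-off is extra parameter memory (the scale is materialized at full size), but it removes the broadcast that triggers the GPU fallback.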
Environment
TensorRT Version: 7.1.3
GPU Type: Xavier
Nvidia Driver Version: Package: nvidia-jetpack, Version: 4.4
CUDA Version: 10.2
CUDNN Version: 8.0
Operating System + Version: Ubuntu 18.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):
Thanks, but my problem is that the multiplication operator runs fine on the GPU; when I try to run it on DLA, it automatically falls back to the GPU, saying that pointwise multiplication with broadcast is not supported on DLA. Is this issue fixed in the latest version of TensorRT, or is it still a limitation of DLA?
Okay, thanks for the reply. I enabled that flag and the layer ran on the GPU. However, this broadcasting limitation isn't mentioned anywhere in the DLA documentation, right?