Running YOLOX with ReLU activation on DLA gives incorrect results

I am trying to run YOLOX inference on an AGX 32GB device. If I build the engine for GPU with FP16, it works perfectly, but if I run the engine on the DLA, the results are incorrect.

I have also tried running YOLOv5 with LeakyReLU on the DLA, and the results are correct.

I use ONNX to generate the TensorRT engine file (see the sketch below for the general build path).

My environment:
JetPack 4.4
TensorRT 7.1.3
CUDA 10.2
Input shape: 1x832x480x3

P.S. I added a transpose and a divide-by-255 operation to the graph.
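
For readers following along, here is a minimal sketch of what such an ONNX-to-DLA engine build typically looks like with the TensorRT Python API; the function name, file paths, and workspace size are placeholders for illustration, not the exact build code used here.

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)

def build_dla_engine(onnx_path, engine_path):
    builder = trt.Builder(TRT_LOGGER)
    # Explicit-batch network, as required by the ONNX parser
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            return None

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30            # 1 GiB, adjust as needed
    config.set_flag(trt.BuilderFlag.FP16)          # DLA only runs FP16 or INT8
    config.set_flag(trt.BuilderFlag.GPU_FALLBACK)  # run unsupported layers on GPU
    config.default_device_type = trt.DeviceType.DLA
    config.DLA_core = 0

    engine = builder.build_engine(network, config)
    if engine is not None:
        with open(engine_path, "wb") as f:
            f.write(engine.serialize())
    return engine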

Hi,

Would you mind upgrading your software to the latest version and trying again?
Thanks.

Sorry for the late reply. I have tried JetPack 4.6.1.

This time I failed to generate an engine that runs on the DLA core. The following error showed up during the engine-building process:

DEBUG: --------------- Timing Runner: {ForeignNode[Concat_195...Conv_279]} (DLA)
Module_id 33 Severity 2 : NVMEDIA_DLA 684
Module_id 33 Severity 2 : Failed to bind input tensor. err : 0x00000b
Module_id 33 Severity 2 : NVMEDIA_DLA 2866
Module_id 33 Severity 2 : Failed to bind input tensor args. status:  0x000007
DEBUG: Deleting timing cache: 164 entries, 10 hits
INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1055, GPU 6417 (MiB)
ERROR: 1: [nvdlaUtils.cpp::submit::198] Error Code 1: DLA (Failure to submit program to DLA engine.)

Hi,

Based on the error, it seems there is some issue in the implementation.
Could you try the model with trtexec to see if it works?

$ /usr/src/tensorrt/bin/trtexec --onnx=[/path/to/the/model] --useDLACore=0
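
If the plain command works, it may also be worth repeating the run with FP16 and GPU fallback enabled, since DLA only executes FP16/INT8 layers and unsupported layers need to fall back to the GPU (these are standard trtexec options):

$ /usr/src/tensorrt/bin/trtexec --onnx=[/path/to/the/model] --useDLACore=0 --fp16 --allowGPUFallback --verbose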

Thanks.

trtexec worked. Maybe there is some misconfiguration in my builder. I will check.

Hi,

Just for your reference, you can find the source of trtexec below:

Thanks.
