Trtexec Segmentation fault with exportProfile and useDLAcore enabled on Xavier NX

Description

Hi,
I am getting a segmentation fault when running trtexec with the exportProfile, useDLAcore, allowGPUFallback, and FP16 options enabled. It executes part of the run and then I get the error (see the log below).
When I run the same model without the useDLAcore and allowGPUFallback options, there is no issue. And when I run with exportProfile disabled and useDLAcore and allowGPUFallback enabled, I don’t get a segmentation fault either.
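
For anyone trying to reproduce this, the three configurations described above can be written out as a small Python sketch that only builds the trtexec command lines. This is an illustration, not the exact commands used in the post: `model.onnx` and `profile.json` are placeholder names, DLA core 0 is assumed, and it is assumed the model is passed with `--onnx`.

```python
import shlex


def build_trtexec_cmd(onnx_path, use_dla=False, export_profile=None, verbose=False):
    """Return the trtexec argument list for one configuration (FP16 always on)."""
    cmd = ["trtexec", f"--onnx={onnx_path}", "--fp16"]
    if use_dla:
        # Assumption: DLA core 0; GPU fallback for layers the DLA cannot run.
        cmd += ["--useDLACore=0", "--allowGPUFallback"]
    if export_profile:
        # Write the per-layer profile to a JSON file.
        cmd.append(f"--exportProfile={export_profile}")
    if verbose:
        cmd.append("--verbose")
    return cmd


# The three cases from the description; file names are placeholders.
CONFIGS = {
    "DLA + exportProfile (reported to segfault)": dict(use_dla=True, export_profile="profile.json"),
    "GPU only + exportProfile (reported OK)": dict(use_dla=False, export_profile="profile.json"),
    "DLA without exportProfile (reported OK)": dict(use_dla=True, export_profile=None),
}

if __name__ == "__main__":
    for name, opts in CONFIGS.items():
        cmd = build_trtexec_cmd("model.onnx", **opts)
        print(name + ":", " ".join(shlex.quote(arg) for arg in cmd))
```

Running the printed commands one at a time on the device should show which flag combination triggers the crash.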

Environment

TensorRT Version: 7.1.30
GPU Type: Xavier NX
Nvidia Driver Version: L4T 32.4.3
CUDA Version: 10.2.89
CUDNN Version: 8.0.0.180
Jetpack version: 4.4

Please include:

  • Full traceback of errors encountered
    [10/13/2020-18:37:51] [V] [TRT] Layer(FinishNvmRegion): {BatchNormalization_153,Relu_154} output to be reformatted 0 finish, Tactic: 0, {BatchNormalization_153,Relu_154} output to be reformatted 0[Half(256,60,80)] ->
    [10/13/2020-18:37:51] [V] [TRT] Layer(gemmDeconvolution): ConvTranspose_155, Tactic: 0, 487[Half(256,60,80)] -> 488[Half(128,120,160)]
    [10/13/2020-18:37:51] [V] [TRT] Layer(Reformat): {BatchNormalization_156,Relu_157,Conv_158} input reformatter 0, Tactic: 1002, 488[Half(128,120,160)] -> {BatchNormalization_156,Relu_157,Conv_158} reformatted input 0[Half(128,120,160)]
    [10/13/2020-18:37:51] [V] [TRT] Layer(DLANative): {BatchNormalization_156,Relu_157,Conv_158}, Tactic: 549275190019, {BatchNormalization_156,Relu_157,Conv_158} reformatted input 0[Half(128,120,160)] -> output copy[Half(21,120,160)]
    [10/13/2020-18:37:51] [V] [TRT] Layer(FinishNvmRegion): {BatchNormalization_156,Relu_157,Conv_158} reformatted input 0 finish, Tactic: 0, {BatchNormalization_156,Relu_157,Conv_158} reformatted input 0[Half(128,120,160)] ->
    [10/13/2020-18:37:51] [V] [TRT] Layer(Reformat): output from nvm, Tactic: 1002, output copy[Half(21,120,160)] -> output[Float(21,120,160)]
    [10/13/2020-18:37:51] [V] [TRT] Layer(FinishNvmRegion): output copy finish, Tactic: 0, output copy[Half(21,120,160)] ->
    [10/13/2020-18:37:51] [I] Starting inference threads
    [10/13/2020-18:37:54] [I] Warmup completed 0 queries over 200 ms
    [10/13/2020-18:37:54] [I] Timing trace has 0 queries over 3.08449 s
    [10/13/2020-18:37:54] [I] Trace averages of 25000 runs:
    [10/13/2020-18:37:54] [I] Host Latency
    [10/13/2020-18:37:54] [I] min: 37.3453 ms (end to end 37.3556 ms)
    [10/13/2020-18:37:54] [I] max: 38.8215 ms (end to end 38.8329 ms)
    [10/13/2020-18:37:54] [I] mean: 37.5965 ms (end to end 37.6148 ms)
    [10/13/2020-18:37:54] [I] median: 37.4478 ms (end to end 37.4585 ms)
    [10/13/2020-18:37:54] [I] percentile: 38.8215 ms at 99% (end to end 38.8329 ms at 99%)
    [10/13/2020-18:37:54] [I] throughput: 0 qps
    [10/13/2020-18:37:54] [I] walltime: 3.08449 s
    [10/13/2020-18:37:54] [I] Enqueue Time
    [10/13/2020-18:37:54] [I] min: 3.24341 ms
    [10/13/2020-18:37:54] [I] max: 13.7832 ms
    [10/13/2020-18:37:54] [I] median: 3.96033 ms
    [10/13/2020-18:37:54] [I] GPU Compute
    [10/13/2020-18:37:54] [I] min: 37.1077 ms
    [10/13/2020-18:37:54] [I] max: 38.5801 ms
    [10/13/2020-18:37:54] [I] mean: 37.3572 ms
    [10/13/2020-18:37:54] [I] median: 37.2079 ms
    [10/13/2020-18:37:54] [I] percentile: 38.5801 ms at 99%
    [10/13/2020-18:37:54] [I] total compute time: 3.06329 s
    Segmentation fault (core dumped)

Hi @prashantmaheshwari94,
Could you please share your ONNX model so that we can assist you better?

Thanks!

Hi Aakanksha,

I cannot share the model, as I am bound by a company agreement.

Could the error be related to a memory issue?

Hi @prashantmaheshwari94,
Can you share the logs generated with the --verbose flag in the trtexec command?

Thanks!

Hi @AakankshaS

Please find attached the log generated with --verbose.
fail_log.txt (290.7 KB)

Hi @prashantmaheshwari94,
Is this the complete log file?
I couldn't see any warnings or errors there.
Can you please check?

Thanks!
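
One way to check a verbose log for warnings or errors is sketched below in Python. It assumes the log follows the format shown in the traceback above, with a timestamp followed by a severity tag such as `[V]` or `[I]`, and that warnings and errors use `[W]` and `[E]` in the same position. Note that the "Segmentation fault (core dumped)" line is normally printed by the shell rather than by trtexec itself, so it may be missing from a log captured by redirecting trtexec's output.

```python
import re
import sys

# Matches lines such as "[10/13/2020-18:37:54] [I] GPU Compute" and captures
# the one-letter severity tag that follows the timestamp.
SEVERITY = re.compile(r"^\s*\[[^\]]+\]\s+\[([A-Z])\]")


def summarize_log(lines):
    """Count lines per severity tag and collect warning, error, and crash lines."""
    counts = {}
    flagged = []
    for line in lines:
        match = SEVERITY.match(line)
        if match:
            tag = match.group(1)
            counts[tag] = counts.get(tag, 0) + 1
            if tag in ("W", "E"):  # assumed tags for warnings and errors
                flagged.append(line.rstrip())
        elif "Segmentation fault" in line:
            flagged.append(line.rstrip())
    return counts, flagged


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_log.py fail_log.txt
    with open(sys.argv[1], errors="replace") as handle:
        counts, flagged = summarize_log(handle)
    print("lines per severity:", counts)
    for line in flagged:
        print(line)
```

If the scan reports no `[W]` or `[E]` lines, as appears to be the case here, the crash is probably happening after the last message trtexec logged.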