Description
Hi,
I am getting a segmentation fault when running trtexec with the exportProfile, useDLACore, allowGPUFallback, and FP16 options enabled. It gets as far as printing the performance summary and then crashes (see the log below).
When I run the same model without the useDLACore and allowGPUFallback options, there is no issue. Likewise, when I run with exportProfile disabled and useDLACore and allowGPUFallback enabled, I don't get a segmentation fault.
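To make the three configurations described above easier to reproduce, here is a minimal sketch that builds the corresponding trtexec command lines. The exact commands were not included in this post, so several details are assumptions: the model file name `model.onnx` and the profile file name `profile.json` are hypothetical, the model is assumed to be in ONNX format, DLA core 0 is assumed, and the trtexec path is the usual JetPack location.

```python
import shlex

# Assumption: usual trtexec location on JetPack; adjust if yours differs.
TRTEXEC = "/usr/src/tensorrt/bin/trtexec"
# Hypothetical model file; the original post does not name the model.
MODEL = "model.onnx"


def build_cmd(use_dla, export_profile):
    """Return the trtexec argument list for one flag combination."""
    cmd = [TRTEXEC, "--onnx=" + MODEL, "--fp16", "--verbose"]
    if use_dla:
        # Assumption: DLA core 0 is used.
        cmd += ["--useDLACore=0", "--allowGPUFallback"]
    if export_profile:
        # Hypothetical output file name for the per-layer profile.
        cmd.append("--exportProfile=profile.json")
    return cmd


# The three combinations described in the post.
CASES = {
    "DLA + GPU fallback + exportProfile (segfault reported)": build_cmd(True, True),
    "GPU only + exportProfile (no issue reported)": build_cmd(False, True),
    "DLA + GPU fallback, no exportProfile (no issue reported)": build_cmd(True, False),
}

for label, cmd in CASES.items():
    print("# " + label)
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Running each printed command in turn should show whether the crash only appears when exportProfile is combined with the DLA options.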
Environment
TensorRT Version: 7.1.30
GPU Type: Xavier NX
Nvidia Driver Version: L4T 32.4.3
CUDA Version: 10.2.89
CUDNN Version: 8.0.0.180
Jetpack version: 4.4
Please include:
- Full traceback of errors encountered
[10/13/2020-18:37:51] [V] [TRT] Layer(FinishNvmRegion): {BatchNormalization_153,Relu_154} output to be reformatted 0 finish, Tactic: 0, {BatchNormalization_153,Relu_154} output to be reformatted 0[Half(256,60,80)] →
[10/13/2020-18:37:51] [V] [TRT] Layer(gemmDeconvolution): ConvTranspose_155, Tactic: 0, 487[Half(256,60,80)] → 488[Half(128,120,160)]
[10/13/2020-18:37:51] [V] [TRT] Layer(Reformat): {BatchNormalization_156,Relu_157,Conv_158} input reformatter 0, Tactic: 1002, 488[Half(128,120,160)] → {BatchNormalization_156,Relu_157,Conv_158} reformatted input 0[Half(128,120,160)]
[10/13/2020-18:37:51] [V] [TRT] Layer(DLANative): {BatchNormalization_156,Relu_157,Conv_158}, Tactic: 549275190019, {BatchNormalization_156,Relu_157,Conv_158} reformatted input 0[Half(128,120,160)] → output copy[Half(21,120,160)]
[10/13/2020-18:37:51] [V] [TRT] Layer(FinishNvmRegion): {BatchNormalization_156,Relu_157,Conv_158} reformatted input 0 finish, Tactic: 0, {BatchNormalization_156,Relu_157,Conv_158} reformatted input 0[Half(128,120,160)] →
[10/13/2020-18:37:51] [V] [TRT] Layer(Reformat): output from nvm, Tactic: 1002, output copy[Half(21,120,160)] → output[Float(21,120,160)]
[10/13/2020-18:37:51] [V] [TRT] Layer(FinishNvmRegion): output copy finish, Tactic: 0, output copy[Half(21,120,160)] →
[10/13/2020-18:37:51] [I] Starting inference threads
[10/13/2020-18:37:54] [I] Warmup completed 0 queries over 200 ms
[10/13/2020-18:37:54] [I] Timing trace has 0 queries over 3.08449 s
[10/13/2020-18:37:54] [I] Trace averages of 25000 runs:
[10/13/2020-18:37:54] [I] Host Latency
[10/13/2020-18:37:54] [I] min: 37.3453 ms (end to end 37.3556 ms)
[10/13/2020-18:37:54] [I] max: 38.8215 ms (end to end 38.8329 ms)
[10/13/2020-18:37:54] [I] mean: 37.5965 ms (end to end 37.6148 ms)
[10/13/2020-18:37:54] [I] median: 37.4478 ms (end to end 37.4585 ms)
[10/13/2020-18:37:54] [I] percentile: 38.8215 ms at 99% (end to end 38.8329 ms at 99%)
[10/13/2020-18:37:54] [I] throughput: 0 qps
[10/13/2020-18:37:54] [I] walltime: 3.08449 s
[10/13/2020-18:37:54] [I] Enqueue Time
[10/13/2020-18:37:54] [I] min: 3.24341 ms
[10/13/2020-18:37:54] [I] max: 13.7832 ms
[10/13/2020-18:37:54] [I] median: 3.96033 ms
[10/13/2020-18:37:54] [I] GPU Compute
[10/13/2020-18:37:54] [I] min: 37.1077 ms
[10/13/2020-18:37:54] [I] max: 38.5801 ms
[10/13/2020-18:37:54] [I] mean: 37.3572 ms
[10/13/2020-18:37:54] [I] median: 37.2079 ms
[10/13/2020-18:37:54] [I] percentile: 38.5801 ms at 99%
[10/13/2020-18:37:54] [I] total compute time: 3.06329 s
Segmentation fault (core dumped)