PyTorch model to TensorRT engine conversion sometimes crashes

When I try to convert a PyTorch model to a TensorRT engine, the program sometimes crashes. Other times the conversion succeeds and the engine produces correct results. The crash occurs at m_Builder->buildEngineWithConfig.
Environment: Windows 10, TensorRT 6.0.1.5, CUDA 10.0, cuDNN 7.6, GPU: TITAN X (12 GB)
[
Applying generic optimizations to the graph for inference.
Original: 260 layers
After dead-layer removal: 260 layers
Fusing convolution weights from (Unnamed Layer* 0) [Convolution] with scale (Unnamed Layer* 1) [Scale]
Fusing convolution weights from (Unnamed Layer* 4) [Convolution] with scale (Unnamed Layer* 5) [Scale]
Fusing convolution weights from (Unnamed Layer* 7) [Convolution] with scale (Unnamed Layer* 8) [Scale]
Fusing convolution weights from (Unnamed Layer* 10) [Convolution] with scale (Unnamed Layer* 11) [Scale]
Fusing convolution weights from (Unnamed Layer* 12) [Convolution] with scale (Unnamed Layer* 13) [Scale]
Fusing convolution weights from (Unnamed Layer* 16) [Convolution] with scale (Unnamed Layer* 17) [Scale]
Fusing convolution weights from (Unnamed Layer* 19) [Convolution] with scale (Unnamed Layer* 20) [Scale]
Fusing convolution weights from (Unnamed Layer* 22) [Convolution] with scale (Unnamed Layer* 23) [Scale]
Fusing convolution weights from (Unnamed Layer* 26) [Convolution] with scale (Unnamed Layer* 27) [Scale]
Fusing convolution weights from (Unnamed Layer* 29) [Convolution] with scale (Unnamed Layer* 30) [Scale]
Fusing convolution weights from (Unnamed Layer* 32) [Convolution] with scale (Unnamed Layer* 33) [Scale]
Fusing convolution weights from (Unnamed Layer* 36) [Convolution] with scale (Unnamed Layer* 37) [Scale]
Fusing convolution weights from (Unnamed Layer* 39) [Convolution] with scale (Unnamed Layer* 40) [Scale]
Fusing convolution weights from (Unnamed Layer* 42) [Convolution] with scale (Unnamed Layer* 43) [Scale]
Fusing convolution weights from (Unnamed Layer* 44) [Convolution] with scale (Unnamed Layer* 45) [Scale]
Fusing convolution weights from (Unnamed Layer* 48) [Convolution] with scale (Unnamed Layer* 49) [Scale]
Fusing convolution weights from (Unnamed Layer* 51) [Convolution] with scale (Unnamed Layer* 52) [Scale]
Fusing convolution weights from (Unnamed Layer* 54) [Convolution] with scale (Unnamed Layer* 55) [Scale]
Fusing convolution weights from (Unnamed Layer* 58) [Convolution] with scale (Unnamed Layer* 59) [Scale]
Fusing convolution weights from (Unnamed Layer* 61) [Convolution] with scale (Unnamed Layer* 62) [Scale]
Fusing convolution weights from (Unnamed Layer* 64) [Convolution] with scale (Unnamed Layer* 65) [Scale]
Fusing convolution weights from (Unnamed Layer* 68) [Convolution] with scale (Unnamed Layer* 69) [Scale]
Fusing convolution weights from (Unnamed Layer* 71) [Convolution] with scale (Unnamed Layer* 72) [Scale]
Fusing convolution weights from (Unnamed Layer* 74) [Convolution] with scale (Unnamed Layer* 75) [Scale]
Fusing convolution weights from (Unnamed Layer* 78) [Convolution] with scale (Unnamed Layer* 79) [Scale]
Fusing convolution weights from (Unnamed Layer* 81) [Convolution] with scale (Unnamed Layer* 82) [Scale]
Fusing convolution weights from (Unnamed Layer* 84) [Convolution] with scale (Unnamed Layer* 85) [Scale]
Fusing convolution weights from (Unnamed Layer* 86) [Convolution] with scale (Unnamed Layer* 87) [Scale]
Fusing convolution weights from (Unnamed Layer* 90) [Convolution] with scale (Unnamed Layer* 91) [Scale]
Fusing convolution weights from (Unnamed Layer* 93) [Convolution] with scale (Unnamed Layer* 94) [Scale]
Fusing convolution weights from (Unnamed Layer* 96) [Convolution] with scale (Unnamed Layer* 97) [Scale]
Fusing convolution weights from (Unnamed Layer* 100) [Convolution] with scale (Unnamed Layer* 101) [Scale]
Fusing convolution weights from (Unnamed Layer* 103) [Convolution] with scale (Unnamed Layer* 104) [Scale]
Fusing convolution weights from (Unnamed Layer* 106) [Convolution] with scale (Unnamed Layer* 107) [Scale]
Fusing convolution weights from (Unnamed Layer* 110) [Convolution] with scale (Unnamed Layer* 111) [Scale]
Fusing convolution weights from (Unnamed Layer* 113) [Convolution] with scale (Unnamed Layer* 114) [Scale]
Fusing convolution weights from (Unnamed Layer* 116) [Convolution] with scale (Unnamed Layer* 117) [Scale]
Fusing convolution weights from (Unnamed Layer* 120) [Convolution] with scale (Unnamed Layer* 121) [Scale]
Fusing convolution weights from (Unnamed Layer* 123) [Convolution] with scale (Unnamed Layer* 124) [Scale]
Fusing convolution weights from (Unnamed Layer* 126) [Convolution] with scale (Unnamed Layer* 127) [Scale]
Fusing convolution weights from (Unnamed Layer* 130) [Convolution] with scale (Unnamed Layer* 131) [Scale]
Fusing convolution weights from (Unnamed Layer* 133) [Convolution] with scale (Unnamed Layer* 134) [Scale]
Fusing convolution weights from (Unnamed Layer* 136) [Convolution] with scale (Unnamed Layer* 137) [Scale]
Fusing convolution weights from (Unnamed Layer* 140) [Convolution] with scale (Unnamed Layer* 141) [Scale]
Fusing convolution weights from (Unnamed Layer* 143) [Convolution] with scale (Unnamed Layer* 144) [Scale]
Fusing convolution weights from (Unnamed Layer* 146) [Convolution] with scale (Unnamed Layer* 147) [Scale]
Fusing convolution weights from (Unnamed Layer* 148) [Convolution] with scale (Unnamed Layer* 149) [Scale]
Fusing convolution weights from (Unnamed Layer* 152) [Convolution] with scale (Unnamed Layer* 153) [Scale]
Fusing convolution weights from (Unnamed Layer* 155) [Convolution] with scale (Unnamed Layer* 156) [Scale]
Fusing convolution weights from (Unnamed Layer* 158) [Convolution] with scale (Unnamed Layer* 159) [Scale]
Fusing convolution weights from (Unnamed Layer* 162) [Convolution] with scale (Unnamed Layer* 163) [Scale]
Fusing convolution weights from (Unnamed Layer* 165) [Convolution] with scale (Unnamed Layer* 166) [Scale]
Fusing convolution weights from (Unnamed Layer* 168) [Convolution] with scale (Unnamed Layer* 169) [Scale]
After scale fusion: 207 layers
Fusing (Unnamed Layer* 0) [Convolution] with (Unnamed Layer* 2) [Activation]
Fusing (Unnamed Layer* 4) [Convolution] with (Unnamed Layer* 6) [Activation]
Fusing (Unnamed Layer* 7) [Convolution] with (Unnamed Layer* 9) [Activation]
Fusing (Unnamed Layer* 10) [Convolution] with (Unnamed Layer* 14) [ElementWise]
Fusing (Unnamed Layer* 10) [Convolution] + (Unnamed Layer* 14) [ElementWise] with (Unnamed Layer* 15) [Activation]
Fusing (Unnamed Layer* 16) [Convolution] with (Unnamed Layer* 18) [Activation]
Fusing (Unnamed Layer* 19) [Convolution] with (Unnamed Layer* 21) [Activation]
Fusing (Unnamed Layer* 22) [Convolution] with (Unnamed Layer* 24) [ElementWise]
Fusing (Unnamed Layer* 22) [Convolution] + (Unnamed Layer* 24) [ElementWise] with (Unnamed Layer* 25) [Activation]
Fusing (Unnamed Layer* 26) [Convolution] with (Unnamed Layer* 28) [Activation]
Fusing (Unnamed Layer* 29) [Convolution] with (Unnamed Layer* 31) [Activation]
Fusing (Unnamed Layer* 32) [Convolution] with (Unnamed Layer* 34) [ElementWise]
Fusing (Unnamed Layer* 32) [Convolution] + (Unnamed Layer* 34) [ElementWise] with (Unnamed Layer* 35) [Activation]
Fusing (Unnamed Layer* 36) [Convolution] with (Unnamed Layer* 38) [Activation]
Fusing (Unnamed Layer* 39) [Convolution] with (Unnamed Layer* 41) [Activation]
Fusing (Unnamed Layer* 42) [Convolution] with (Unnamed Layer* 46) [ElementWise]
Fusing (Unnamed Layer* 42) [Convolution] + (Unnamed Layer* 46) [ElementWise] with (Unnamed Layer* 47) [Activation]
Fusing (Unnamed Layer* 48) [Convolution] with (Unnamed Layer* 50) [Activation]
Fusing (Unnamed Layer* 51) [Convolution] with (Unnamed Layer* 53) [Activation]
Fusing (Unnamed Layer* 54) [Convolution] with (Unnamed Layer* 56) [ElementWise]
Fusing (Unnamed Layer* 54) [Convolution] + (Unnamed Layer* 56) [ElementWise] with (Unnamed Layer* 57) [Activation]
Fusing (Unnamed Layer* 58) [Convolution] with (Unnamed Layer* 60) [Activation]
]
and then the program crashes!
I can't debug it.

Hi, please share the ONNX model and the script so that we can assist you better.

In the meantime, you can try validating your model with the snippet below:

check_model.py

import sys
import onnx

# Replace the placeholder with the path to your ONNX model
filename = "yourONNXmodel"
model = onnx.load(filename)
# Raises an exception if the model is malformed
onnx.checker.check_model(model)

Alternatively, you can try running your model with the trtexec command.
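
For reference, a trtexec run can be scripted roughly as in the sketch below. It assumes that trtexec is on the PATH and that the model has already been exported to ONNX. The file names model.onnx and model.engine are placeholders, and build_trtexec_cmd and run_trtexec are hypothetical helper names, not part of TensorRT. Only the --onnx, --saveEngine and --verbose options of trtexec are used.

```python
import shutil
import subprocess


def build_trtexec_cmd(onnx_path, engine_path=None, verbose=True):
    """Assemble a trtexec command line (hypothetical helper, not a TensorRT API)."""
    cmd = ["trtexec", f"--onnx={onnx_path}"]
    if engine_path:
        # Ask trtexec to write the serialized engine to this path
        cmd.append(f"--saveEngine={engine_path}")
    if verbose:
        # Verbose logging shows the last build step reached before a failure
        cmd.append("--verbose")
    return cmd


def run_trtexec(onnx_path, engine_path=None):
    """Run trtexec if it is installed; return its exit code, or None if it is missing."""
    if shutil.which("trtexec") is None:
        return None
    result = subprocess.run(build_trtexec_cmd(onnx_path, engine_path))
    return result.returncode


if __name__ == "__main__":
    # Placeholder file names; replace with your own paths
    print(" ".join(build_trtexec_cmd("model.onnx", "model.engine")))
```

A non-zero exit code from run_trtexec on some runs but not others would suggest the intermittent failure is reproducible outside the custom converter.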

Thanks!

Hi @501267319,

You can convert a PyTorch model to TensorRT via ONNX. We recommend using the latest TensorRT version.
For your reference:
PyTorch to ONNX
https://pytorch.org/tutorials/advanced/super_resolution_with_onnxruntime.html
onnx-tensorrt

Thank you.

I do not use an ONNX model. My PyTorch model is a weights file, and I use the TensorRT API to construct the network.
Model: retinanet.wts - Google Drive
Engine converter code: engine_shift_noqt.rar - Google Drive
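
Since the network is built directly from a weights file, one thing that may be worth ruling out is a malformed or non-finite entry in that file. The sketch below is only an illustration and rests on an assumption: that the .wts file uses the common plain-text layout where the first line is the number of tensors and each following line is "name count hex hex ...", with every hex value holding the bits of a 32-bit float. The thread does not confirm this layout, and parse_wts and find_non_finite are hypothetical helper names.

```python
import math
import struct


def parse_wts(text):
    """Parse weights text, ASSUMING this layout (not confirmed by the thread):
    line 1: number of tensors; then one line per tensor: <name> <count> <hex> ...
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    expected = int(lines[0])
    weights = {}
    for ln in lines[1:]:
        name, count, *hex_vals = ln.split()
        if len(hex_vals) != int(count):
            raise ValueError(f"{name}: expected {count} values, found {len(hex_vals)}")
        # Reinterpret each 32-bit hex pattern as an IEEE-754 float
        weights[name] = [
            struct.unpack(">f", struct.pack(">I", int(h, 16)))[0] for h in hex_vals
        ]
    if len(weights) != expected:
        raise ValueError(f"expected {expected} tensors, found {len(weights)}")
    return weights


def find_non_finite(weights):
    """Return the names of tensors that contain NaN or infinite values."""
    return [
        name
        for name, vals in weights.items()
        if not all(math.isfinite(v) for v in vals)
    ]
```

Under that assumption, something like find_non_finite(parse_wts(open("retinanet.wts").read())) would list any tensors with NaN or Inf values, and a ValueError would point to a truncated or inconsistent line.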

I debugged the error with Visual Studio. If I ignore the error when it occurs, I still get the TensorRT engine, and it produces correct results. I don't know whether it is safe to ignore the error. Thank you for your help. Looking forward to your reply!

Hi @501267319,

Could you please try the latest TensorRT 7.x release?

Thank you.

Thank you for your reply. The same problem occurs in TensorRT 7.2. What should I do?

Hi @501267319,

Please allow us some time to work on this issue.

Thank you.