Hi,
Please share the ONNX model and the script, if you have not already, so that we can assist you better.
In the meantime, you can try a few things:
1) Validate your model with the snippet below (check_model.py):
import onnx

filename = yourONNXmodel  # replace with the path to your .onnx file
model = onnx.load(filename)
onnx.checker.check_model(model)
2) Try running your model with the trtexec command.
If you are still facing the issue, please share the trtexec --verbose log for further debugging.
Thanks!
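As a sketch, a typical trtexec invocation looks like the following (the model filename is illustrative; adjust flags such as --fp16 to match the precision you are benchmarking):

```shell
# Build and time the engine from an ONNX model, with verbose logging.
# "model.onnx" is a placeholder path, not a file from this thread.
trtexec --onnx=model.onnx --fp16 --verbose
```

Redirect the output to a file if you want to attach the full log to your reply.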
The models were checked with your script and all passed, and trtexec did not show any errors or warnings with the --verbose flag.
I wonder why trtexec performance gets lower after pruning. Thanks.
What does “prune” mean here? Reducing the number of channels in some convolution layers? If so, this behavior is sometimes expected: TensorCore requires the channel dimensions to be padded to multiples of 8 (for FP16) or 32 (for INT8). If the pruned model does not have these “nice” channel counts, extra padding is required and performance may drop.
The guideline: when doing channel pruning, prune the channel counts to multiples of 32.
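The padding effect above is easy to see with a little arithmetic. The helper below is purely illustrative (it is not a TensorRT API); it rounds a channel count up to the alignment multiple, showing why pruning 64 channels down to, say, 60 saves nothing on INT8 TensorCores:

```python
import math

def padded_channels(c: int, multiple: int) -> int:
    """Round a channel count up to the TensorCore alignment multiple
    (8 for FP16, 32 for INT8). Illustrative helper, not a TensorRT API."""
    return math.ceil(c / multiple) * multiple

# INT8 pads channels to multiples of 32: 60 and 33 both pad back up to 64,
# so only pruning to an exact multiple (e.g. 32) reduces the padded width.
for c in (64, 60, 33, 32):
    print(f"{c} channels -> padded to {padded_channels(c, 32)}")
```

This is why the guideline says to prune to multiples of 32: any remainder is padded away, so the compute cost does not shrink.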