Error with PyTorch model without BN fusing when running QAT

I am following this guide for YOLOv7 QAT.

In this repo, the BN layers are fused into the Conv layers before PTQ. I don’t want to fuse the BN layers into the Conv layers, so I commented out two lines, but when running I got this error:

Traceback (most recent call last):
  File "scripts/", line 338, in <module>
  File "scripts/", line 179, in cmd_quantize
    quantize.apply_custom_rules_to_quantizer(model, export_onnx)
  File "/yolov7_custom_dataset/quantization/", line 222, in apply_custom_rules_to_quantizer
    export_onnx(model, "quantization-custom-rules-temp.onnx")
  File "scripts/", line 138, in export_onnx
    quantize.export_onnx(model, dummy, file, opset_version=13, 
  File "/yolov7_custom_dataset/quantization/", line 394, in export_onnx
    torch.onnx.export(model, input, file, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/onnx/", line 506, in export
  File "/usr/local/lib/python3.8/dist-packages/torch/onnx/", line 1548, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "/usr/local/lib/python3.8/dist-packages/torch/onnx/", line 1180, in _model_to_graph
    params_dict = _C._jit_pass_onnx_constant_fold(
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)

Do you have any suggestions for me? This happened when converting to ONNX.
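
Since the error complains about tensors on both cuda:0 and cpu, one way to narrow it down is to list which parameters and buffers live on which device just before the export call. The snippet below is only a minimal sketch: `tensors_by_device` is a hypothetical helper (not part of the repo or of PyTorch), and the toy model stands in for the real YOLOv7 model. It only sees registered parameters and buffers, so a tensor stored as a plain Python attribute would not show up here.

```python
from collections import defaultdict

import torch.nn as nn


def tensors_by_device(model: nn.Module) -> dict:
    """Group the names of all registered parameters and buffers by device.

    Hypothetical helper for debugging; not part of the repo or of PyTorch.
    """
    found = defaultdict(list)
    for name, param in model.named_parameters():
        found[str(param.device)].append(name)
    for name, buf in model.named_buffers():
        found[str(buf.device)].append(name)
    return dict(found)


# Toy stand-in for the real model (assumption: yours is also an nn.Module).
toy = nn.Sequential(nn.Conv2d(3, 4, kernel_size=1), nn.BatchNorm2d(4))

report = tensors_by_device(toy)
for device, names in report.items():
    # More than one device printed here means the model is split across devices.
    print(device, len(names), names)
```

If more than one device shows up for the real model, moving the model and the dummy input to the same device before calling the export may be worth trying.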


We are moving this post to the DeepStream forum to get better help.

Thank you.

Why don’t you want to fuse the BN layers into the Conv layers?


Thanks for the response.

As I mentioned above, in this repo the BN layers are fused into the Conv layers with model.fuse() before PTQ, and fine-tuning is done after PTQ. I want to keep the statistics of the BN layers (obtained from the training process), since that may result in a better mAP. To do that, I commented out model.fuse() and ran PTQ and QAT fine-tuning, but when converting from the .onnx model to the .engine model I got the above error.
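
For context, this is roughly what fusing does in eval mode: the BN running statistics are folded into the weights and bias of the preceding Conv layer, so the BN layer is no longer a separate op. The sketch below is the standard Conv+BN folding arithmetic written out by hand; `fuse_conv_bn` is a hypothetical helper, not the repo's model.fuse() implementation, and the layer sizes are arbitrary.

```python
import torch
import torch.nn as nn


def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold eval-mode BatchNorm statistics into the preceding convolution.

    Hypothetical helper for illustration; not the repo's model.fuse().
    """
    fused = nn.Conv2d(
        conv.in_channels, conv.out_channels, conv.kernel_size,
        stride=conv.stride, padding=conv.padding,
        dilation=conv.dilation, groups=conv.groups, bias=True,
    )
    with torch.no_grad():
        # Per-output-channel scale: gamma / sqrt(running_var + eps)
        scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
        fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
        conv_bias = (conv.bias if conv.bias is not None
                     else torch.zeros_like(bn.running_mean))
        # New bias: (b - running_mean) * scale + beta
        fused.bias.copy_((conv_bias - bn.running_mean) * scale + bn.bias)
    return fused


torch.manual_seed(0)
conv = nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False)
bn = nn.BatchNorm2d(8)
with torch.no_grad():
    # Give BN non-trivial statistics, as it would have after training.
    bn.running_mean.uniform_(-1.0, 1.0)
    bn.running_var.uniform_(0.5, 2.0)
    bn.weight.uniform_(0.5, 1.5)
    bn.bias.uniform_(-0.5, 0.5)
conv.eval()
bn.eval()

x = torch.randn(1, 3, 16, 16)
with torch.no_grad():
    reference = bn(conv(x))
    fused_out = fuse_conv_bn(conv, bn)(x)

# The two outputs should agree up to floating-point rounding.
max_abs_diff = (reference - fused_out).abs().max().item()
print(max_abs_diff)
```

Note that this equivalence only holds when BN uses its running statistics (eval mode); in training mode BN normalizes with batch statistics, which is the behaviour I would like to keep during fine-tuning.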

I do not think you can get a better mAP without fusing BN. QAT only fine-tunes the model and will not greatly change the weights, so fusing BN is the better option. See paper:

