Running a Quantized Neural Network on NVIDIA Jetson

Description

I have a Jetson AGX, and it gives me an error when I try to load a quantized neural network that was quantized on my laptop. I get the same error when I try to quantize the model directly on the Jetson, i.e. quantization does not work on my Jetson at all.

>>> torch.jit.load('../test')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/adam/Desktop/ship-env/lib/python3.6/site-packages/torch/jit/_serialization.py", line 161, in load
    cpp_module = torch._C.import_ir_module(cu, str(f), map_location, _extra_files)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/torch/nn/quantized/modules/conv.py", line 35, in __setstate__
    self.groups = (state)[8]
    self.padding_mode = (state)[9]
    _1 = (self).set_weight_bias((state)[10], (state)[11], )
          ~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    self.scale = (state)[12]
    self.zero_point = (state)[13]
  File "code/__torch__/torch/nn/quantized/modules/conv.py", line 56, in set_weight_bias
      _12 = [_10, _11]
      _13, _14, = _5
      _15 = ops.quantized.conv2d_prepack(w, b, _9, _12, [_13, _14], _6)
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
      self._packed_params = _15
    else:

Traceback of TorchScript, original code (most recent call last):
  File "/home/adam/.local/lib/python3.9/site-packages/torch/nn/quantized/modules/conv.py", line 177, in __setstate__
        self.groups = state[8]
        self.padding_mode = state[9]
        self.set_weight_bias(state[10], state[11])
        ~~~~~~~~~~~~~~~~~~~~ <--- HERE
        self.scale = state[12]
        self.zero_point = state[13]
  File "/home/adam/.local/lib/python3.9/site-packages/torch/nn/quantized/modules/conv.py", line 402, in set_weight_bias
    def set_weight_bias(self, w: torch.Tensor, b: Optional[torch.Tensor]) -> None:
        if self.padding_mode == 'zeros':
            self._packed_params = torch.ops.quantized.conv2d_prepack(
                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
                w, b, self.stride, self.padding, self.dilation, self.groups)
        else:
RuntimeError: Didn't find engine for operation quantized::conv2d_prepack NoQEngine
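
The key line is the final RuntimeError: NoQEngine suggests the active quantized engine is "none", which typically happens when the PyTorch build was compiled without a quantized backend (FBGEMM or QNNPACK), so the prepack op has nothing to dispatch to. A minimal sketch of how to check this with PyTorch's backend API:

import torch

# List the quantized backends this PyTorch build was compiled with;
# a build without any quantized engine reports only 'none'.
print(torch.backends.quantized.supported_engines)

# If 'qnnpack' is in the list, selecting it explicitly may be enough:
if 'qnnpack' in torch.backends.quantized.supported_engines:
    torch.backends.quantized.engine = 'qnnpack'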

Environment

GPU Type: Jetson AGX
Jetpack Version: 4.4.1
Operating System + Version: Ubuntu
Python Version (if applicable): 3.6.9
PyTorch Version (if applicable): 1.9

Steps To Reproduce

import copy
import torch
from torch.quantization import quantize_fx

# Load the FP32 model
model_fp = torch.load(models_dir + net_file)

# Work on a copy in eval mode and fuse modules for quantization
model_to_quant = copy.deepcopy(model_fp)
model_to_quant.eval()
model_to_quant = quantize_fx.fuse_fx(model_to_quant)

# qnnpack is the quantization backend for ARM CPUs
qconfig_dict = {"": torch.quantization.get_default_qconfig('qnnpack')}

# Insert observers, then convert to a quantized model
model_prepped = quantize_fx.prepare_fx(model_to_quant, qconfig_dict)
model_quantised = quantize_fx.convert_fx(model_prepped)
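
For completeness: in the static FX flow, a calibration pass normally sits between prepare_fx and convert_fx, and the converted model is then scripted and saved, which is presumably how the '../test' file loaded in the traceback above was produced. A minimal sketch, assuming a hypothetical calib_loader and an illustrative file name:

# Calibration belongs between prepare_fx and convert_fx: feeding representative
# inputs lets the inserted observers record activation ranges.
with torch.no_grad():
    for images, _ in calib_loader:  # calib_loader is an assumed DataLoader
        model_prepped(images)

# After convert_fx, script and save the quantized model for deployment.
scripted = torch.jit.script(model_quantised)
torch.jit.save(scripted, 'quantised_model.pt')  # file name is illustrative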

Hi,

Do you use the same PyTorch version on the Jetson as on the desktop where the model was created?
Thanks.

Hi,

I do use the same version. I considered that this might be the issue, so I also tried to quantize the model from scratch on the Jetson, but I get the same error about no engine being found.
Thanks for the reply.

Hi,

Would you mind sharing the model with us so we can check it further?
Thanks.

The link below points to the TorchScript (JIT) files of the original FP32 model and of the quantized model (the quantized JIT was made on my desktop). Making and running the quantized model works fine on my desktop CPU, but the Jetson will neither make nor run the quantized model.

https://drive.google.com/drive/folders/1glakaz-2rq54pkllZ5jlloejI6Ya6kES?usp=sharing

If you need anything else please let me know. Thank you for your help.

@AastaLLL have you had a chance to look?

Hi,

Thanks for sharing.

We are checking this internally and will share more information later.

Thank you!

Hi,

This issue can be reproduced internally with JetPack 4.6 + the l4t-pytorch container.
We are looking into it in depth and will share more information with you later.

Thanks.

Thank you very much for your help.

Same issue here. Is there a solution?