Can someone tell me how to quantize using PyTorch on a Jetson device?

I’m using the Jetson AGX Orin to run some experiments with transformer model quantization, and I followed the “PyTorch for Jetson” tutorial in this link to install PyTorch. My CUDA version is 11.4 and my JetPack version is 5.1.1, but I found that torch.backends.quantized.engine returns none. How do I use the quantization that comes with PyTorch?
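For reference, this is roughly the check I am running (nothing model-specific yet, just querying the backend):

import torch

print(torch.__version__)                            # the build installed from the "PyTorch for Jetson" wheel
print(torch.backends.quantized.engine)              # this prints 'none' on my install
print(torch.backends.quantized.supported_engines)   # hoping to see 'qnnpack' or 'fbgemm' here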

Hi

Could you check torch.backends.quantized.supported_engines?
We can see qnnpack listed with the PyTorch package for JetPack 6.2 installed from here:

$ python3
Python 3.10.12 (main, Nov  6 2024, 20:22:13) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'2.6.0-rc1'
>>> print(torch.backends.quantized.supported_engines)
['qnnpack', 'none']
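
Once qnnpack shows up in that list, eager-mode dynamic quantization should work on top of it. Below is a minimal sketch; the toy Linear model and the 128-feature size are just placeholders, not something specific to your transformer:

import torch
import torch.nn as nn

# Select the qnnpack backend listed above (the ARM CPU backend)
torch.backends.quantized.engine = 'qnnpack'

# Placeholder model: a single Linear layer standing in for a transformer block
model = nn.Sequential(nn.Linear(128, 128)).eval()

# Dynamic quantization: Linear weights become int8, activations are quantized on the fly
quantized_model = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
print(quantized_model(x).shape)

Note that the quantized operators run on the CPU; qnnpack does not use the GPU.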

Thanks.

Thanks for sharing this, it is very helpful. By the way, what is your CUDA version? My CUDA version is 11.4 and my cuDNN version is 8.6.0. Do I have to upgrade CUDA along with it? Is there an official NVIDIA Jetson tutorial for installing CUDA if needed?

Hi, I tried to upgrade my JetPack. I used the command

pip install torch==2.6.0 torchvision torchaudio --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v62

to install torch. It does have the quantization engine, but torch.__version__ returns 2.6.0+cpu and torch.cuda.is_available() returns False. May I ask how your version 2.6.0-rc1 was installed?
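
For reference, these are the checks I am running after that install (the comments are just what I currently see):

import torch

print(torch.__version__)                            # shows '2.6.0+cpu' for me
print(torch.cuda.is_available())                    # False with this wheel
print(torch.backends.quantized.supported_engines)   # qnnpack is listed here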

Hi,

Please find it in the link below:

Thanks.

Thanks. It helps a lot!
