Hello,
I am currently working on a project on the Jetson AGX Orin development board, which runs on an ARM64 architecture. I am using the PyTorch build provided by NVIDIA, version 1.12.0a0+2c916ef.nv22.3.
I encountered an issue with PyTorch quantization support on this platform. When I check the available quantization engines with:

print(torch.backends.quantized.supported_engines)

the output is ['None'], indicating that no quantization engine is available. Additionally, attempting to set the quantization backend to qnnpack fails with the following error:

RuntimeError: quantized engine QNNPACK is not supported
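For completeness, this is the engine-selection logic I was planning to wrap around that check (the helper pick_quantized_engine is my own code, not a PyTorch API; it only needs the list that torch.backends.quantized.supported_engines returns, so it runs without torch):

```python
def pick_quantized_engine(supported, preferred=("fbgemm", "qnnpack")):
    """Return the first preferred engine found in `supported`, else None.

    `supported` mirrors torch.backends.quantized.supported_engines,
    which on my Orin board contains only the placeholder entry 'None'.
    """
    # Compare case-insensitively so 'None'/'none' both count as "no engine".
    normalized = {str(engine).lower() for engine in supported}
    for engine in preferred:
        if engine in normalized:
            return engine
    return None

# On the Orin build described above:
print(pick_quantized_engine(["None"]))           # -> None
# On a typical x86 build that ships fbgemm:
print(pick_quantized_engine(["fbgemm", "none"]))  # -> fbgemm
```

With the NVIDIA build above, this helper returns None for every preference order, which is what leads me to the questions below.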
Given that fbgemm is typically not supported on ARM64, I am wondering:
- Does this specific PyTorch version for Jetson AGX Orin support any quantization backend?
- Are there any available quantization engines for this platform that I might have missed?
- What are the recommended approaches for performing quantization on the Jetson AGX Orin with PyTorch?
I would appreciate any insights or solutions, especially regarding how to enable or use quantization on the Jetson AGX Orin.
Thank you for your assistance!