INT4 on Jetson-AGX-Orin or Jetson-Orin-Nano?

Hello,

Do the Nvidia Jetson Orin series support INT4 operations?

The product documentation states that the Jetson AGX Orin supports FP32 on the Tensor Cores and that Orin contains 3rd-generation Tensor Cores.

I believe the third-generation Ampere Tensor Cores support INT4 operations (the arithmetic logic units are the same as in the A100 GPUs).

Can you confirm whether the Jetson Orin supports INT4 operations, and whether that support extends to all cores or just the Tensor Cores?

Hi,

In terms of hardware capability, Orin's Tensor Cores support INT8 (IMMA) and FP16 (HMMA) matrix operations.
INT4 usually refers to software-level quantization rather than a native hardware data type.
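To illustrate what software-level INT4 quantization means, here is a minimal sketch of symmetric per-tensor quantization in plain Python. The function names and the simple max-abs scaling scheme are illustrative assumptions, not TensorRT's actual implementation:

```python
# Illustrative sketch only: map float weights to signed 4-bit codes in [-8, 7]
# using a symmetric per-tensor scale. Real toolchains (e.g. TensorRT) use
# their own, more sophisticated calibration.
def quantize_int4(weights):
    """Return (4-bit codes, scale) for a list of float weights."""
    scale = max(abs(w) for w in weights) / 7.0  # symmetric scale, +/-7 range
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from 4-bit codes."""
    return [v * scale for v in q]

weights = [0.9, -0.45, 0.12, -0.88]
q, scale = quantize_int4(weights)
print(q)                      # integer codes, each fits in 4 bits
print(dequantize(q, scale))   # approximate reconstruction of the weights
```

The quantization error comes from rounding each weight to one of only 16 levels, which is why INT4 is typically applied to weights (tolerant of noise) rather than activations.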

TensorRT supports the INT4 data type for weight compression.
This is available on AGX Orin, but please upgrade TensorRT to 10.x or later:
https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/10.3.0/local_repo/nv-tensorrt-local-tegra-repo-ubuntu2204-10.3.0-cuda-12.6_1.0-1_arm64.deb

TensorRT-LLM also supports INT4 precision.
But TensorRT-LLM is not available for Jetson yet.
https://nvidia.github.io/TensorRT-LLM/reference/support-matrix.html#software

Thanks.


Thank you.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.