Failed to MLC-compile mlc-ai/Llama-3.1-8B-Instruct-fp8-MLC on Jetson AGX Orin

I’ve tried using mlc-ai/Llama-3.1-8B-Instruct-fp8-MLC on a Jetson AGX Orin, but it fails at the compilation step.

I’ve raised an issue with the MLC community, but I’m posting it here as well in the hope of getting some insight from the Jetson side.

Hi,
Here are some suggestions for the common issues:

1. Performance

Please run the commands below before benchmarking a deep learning use case:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

2. Installation

Installation guides for deep learning frameworks on Jetson:

3. Tutorial

Getting-started deep learning tutorials:

4. Report issue

If these suggestions don’t help and you want to report an issue to us, please share the model, the commands/steps, and any customized app so we can reproduce it locally.

Thanks!

Hi,

Please find below a sample for running MLC on the Orin.

Thanks.

I’ve verified that it works with quantization=q4f16_ft but not with 8-bit quantization methods.

It fails at the compilation step.

Hi,

It looks like the model can work with q4 but fails with 8-bit quantization.
If so, the failure might be caused by the 8-bit model requiring more memory than the Jetson device has available.
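As a rough back-of-the-envelope check (an illustrative estimate only — it counts weights alone and ignores the KV cache, activations, and compile-time buffers), an 8-billion-parameter model needs about twice the weight memory at 8-bit as at 4-bit:

```python
# Rough weight-size estimate for an 8B-parameter model.
# Assumption: weights only; KV cache and compilation buffers add more on top.
params = 8e9
fp8_gb = params * 1.0 / 1e9   # ~1 byte per parameter for 8-bit
q4_gb = params * 0.5 / 1e9    # ~0.5 byte per parameter for 4-bit
print(f"8-bit weights ~{fp8_gb:.0f} GB, 4-bit weights ~{q4_gb:.0f} GB")
# 8-bit weights ~8 GB, 4-bit weights ~4 GB
```

This is why a model that fits at 4-bit can still exhaust memory at 8-bit once the additional runtime buffers are included.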

You can verify this by monitoring the system with tegrastats:

$ sudo tegrastats

Thanks.