## 🐛 Bug
As the title says, compiling mlc-ai/Llama-3.1-8B-Instruct-fp8-MLC … on Jetson AGX Orin fails.
## To Reproduce
Steps to reproduce the behavior:
1. Clone mlc-ai/Llama-3.1-8B-Instruct-fp8-MLC (with Git LFS):
```
git clone https://huggingface.co/mlc-ai/Llama-3.1-8B-Instruct-fp8-MLC
```
2. Run the compile command:
```
mlc_llm compile dist/Llama-3.1-8B-Instruct-fp8-MLC/ -o dist/Llama-3.1-8B-Instruct-fp8-MLC/lib.so
```
The error message:
```
Traceback (most recent call last):
File "/usr/local/bin/mlc_llm", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/mlc_llm/__main__.py", line 33, in main
cli.main(sys.argv[2:])
File "/usr/local/lib/python3.10/dist-packages/mlc_llm/cli/compile.py", line 129, in main
compile(
File "/usr/local/lib/python3.10/dist-packages/mlc_llm/interface/compile.py", line 243, in compile
_compile(args, model_config)
File "/usr/local/lib/python3.10/dist-packages/mlc_llm/interface/compile.py", line 188, in _compile
args.build_func(
File "/usr/local/lib/python3.10/dist-packages/mlc_llm/support/auto_target.py", line 311, in build
relax.build(
File "/usr/local/lib/python3.10/dist-packages/tvm/relax/vm_build.py", line 353, in build
return _vmlink(
File "/usr/local/lib/python3.10/dist-packages/tvm/relax/vm_build.py", line 249, in _vmlink
lib = tvm.build(
File "/usr/local/lib/python3.10/dist-packages/tvm/driver/build_module.py", line 297, in build
rt_mod_host = _driver_ffi.tir_to_runtime(annotated_mods, target_host)
File "tvm/_ffi/_cython/./packed_func.pxi", line 339, in tvm._ffi._cy3.core.PackedFuncBase.__call__
File "tvm/_ffi/_cython/./packed_func.pxi", line 270, in tvm._ffi._cy3.core.FuncCall
File "tvm/_ffi/_cython/./packed_func.pxi", line 259, in tvm._ffi._cy3.core.FuncCall3
File "tvm/_ffi/_cython/./base.pxi", line 185, in tvm._ffi._cy3.core.CHECK_CALL
File "/usr/local/lib/python3.10/dist-packages/tvm/_ffi/base.py", line 481, in raise_last_ffi_error
raise py_err
tvm.error.InternalError: Traceback (most recent call last):
[bt] (8) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(tvm::runtime::Array<tvm::tir::Stmt, std::enable_if<std::is_base_of<tvm::runtime::ObjectRef, tvm::tir::Stmt>::value, void>::type> tvm::tir::StmtMutator::Internal::MutateArray<tvm::tir::Stmt, tvm::tir::StmtMutator::Internal::Mutate(tvm::tir::StmtMutator*, tvm::runtime::Array<tvm::tir::Stmt, void> const&)::{lambda(tvm::tir::Stmt const&)#1}>(tvm::tir::StmtMutator*, tvm::runtime::Array<tvm::tir::Stmt, std::enable_if<std::is_base_of<tvm::runtime::ObjectRef, tvm::tir::Stmt>::value, void>::type> const&, tvm::tir::StmtMutator::Internal::Mutate(tvm::tir::StmtMutator*, tvm::runtime::Array<tvm::tir::Stmt, void> const&)::{lambda(tvm::tir::Stmt const&)#1})+0x7c) [0xffff6b7213cc]
[bt] (7) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(tvm::runtime::ObjectPtr<tvm::runtime::Object> tvm::runtime::Array<tvm::tir::Stmt, void>::MapHelper<tvm::tir::StmtMutator::Internal::Mutate(tvm::tir::StmtMutator*, tvm::runtime::Array<tvm::tir::Stmt, void> const&)::{lambda(tvm::tir::Stmt const&)#1}, tvm::tir::Stmt>(tvm::runtime::ObjectPtr<tvm::runtime::Object>, tvm::tir::StmtMutator::Internal::Mutate(tvm::tir::StmtMutator*, tvm::runtime::Array<tvm::tir::Stmt, void> const&)::{lambda(tvm::tir::Stmt const&)#1})+0x3d8) [0xffff6b7211b8]
[bt] (6) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(tvm::tir::StmtMutator::VisitStmt(tvm::tir::Stmt const&)+0x78) [0xffff6a883578]
[bt] (5) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>::VisitStmt(tvm::tir::Stmt const&)+0xec) [0xffff6a88346c]
[bt] (4) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>::InitVTable()::{lambda(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*)#15}::_FUN(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*)+0x3c) [0xffff6a874cbc]
[bt] (3) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(tvm::tir::ComputeLegalizer::VisitStmt_(tvm::tir::BufferStoreNode const*)+0x5c8) [0xffff6bc1c6f8]
[bt] (2) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(+0x261ada8) [0xffff6bc0ada8]
[bt] (1) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(tvm::runtime::detail::LogFatal::Entry::Finalize()+0x68) [0xffff6a857ce8]
[bt] (0) /usr/local/lib/python3.10/dist-packages/tvm/libtvm.so(tvm::runtime::Backtrace[abi:cxx11]()+0x30) [0xffff6c7f7190]
File "/opt/mlc-llm/3rdparty/tvm/src/tir/transforms/unsupported_dtype_legalize.cc", line 330
InternalError: Check failed: (MatchDType(value->dtype)) is false:
```
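The failing check is in TVM's `unsupported_dtype_legalize.cc`, so the FP8 dtype handling for this target may be involved. This is an assumption, not a confirmed cause. For reference, Jetson AGX Orin is an Ampere GPU with compute capability 8.7, while native FP8 arithmetic on NVIDIA GPUs starts at compute capability 8.9 (Ada) and 9.0 (Hopper). The sketch below is a hypothetical helper (`has_native_fp8` is not part of MLC-LLM or TVM) that only compares version strings; it does not query the device or the compiler.

```python
# Hypothetical helper, not part of MLC-LLM or TVM.
# Compares a CUDA compute capability string against 8.9, the first
# capability with native FP8 support (Ada); Hopper is 9.0.
FP8_MIN_CAPABILITY = (8, 9)


def has_native_fp8(compute_capability: str) -> bool:
    """Return True if the capability (e.g. "8.7") is at least 8.9."""
    major, minor = (int(part) for part in compute_capability.strip().split("."))
    return (major, minor) >= FP8_MIN_CAPABILITY


if __name__ == "__main__":
    # Jetson AGX Orin reports compute capability 8.7.
    print(has_native_fp8("8.7"))
```

If this returns `False` for the device, the FP8 weights would need software emulation or a different quantization, which may or may not be what triggers the check above.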
## Expected behavior
The command should produce a `.so` file.
## Environment
- Platform (e.g. WebGPU/Vulkan/IOS/Android/CUDA): CUDA
- Operating system (e.g. Ubuntu/Windows/MacOS/...): Ubuntu 22.04
- Device (e.g. iPhone 12 Pro, PC+RTX 3090, ...): Jetson AGX Orin
- How you installed MLC-LLM (`conda`, source): As a Docker image, [dustynv/mlc:0.1.4-r36.4.2](https://hub.docker.com/r/dustynv/mlc/tags)
- How you installed TVM-Unity (`pip`, source): Same Docker image as above
- Python version (e.g. 3.10): Python 3.10.12
- GPU driver version (if applicable):
- CUDA/cuDNN version (if applicable): CUDA 12.6.68 / cuDNN 9.3.0.75
- TVM Unity Hash Tag (`python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))"`, applicable if you compile models): 3f30919055d864af3dd03c42b3cb0a878aa2cc25
- Any other relevant information:
## Additional context