MLC-LLM used in NanoLLM

Hi, @dusty_nv
I saw that mlc-llm is imported in nano_llm and used to quantize the weights, but TVM is used for inference. Why? Is it for higher performance, is it just an mlc-llm version issue, or something else?

Hi @siyu_ok, think of TVM as the vector/tensor and neural-network kernel library that sits underneath MLC. MLC implements the LLM network architectures on top of the primitives that TVM provides, so MLC handles the quantization and compilation of the model, and the resulting compiled kernels are executed by the TVM runtime at inference time. TVM is shipped as a submodule of MLC, and the two are compiled and installed in the same steps.
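
As a rough illustration of that layering, here is a minimal sketch (the model name, quantization mode, and output path are illustrative placeholders, not NanoLLM's exact code): MLC's build step quantizes and compiles the model into a shared library of TVM kernels, and the TVM runtime then loads and executes that library, which is what NanoLLM's MLC backend drives under the hood.

```python
# Sketch of the MLC / TVM split, assuming an MLC-style build output -- paths are illustrative.
#
# Step 1 (MLC): quantize and compile the model into a library of TVM kernels, e.g.:
#   python3 -m mlc_llm.build --model Llama-2-7b-chat-hf --quantization q4f16_ft --target cuda
#
# Step 2 (TVM): the TVM runtime loads that compiled library and runs it on the GPU.
import tvm

dev = tvm.device("cuda", 0)  # the GPU the compiled kernels will execute on

# Load the MLC-built model library (filename shown is a typical MLC output name).
lib = tvm.runtime.load_module("dist/Llama-2-7b-chat-hf-q4f16_ft-cuda.so")

# The LLM's forward passes (prefill/decode) live as functions inside this module;
# MLC's runtime -- and NanoLLM on top of it -- calls them during inference.
print(lib)
```

So the quantization you see attributed to mlc-llm and the inference you see running through TVM are two phases of the same pipeline rather than two competing backends.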