Hi Kyle, yes, unfortunately MLC only supports sm_80 and newer (which is why the tutorials on Jetson AI Lab that use NanoLLM only list compatibility with Orin).
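If you want to double-check what your board reports, here is a minimal sketch (assuming PyTorch with CUDA is available in your environment or container) - anything sm_80 or above would meet the MLC requirement:

```python
# Print the GPU compute capability reported by CUDA (a quick check,
# assuming PyTorch with CUDA support is installed)
import torch

major, minor = torch.cuda.get_device_capability(0)
print(f"Compute capability: sm_{major}{minor}")  # e.g. sm_87 on Orin, sm_72 on Xavier
```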
Since that post you found, exllamav2 has gotten faster than llama.cpp, but I'm not sure whether exllama is also limited to sm_80+ (there are containers for it here).
Also since that post, there is now Ollama support on Jetson. It uses llama.cpp underneath as well but is easier to use, so if you are just starting out you may want to consider that as an option too.
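Once the Ollama server is running on the Jetson, talking to a model is only a few lines. Here is a minimal sketch using the Ollama Python client (pip install ollama); the model name "llama3" is just an example and assumes you have already pulled it:

```python
# Minimal chat round-trip against a local Ollama server
# (assumes the server is running and the model has been pulled)
import ollama

response = ollama.chat(
    model="llama3",  # example model name - substitute whatever you pulled
    messages=[{"role": "user", "content": "Hello from Jetson!"}],
)
print(response["message"]["content"])
```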