Jetson-containers local_llm not working with Jetson AGX Xavier


I cannot get the jetson-containers local_llm package to work on a Jetson AGX Xavier. It works on my Orin, so I know I am not running it incorrectly, and other discussions suggest that api=mlc may not work on Xavier. I tried the other APIs and none of them work. The only one that runs is hf, but the moment I feed it a text input/prompt, it raises an embed_txt not implemented error:

File "/usr/lib/python3.8/", line 932, in _bootstrap_inner
File "/opt/local_llm/local_llm/", line 177, in run
File "/opt/local_llm/local_llm/", line 190, in dispatch
outputs = self.process(input)
File "/opt/local_llm/local_llm/plugins/", line 126, in process
embedding, position = chat_history.embed_chat()
File "/opt/local_llm/local_llm/", line 292, in embed_chat
entry[embed_key] = self.embed(entry[key], type=key, template=role_template)
File "/opt/local_llm/local_llm/", line 190, in embed
return self.embedding_functions[type].func(input, template)
File "/opt/local_llm/local_llm/", line 201, in embed_text
embedding = self.model.embed_text(text, use_cache=use_cache)
File "/opt/local_llm/local_llm/", line 116, in embed_text
raise NotImplementedError("embed_text() not implemented for this model")
NotImplementedError: embed_text() not implemented for this model
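For context, this is the usual pattern behind that last frame: the base model class defines embed_text() as a stub that raises, and only certain backends override it with a real implementation. A minimal illustrative sketch (the class names here are hypothetical, not local_llm's actual code):

```python
class BaseModel:
    """Hypothetical base class; backends that support text embedding override embed_text()."""

    def embed_text(self, text, use_cache=False):
        # Stub inherited by backends that don't implement embedding (e.g. the hf path here)
        raise NotImplementedError("embed_text() not implemented for this model")


class EmbeddingCapableModel(BaseModel):
    """Hypothetical backend that does implement embedding."""

    def embed_text(self, text, use_cache=False):
        # A real backend would tokenize and run the embedding layer;
        # this placeholder returns a dummy vector for illustration.
        return [0.0] * 4


model = BaseModel()
try:
    model.embed_text("hello")
except NotImplementedError as e:
    print(e)  # prints: embed_text() not implemented for this model
```

So the error means the backend you loaded simply never overrode the stub, not that your prompt was malformed.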

Is there any way to make it work @dusty_nv ?

Thank you.


Hi @haechan.bong, sorry about that. Yes, the other APIs need updating at some point (but they are significantly slower than MLC, which is why I haven't prioritized their upkeep). Also, the kernels used by MLC require the Ampere GPU architecture or newer, so sadly Xavier is not supported by that backend. For now, I would recommend using llama.cpp on Xavier.
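For reference, the llama.cpp route would look something like this, assuming the standard jetson-containers run/autotag workflow (exact container name and flags may differ across JetPack releases, so treat this as a sketch):

```shell
# Launch the llama.cpp container via jetson-containers' autotag helper
# (assumes jetson-containers is installed on the Jetson; name/tag may vary by release)
jetson-containers run $(autotag llama_cpp)
```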
