[HELP] Can I use local model to load LLM and start the Agent studio?

Hello NVIDIA,

Here's the thing: we have recently been working on Jetson Orin projects, and when I decided to copy my environment to another Orin, I wondered whether we can load a local model such as VILA or Llama 3. It would be much faster to store these huge models on my USB flash drive and copy them to the other Orin devices, to cut down on download time.

For example:

Agent Studio

```bash
jetson-containers run --env HUGGINGFACE_TOKEN=hf_xyz123abc456 \
  $(autotag nano_llm) \
    python3 -m nano_llm.studio
```

Llamaspeak

```bash
jetson-containers run --env HUGGINGFACE_TOKEN=hf_xyz123abc456 \
  $(autotag nano_llm) \
    python3 -m nano_llm.agents.web_chat --api=mlc \
      --model meta-llama/Meta-Llama-3-8B-Instruct \
      --asr=riva --tts=piper
```

In the instructions for Agent Studio or llamaspeak above, I want to load a local VILA or Llama 3 model directly, without having to authenticate with Hugging Face.

How can I modify the code or the commands to achieve that?

Thanks !

Best regards,
Leonard

Hi @leonard.zhang, you can pass a local path to --model (or the corresponding field in Agent Studio) instead of a HuggingFace repo name, and if the directory already exists, it will simply load your specified folder instead of trying to download it from HuggingFace Hub.
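For example, something along these lines should work for llamaspeak (a sketch, not the exact command from the docs; the `/data/models/...` path is a placeholder, and it assumes the model directory sits somewhere the container can see, such as the `jetson-containers/data` directory that jetson-containers mounts at `/data` by default):

```bash
# Point --model at a local directory instead of a HuggingFace repo name.
# No HUGGINGFACE_TOKEN is needed, since nothing gets downloaded.
# /data/models/Meta-Llama-3-8B-Instruct is a placeholder path; put the
# model under jetson-containers/data so it is visible inside the container.
jetson-containers run $(autotag nano_llm) \
  python3 -m nano_llm.agents.web_chat --api=mlc \
    --model /data/models/Meta-Llama-3-8B-Instruct \
    --asr=riva --tts=piper
```

In Agent Studio, the same local path can be entered in the model field in place of a repo name.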

Hi dusty

Thanks for your answer. I tried it out by adding the local path to the properties, and it works perfectly.

An interesting thing I noticed when I started trying to use local ASR/TTS model paths in an offline situation: according to the Riva official documentation, the model file extension for PiperTTS is `.onnx`, while for RivaTTS it is `.riva`.
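One detail worth noting on the Piper side: a Piper voice is distributed as a pair of files, the `.onnx` model plus a matching `.onnx.json` config, and both must be present for the voice to load offline. A quick sanity check might look like this (the directory and file names are only examples, not necessarily the cache layout nano_llm uses):

```bash
# Sanity-check that a Piper voice is complete before going offline.
# /data/models/piper is an example location; adjust to wherever your
# voice files are cached or mounted.
ls -l /data/models/piper/
# expected, with illustrative file names:
#   en_US-libritts-high.onnx        <- the voice model itself
#   en_US-libritts-high.onnx.json   <- its config; Piper needs both files
```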

My question is: are there any limitations on using a local pretrained text-to-speech model, and which files need to be modified? Since the Orin is an edge computing device, it will often be working in fully offline environments.

Thanks !

Best regards,
Leonard

Hi Leonard, you should get your agent running first while connected to the internet, so that all the models get downloaded and cached on disk (or you can do this manually and specify their local paths). Then when you disconnect from the internet, it will already have the models onboard. There may be some minor things here and there for which you can disable the network requests.
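To carry that cache over to a second Orin with a USB drive, as in Leonard's original question, something like the following should work (a sketch assuming the default jetson-containers layout, where the repo's `data/` directory is mounted into the container at `/data`; the exact subdirectories depend on which models you run):

```bash
# On the first (online) Orin: copy the cached models to the USB drive.
rsync -a ~/jetson-containers/data/models/ /media/usb/models/

# On the second (offline) Orin: restore the cache, then launch the
# agent with the same commands as before.
rsync -a /media/usb/models/ ~/jetson-containers/data/models/
```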
