I ran this command on my device:
jetson-containers run $(autotag nano_llm) \
python3 -m nano_llm.chat --api=mlc \
--model Efficient-Large-Model/VILA1.5-3b \
--max-context-len 256 \
--max-new-tokens 32
The output was this:
log.txt (6.9 KB)
Hi,
Could you try --model meta-llama/Llama-2-7b-chat-hf to see if it works?
Thanks.
Cannot access gated repo for url https://huggingface.co/api/models/meta-llama/Llama-2-7b-chat-hf/revision/main.
Access to model meta-llama/Llama-2-7b-chat-hf is restricted. You must be authenticated to access it.
@s209109 to run the original Llama models, you need to request access through your HuggingFace account, create an API token, and set the HUGGINGFACE_TOKEN environment variable as shown here:
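For example (a sketch; hf_xxx is a placeholder for your actual access token, and --env is a docker run flag that jetson-containers run forwards through to the container):

# Pass your HuggingFace access token into the container environment
jetson-containers run --env HUGGINGFACE_TOKEN=hf_xxx $(autotag nano_llm) \
  python3 -m nano_llm.chat --api=mlc \
    --model meta-llama/Llama-2-7b-chat-hf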
But regarding VILA-1.5, can you try pulling the latest nano_llm container image?
docker pull $(autotag nano_llm)
And if that still doesn’t fix it, try adding --vision-api=hf to the command-line arguments, as in the example below.
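For reference, the full command with that flag would look like this (the same arguments as your original run, with --vision-api=hf appended; as I understand it, this selects the HuggingFace/Transformers vision encoder instead of the TensorRT path):

jetson-containers run $(autotag nano_llm) \
  python3 -m nano_llm.chat --api=mlc \
    --model Efficient-Large-Model/VILA1.5-3b \
    --max-context-len 256 \
    --max-new-tokens 32 \
    --vision-api=hf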