New VILA-1.5 multimodal vision/language models released in 3B, 8B, 13B, and 40B sizes

We’ve released new VILA models with improved accuracy and speed, reaching up to 7.5 FPS on Orin!

These are supported in the latest 24.5 release of NanoLLM.

If you already have the nano_llm container on your system, do a docker pull dustynv/nano_llm:r36.2.0 (or r35.4.1 on JetPack 5), and then you should be able to run this along with the other VLM demos:

jetson-containers run $(autotag nano_llm) \
  python3 -m nano_llm.chat --api=mlc \
    --model Efficient-Large-Model/VILA1.5-3b \
    --prompt /data/prompts/images.json
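
To try it on your own image, you can pass an image path and a question as successive prompts, a pattern borrowed from the other NanoVLM examples (the image path here is just an example file from the container's mounted data directory):

jetson-containers run $(autotag nano_llm) \
  python3 -m nano_llm.chat --api=mlc \
    --model Efficient-Large-Model/VILA1.5-3b \
    --prompt '/data/images/hoover.jpg' \
    --prompt 'What does the image show?'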

NanoLLM now also uses TensorRT to accelerate the CLIP/SigLIP vision encoder in the pipeline 👍
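
If you'd rather use the Python API than the CLI, here's a minimal sketch along the lines of the multimodal example in the NanoLLM docs (the image path is just a placeholder, and the exact keyword arguments may vary between releases):

from nano_llm import NanoLLM, ChatHistory

# load VILA-1.5 with the MLC backend; the CLIP/SigLIP vision encoder
# runs as part of the pipeline
model = NanoLLM.from_pretrained(
    "Efficient-Large-Model/VILA1.5-3b",
    api="mlc",
)

chat = ChatHistory(model)

# attach an image, then ask a question about it
chat.append(role="user", image="/data/images/hoover.jpg")  # placeholder image
chat.append(role="user", msg="What does the image show?")

# embed the multimodal chat and generate the reply
embedding, _ = chat.embed_chat()
reply = model.generate(
    embedding,
    kv_cache=chat.kv_cache,
    stop_tokens=chat.template.stop,
    streaming=False,
)
print(reply)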
