Continuing the discussion from Live LLaVA webUI don't show NanoDB webUI:
Ok gotcha @masaki_yamagishi, I realize what is going on now: VILA-2.7B used the same openai/clip-vit-large-patch14-336
vision model that the NanoDB index was created with; however, VILA1.5-3B uses a custom-trained SigLIP vision encoder, and the embedding dimensions are different.
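You can see the mismatch from the model configs alone, without downloading any weights. Here is a minimal sketch with transformers, assuming VILA1.5-3B uses a SigLIP-SO400M-style encoder (the exact SigLIP checkpoint name below is an assumption, not necessarily the one baked into VILA1.5-3B):

```python
from transformers import AutoConfig

# Vision tower the existing NanoDB index was built with (same one VILA-2.7B uses)
clip_cfg = AutoConfig.from_pretrained("openai/clip-vit-large-patch14-336")
print("CLIP embedding dim:", clip_cfg.projection_dim)  # 768

# SigLIP encoder assumed for VILA1.5-3B -- checkpoint name is an assumption
siglip_cfg = AutoConfig.from_pretrained("google/siglip-so400m-patch14-384")
print("SigLIP embedding dim:", siglip_cfg.vision_config.hidden_size)  # 1152
```

Because those two dimensions differ, the vectors already stored in the NanoDB index can't be reused by the new VLM, which is why the database has to be rebuilt.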
I will have to do some rework of NanoDB to support arbitrary embedding models, and the database will need to be re-indexed with the particular model the VLM is using (should you want to reuse the VLM's embeddings and not have to recalculate them).
In the near term, I will add a flag to the VideoQuery agent that disables reusing the embeddings, so NanoDB goes back to calculating them with the original CLIP model. Until then, unfortunately I would recommend going back to VILA-2.7B if you require the live NanoDB integration, sorry about that.
Hi @dusty_nv, can you help me with this?
I want to re-index my COCO data with the SigLIP vision encoder because I need the embedding size to be compatible with VILA1.5-3B. Another reason is that Live LLaVA does not work well with dustynv/nano_llm:24.7-r36.2.0, but dustynv/nano_llm:24.5-r36.2.0 is quite good.
I usually get outputs like this with 24.7 (see the attached screenshot).
That is a separate problem, though; for now I just want to know if there is a way to re-index my dataset with another visual encoder.
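To make it concrete, a stand-alone pass over the images like the one below is roughly what I have in mind. This is only a sketch, not NanoDB's actual indexing code or on-disk format, and the SigLIP checkpoint, directory paths, and output files are placeholders I picked for illustration:

```python
import json
from pathlib import Path

import numpy as np
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

MODEL = "google/siglip-so400m-patch14-384"    # assumed SigLIP checkpoint
IMAGE_DIR = Path("/data/datasets/coco/2017")  # wherever the COCO images live
OUT_FILE = Path("/data/embeddings/coco_siglip.npy")

device = "cuda" if torch.cuda.is_available() else "cpu"
processor = AutoProcessor.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL).to(device).eval()

embeddings, paths = [], []
with torch.inference_mode():
    for img_path in sorted(IMAGE_DIR.glob("*.jpg")):
        image = Image.open(img_path).convert("RGB")
        inputs = processor(images=image, return_tensors="pt").to(device)
        feats = model.get_image_features(**inputs)        # (1, 1152) for so400m
        feats = feats / feats.norm(dim=-1, keepdim=True)  # normalize for cosine search
        embeddings.append(feats.squeeze(0).cpu().numpy())
        paths.append(str(img_path))

OUT_FILE.parent.mkdir(parents=True, exist_ok=True)
np.save(OUT_FILE, np.stack(embeddings))
OUT_FILE.with_suffix(".json").write_text(json.dumps(paths))  # map rows back to images
```

If NanoDB could either ingest embeddings like these or recompute them itself with the same SigLIP model the VLM uses, the dimensions would match what VILA1.5-3B produces.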