Summarization API issues

Please provide the following information when creating a topic:

  • Hardware Platform (GPU model and numbers) : brev.dev launchable on CRUSOE
  • System Memory : 128 GB (unconfirmed)
  • Ubuntu Version : unknown
  • NVIDIA GPU Driver Version (valid for GPU only) : unknown
  • Issue Type : Question

While testing the summarization API, it does not seem to completely follow the provided documentation. I believe some API params are not expected by the API.
The following are the params provided in the API documentation:


I believe it was failing because of the values in rag_type, rag_batch_size, etc.
This is what worked for us:
{
  "id": "5e223c04-c32d-4cfb-a6be-227ec5aa5f98",
  "prompt": "Write a concise and clear dense caption for the provided video",
  "model": "nvila",
  "stream": true,
  "stream_options": {
    "include_usage": true
  },
  "max_tokens": 512,
  "temperature": 0.4,
  "top_p": 1,
  "top_k": 100,
  "seed": 1,
  "num_frames_per_chunk": 10,
  "vlm_input_width": 0,
  "vlm_input_height": 0,
  "chunk_duration": 60,
  "summary_duration": 60,
  "caption_summarization_prompt": "Prompt for caption summarization",
  "summarize": true,
  "enable_chat": false
}
Verified with the code from the via-engine container.
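
For reference, this is roughly how we send that body with curl (a minimal sketch only; the host and port are placeholders for wherever the VSS/VIA backend is exposed, and summarize_request.json is just the JSON body above saved to a file):

# Hedged sketch: POST the working request body above to the /summarize endpoint.
# Host, port, and the file name summarize_request.json are placeholders.
curl -X POST "http://localhost:8100/summarize" \
  -H "Content-Type: application/json" \
  -d @summarize_request.json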

Requirement (vss-engine):

  1. After initiating summarization for a live stream via the API, the response is a stream.
    Is there a way to just trigger summarization for a live stream without receiving the response as a stream? We could then use the chat completion API to query the vector database for the summaries.

  2. There is currently a limit of 256 live streams that can be summarized. How can I increase this limit?

A possible reason is the model used. Currently you can only use NVILA; there may be problems when downloading the VILA model.

About the Requirement:

  1. Could you describe in detail how you use this API?
  2. How did you determine the limit of 256 live-stream summarizations?
  1. We were using curl to add the streams and then summarize them. The following is the code we used to summarize.

  2. We were able to successfully add 500 streams, but only 256 streams could be actively summarized with the earlier code. We checked the Gradio UI to verify the active streams, and it showed only 256 active streams even though there were 500 streams going through the VSS engine, which we confirmed with the tmp/assets folder (it contained 500 unique entries). Summarization performance also started dropping after 256 active streams. The Gradio UI kept showing only 256 active streams when we refreshed the active stream list, and summarization failed for the rest of the 500 streams we added.

Later on, we went through the vlm_pipeline.py file and found this code, which gave us the impression that there is a default limit set.

Is there any update on this?

About the 1st issue:

  • Trigger summarization using the /summarize API.
  • You can terminate this connection; the VIA backend will keep summarizing the live stream in the background.
  • You can then do Q&A via /chat/completions (see the sketch below).
  • One point: the vector DB will keep updating periodically as the stream continues.
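
As an illustration, a Q&A request might look roughly like this (a sketch only; the host, port, and stream id are placeholders reusing the id from the request above, and the exact request schema should be checked against the API documentation):

# Hedged sketch: query summaries for a live stream via /chat/completions.
# Host, port, and the id are placeholders; the field names mirror the /summarize
# request shown earlier and should be verified against the API docs.
curl -X POST "http://localhost:8100/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
        "id": "5e223c04-c32d-4cfb-a6be-227ec5aa5f98",
        "model": "nvila",
        "messages": [{"role": "user", "content": "Summarize the key events in this live stream."}],
        "max_tokens": 512
      }'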

About the 2nd issue:

  • You can add the following to the env section using overrides before deploying:
...
  env:
  - name: VSS_EXTRA_ARGS
    value: "--max-live-streams 512"
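
If you are deploying with Helm, the override file can then be applied with something like the following (a sketch; the release name, chart reference, and file name are placeholders that depend on your deployment):

# Hedged sketch: apply the overrides when installing/upgrading the Helm release.
# Release name, chart reference, and overrides.yaml path are placeholders.
helm upgrade --install vss-blueprint <chart-reference> -f overrides.yaml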