VSS Blueprint only returns partial responses for video chunks

Hello!

I’m using the VSS blueprint 2.4.0 solely for dense video captioning with VLMs, mainly to take advantage of the auto-chunking feature.

However, when I send videos, I only receive responses for the first few chunks.
I noticed that warnings appear on the VSS server, and it seems to fail when processing the remaining chunks.

This happens with both vLLM and an API-based VLM.

Could I get some help with this issue?
The video I tested was 20 seconds long with a resolution of 532×280.
Each chunk is 2 seconds long, and I sampled 20 frames per chunk.
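
For reference, the chunk layout in the logs below is consistent with plain fixed-window splitting. The following is only a sketch of that arithmetic, not the VSS implementation: `split_into_chunks` is a hypothetical helper, and the 19.833-second duration is taken from the end of the last chunk in the debug log.

```python
def split_into_chunks(duration_s, chunk_s, overlap_s=0.0):
    """Return (index, start, end) tuples covering [0, duration_s].

    Hypothetical helper for illustration; VSS may split files differently.
    """
    if chunk_s <= overlap_s:
        raise ValueError("chunk duration must exceed the overlap")
    step = chunk_s - overlap_s
    chunks = []
    start = 0.0
    index = 0
    while start < duration_s:
        # The last chunk is clipped to the end of the video.
        end = min(start + chunk_s, duration_s)
        chunks.append((index, start, end))
        index += 1
        start += step
    return chunks


# Values from this post: ~19.833 s of video, chunk_duration=2, no overlap.
chunks = split_into_chunks(19.833, 2.0)
for index, start, end in chunks:
    print(f"Chunk {index}: start={start} end={end}")
```

With these inputs the sketch yields ten chunks, which matches the "total chunks 10" reported by the server, so the splitting step itself appears to behave as expected.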

Console log:

via-server-1 | 2025-11-08 11:07:11,967 DEBUG chunk.end_pts=18000000000, len(self._selected_pts_array)=16
via-server-1 | 2025-11-08 11:07:11,967 DEBUG chunk.end_pts=19833000000, len(self._selected_pts_array)=16
via-server-1 | 2025-11-08 11:07:11,967 DEBUG Decode start=1762600031.9483736 end=1762600031.9669528
via-server-1 | 2025-11-08 11:07:11,967 DEBUG sampled frame num: 0, chunk: Chunk 7: start=14.0 end=16.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4, gpu_id: 0
via-server-1 | 2025-11-08 11:07:11,968 WARNING No frames found for chunk Chunk 7: start=14.0 end=16.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4
via-server-1 | 2025-11-08 11:07:11,968 STATUS Chunk (Chunk 7: start=14.0 end=16.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decoded, frames=0
via-server-1 | 2025-11-08 11:07:11,968 STATUS Chunk (Chunk 4: start=8.0 end=10.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decoded, frames=9
via-server-1 | 2025-11-08 11:07:11,968 DEBUG sampled frame num: 0, chunk: Chunk 5: start=10.0 end=12.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4, gpu_id: 0
via-server-1 | 2025-11-08 11:07:11,969 WARNING No frames found for chunk Chunk 5: start=10.0 end=12.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4
via-server-1 | 2025-11-08 11:07:11,969 STATUS Chunk (Chunk 5: start=10.0 end=12.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decoded, frames=0

Debug mode log file:

2025-11-08 11:07:11,916 INFO Received generate_vlm_captions query, id - 4146dd7a-8e8d-4a2e-84f5-f22e75da7556 (live-stream=0), chunk_duration=2, chunk_overlap_duration=0, media-offset-type=None, media-start-time=None, media-end-time=None, modelParams={"max_tokens": null, "temperature": null, "top_p": null, "top_k": null}, stream=False num_frames_per_chunk=0 vlm_input_width = 0, vlm_input_height = 0, cv_pipeline_prompt = , enable_cv_metadata = 0, enable_reasoning = 0

2025-11-08 11:07:11,925 INFO Triggering oldest queued query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b

2025-11-08 11:07:11,942 PERF File Split execution time = 9.939 millisec

2025-11-08 11:07:11,943 STATUS Chunk (Chunk 0: start=0.0 end=2.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decode starting

2025-11-08 11:07:11,943 INFO Created video file query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b for videoId 4146dd7a-8e8d-4a2e-84f5-f22e75da7556

2025-11-08 11:07:11,943 STATUS Chunk (Chunk 1: start=2.0 end=4.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decode starting

2025-11-08 11:07:11,943 INFO Waiting for results of query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b

2025-11-08 11:07:11,943 INFO Status for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b is processing, percent complete is 0.00, size of response list is 0

2025-11-08 11:07:11,943 STATUS Chunk (Chunk 2: start=4.0 end=6.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decode starting

2025-11-08 11:07:11,943 STATUS Chunk (Chunk 3: start=6.0 end=8.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decode starting

2025-11-08 11:07:11,943 STATUS Chunk (Chunk 4: start=8.0 end=10.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decode starting

2025-11-08 11:07:11,944 STATUS Chunk (Chunk 5: start=10.0 end=12.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decode starting

2025-11-08 11:07:11,944 STATUS Chunk (Chunk 6: start=12.0 end=14.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decode starting

2025-11-08 11:07:11,944 STATUS Chunk (Chunk 7: start=14.0 end=16.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decode starting

2025-11-08 11:07:11,960 PERF Decode execution time = 12.434 millisec

2025-11-08 11:07:11,960 PERF Decode execution time = 13.149 millisec

2025-11-08 11:07:11,962 WARNING No frames found for chunk Chunk 6: start=12.0 end=14.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4

2025-11-08 11:07:11,962 WARNING No frames found for chunk Chunk 2: start=4.0 end=6.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4

2025-11-08 11:07:11,963 STATUS Chunk (Chunk 6: start=12.0 end=14.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decoded, frames=0

2025-11-08 11:07:11,964 INFO Processed chunk for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b, total chunks 10, processed chunks 1, chunk Chunk 6: start=12.0 end=14.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4,

2025-11-08 11:07:11,963 STATUS Chunk (Chunk 2: start=4.0 end=6.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decoded, frames=0

2025-11-08 11:07:11,964 INFO Processed chunk for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b, total chunks 10, processed chunks 2, chunk Chunk 2: start=4.0 end=6.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4,

2025-11-08 11:07:11,964 STATUS Chunk (Chunk 8: start=16.0 end=18.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decode starting

2025-11-08 11:07:11,965 PERF Decode execution time = 17.117 millisec

2025-11-08 11:07:11,965 STATUS Chunk (Chunk 9: start=18.0 end=19.833 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decode starting

2025-11-08 11:07:11,966 PERF Decode execution time = 18.724 millisec

2025-11-08 11:07:11,966 PERF Decode execution time = 18.579 millisec

2025-11-08 11:07:11,968 WARNING No frames found for chunk Chunk 7: start=14.0 end=16.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4

2025-11-08 11:07:11,968 STATUS Chunk (Chunk 7: start=14.0 end=16.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decoded, frames=0

2025-11-08 11:07:11,968 STATUS Chunk (Chunk 4: start=8.0 end=10.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decoded, frames=9

2025-11-08 11:07:11,969 WARNING No frames found for chunk Chunk 5: start=10.0 end=12.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4

2025-11-08 11:07:11,969 STATUS Chunk (Chunk 5: start=10.0 end=12.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decoded, frames=0

2025-11-08 11:07:11,969 INFO Processed chunk for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b, total chunks 10, processed chunks 3, chunk Chunk 7: start=14.0 end=16.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4,

2025-11-08 11:07:11,970 INFO Processed chunk for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b, total chunks 10, processed chunks 4, chunk Chunk 5: start=10.0 end=12.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4,

2025-11-08 11:07:11,971 PERF Decode execution time = 25.066 millisec

2025-11-08 11:07:11,971 PERF Decode execution time = 23.931 millisec

2025-11-08 11:07:11,971 STATUS Chunk (Chunk 0: start=0.0 end=2.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decoded, frames=16

2025-11-08 11:07:11,971 PERF Decode execution time = 26.712 millisec

2025-11-08 11:07:11,972 STATUS Chunk (Chunk 3: start=6.0 end=8.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decoded, frames=16

2025-11-08 11:07:11,972 STATUS Chunk (Chunk 1: start=2.0 end=4.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decoded, frames=16

2025-11-08 11:07:11,973 PERF Decode execution time = 4.966 millisec

2025-11-08 11:07:11,974 PERF Decode execution time = 4.832 millisec

2025-11-08 11:07:11,974 WARNING No frames found for chunk Chunk 8: start=16.0 end=18.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4

2025-11-08 11:07:11,974 WARNING No frames found for chunk Chunk 9: start=18.0 end=19.833 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4

2025-11-08 11:07:11,974 STATUS Chunk (Chunk 8: start=16.0 end=18.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decoded, frames=0

2025-11-08 11:07:11,974 STATUS Chunk (Chunk 9: start=18.0 end=19.833 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decoded, frames=0

2025-11-08 11:07:11,974 INFO Processed chunk for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b, total chunks 10, processed chunks 5, chunk Chunk 8: start=16.0 end=18.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4,

2025-11-08 11:07:11,975 INFO Processed chunk for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b, total chunks 10, processed chunks 6, chunk Chunk 9: start=18.0 end=19.833 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4,

2025-11-08 11:07:11,975 STATUS Generating VLM response for (Chunk 3: start=6.0 end=8.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4)

2025-11-08 11:07:11,977 STATUS Generating VLM response for (Chunk 4: start=8.0 end=10.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4)

2025-11-08 11:07:12,076 STATUS Generating VLM response for (Chunk 0: start=0.0 end=2.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4)

2025-11-08 11:07:12,170 STATUS Generating VLM response for (Chunk 1: start=2.0 end=4.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4)

2025-11-08 11:07:12,944 INFO Status for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b is processing, percent complete is 54.00, size of response list is 0

2025-11-08 11:07:13,944 INFO Status for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b is processing, percent complete is 54.00, size of response list is 0

2025-11-08 11:07:14,945 INFO Status for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b is processing, percent complete is 54.00, size of response list is 0

2025-11-08 11:07:15,665 PERF TRT generate execution time = 3.589 sec

2025-11-08 11:07:15,665 STATUS VLM response generated for (Chunk 4: start=8.0 end=10.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4), [

{

output

]

2025-11-08 11:07:15,666 INFO Processed chunk for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b, total chunks 10, processed chunks 7, chunk Chunk 4: start=8.0 end=10.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4,

2025-11-08 11:07:15,945 INFO Status for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b is processing, percent complete is 63.00, size of response list is 0

2025-11-08 11:07:16,002 PERF TRT generate execution time = 3.832 sec

2025-11-08 11:07:16,002 STATUS VLM response generated for (Chunk 3: start=6.0 end=8.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4), [

***

]

2025-11-08 11:07:16,003 INFO Processed chunk for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b, total chunks 10, processed chunks 8, chunk Chunk 3: start=6.0 end=8.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4,

2025-11-08 11:07:16,207 PERF TRT generate execution time = 3.407 sec

2025-11-08 11:07:16,208 STATUS VLM response generated for (Chunk 1: start=2.0 end=4.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4), [

***

]

2025-11-08 11:07:16,209 INFO Processed chunk for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b, total chunks 10, processed chunks 9, chunk Chunk 1: start=2.0 end=4.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4,

2025-11-08 11:07:16,945 INFO Status for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b is processing, percent complete is 81.00, size of response list is 0

2025-11-08 11:07:17,137 PERF TRT generate execution time = 4.543 sec

2025-11-08 11:07:17,138 STATUS VLM response generated for (Chunk 0: start=0.0 end=2.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4), [

***

]

2025-11-08 11:07:17,139 INFO Processed chunk for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b, total chunks 10, processed chunks 10, chunk Chunk 0: start=0.0 end=2.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4,

2025-11-08 11:07:17,139 INFO Processed all chunks for query b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b, VLM pipeline time 5.21 sec

2025-11-08 11:07:17,139 INFO Generating summary for request b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b

2025-11-08 11:07:17,140 PERF Chunk Processing - Filter and Sort execution time = 168.324 usec

2025-11-08 11:07:17,140 INFO Summary generated for video file request b2226f2e-ff7c-4aed-8f55-3ecc3fe1920b, total processing time - 5.21 seconds, summary

2025-11-08 11:07:17,141 PERF POST /generate_vlm_captions execution time = 5.226 sec
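
To see at a glance which chunks decoded no frames, the "decoded, frames=N" lines can be tallied with a short script. This is a minimal sketch that assumes the log line format shown above; `frames_per_chunk` and `empty_chunks` are hypothetical helper names, and the sample text is three lines copied from the console log.

```python
import re

# Matches lines such as:
#   ... STATUS Chunk (Chunk 7: start=14.0 end=16.0 file=/path) decoded, frames=0
DECODED_RE = re.compile(
    r"Chunk \(Chunk (\d+): start=([\d.]+) end=([\d.]+) file=.+?\) "
    r"decoded, frames=(\d+)"
)


def frames_per_chunk(log_text):
    """Map chunk index -> number of decoded frames reported in the log."""
    return {int(m.group(1)): int(m.group(4)) for m in DECODED_RE.finditer(log_text)}


def empty_chunks(log_text):
    """Return the sorted indices of chunks that decoded zero frames."""
    return sorted(i for i, n in frames_per_chunk(log_text).items() if n == 0)


sample = """\
via-server-1 | 2025-11-08 11:07:11,968 STATUS Chunk (Chunk 7: start=14.0 end=16.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decoded, frames=0
via-server-1 | 2025-11-08 11:07:11,968 STATUS Chunk (Chunk 4: start=8.0 end=10.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decoded, frames=9
via-server-1 | 2025-11-08 11:07:11,969 STATUS Chunk (Chunk 5: start=10.0 end=12.0 file=/tmp/assets/4146dd7a-8e8d-4a2e-84f5-f22e75da7556/video_11.mp4) decoded, frames=0
"""

counts = frames_per_chunk(sample)
missing = empty_chunks(sample)
print(counts)
print(missing)
```

In the full debug log above, chunks 0, 1, and 3 report 16 frames, chunk 4 reports 9, and chunks 2, 5, 6, 7, 8, and 9 report 0, which is why VLM responses were generated for only four of the ten chunks.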

Solved it! Thank you!