Offline/Batch broken on 1.8b due to 900s limit

ShantanuNair · December 22, 2021, 8:57am

Riva v1.8b
AWS g4dn.xlarge T4 16GiB

I'm curious what exactly changed from “streaming offline mode” to “true offline mode” in the 1.8b release, and why. Can I recreate the previous offline mode (prior to v1.8 beta) by simply using the streaming inference model with similar chunk sizes?

Is this change in batch processing documented? I did not see it in the release notes.

@rleary Could you offer some insight, please? Thanks to your previous assistance we were able to get really nice long form transcriptions with the increased gRPC message sizes and the older “streaming offline mode”.

ShantanuNair · December 22, 2021, 10:04am

Relevant Logs:

 09:58:40.630826   346 grpc_riva_asr.cc:447] ASRService.Recognize called.
 09:58:40.660948   346 riva_asr_stream.cc:213] Detected format: encoding = 1 numchannels = 1 samplerate = 16000 bitspersample = 16
 09:58:40.662431   346 grpc_riva_asr.cc:519] ASRService.Recognize performing streaming recognition with sequence id: 31625700
 09:58:40.662487   346 grpc_riva_asr.cc:537] Using model citrinet-1024-en-US-asr-offline for inference
 09:58:40.662544   346 grpc_riva_asr.cc:552] Model sample rate= 16000 for inference
 09:58:40.662636   346 grpc_riva_asr.cc:583] Error: Audio duration (1376.962524s) is longer than maximum supported audio duration in offline mode (900.0s)

Notice also that the log here says "ASRService.Recognize performing streaming recognition with sequence id: 31625700".

If I understand the change, that should now say offline recognition, correct?
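
For anyone hitting the same error, a client-side check before calling Recognize can catch over-long files early. This is only a minimal sketch using Python's standard wave module; the 900 s value is taken from the error message above, and the function names are hypothetical, not part of any Riva client.

```python
import wave

# Limit reported by the server in the log above (assumed to be the default).
MAX_OFFLINE_SECONDS = 900.0


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def fits_offline_limit(path, limit=MAX_OFFLINE_SECONDS):
    """True if the file is short enough for one offline request."""
    return wav_duration_seconds(path) <= limit
```

With the file from the log (about 1377 s), fits_offline_limit would return False, so the audio would need to be split or the model rebuilt with a larger chunk size.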

rleary · December 22, 2021, 3:07pm

Hi @ShantanuNair. Sorry for the trouble this has caused you - I agree we can make the release notes more clear about this change. I’ll get that updated. Good catch on the log message as well.

We did, indeed, change the behavior of the offline API in this release. With the Jasper/QuartzNet/CitriNet model family, there is always at least some accuracy degradation when performing streaming inference (including the previous ‘streaming offline’ implementation). In some cases after fine-tuning, this degradation can be drastic - we continue to research this behavior. By switching to true batch processing, we recover this lost accuracy. In a future release, we will also be able to increase the throughput of the offline endpoint due to these changes.

Depending on your deployment configuration (number of models, GPU memory), you may be able to increase the maximum input size by regenerating the RMIR and re-deploying. We documented the procedure to replicate our model deployments in this release: Riva — NVIDIA Riva. Note the CitriNet Offline portion of the table, specifically the --chunk_size=900, --left_padding_size=0. and --right_padding_size=0. flags. You can modify the chunk_size to meet your requirements, assuming the model will continue to fit in memory. We will be looking into opportunities to minimize the disruption in future releases as well (e.g. running VAD and splitting the input audio if necessary).

Hope this helps. Please let me know if there’s any other information we can provide to assist you with your deployment.
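
Until something like the input splitting mentioned above is available server-side, one possible client-side workaround is to cut long audio into pieces that fit under the limit and send each piece as its own offline request. The sketch below uses only Python's standard wave module; split_wav and its parameters are hypothetical names, and it cuts at fixed offsets rather than at silences, so words at a boundary may be split.

```python
import wave


def split_wav(path, out_prefix, max_seconds=900.0):
    """Write consecutive pieces of `path`, each at most `max_seconds` long.

    Returns the list of files written. Cuts are made at fixed offsets,
    not at pauses detected by VAD.
    """
    out_paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(max_seconds * src.getframerate())
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            out_path = f"{out_prefix}_{index:04d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Copy channel count, sample width and rate from the source;
                # the frame count in the header is corrected on close.
                dst.setparams(params)
                dst.writeframes(frames)
            out_paths.append(out_path)
            index += 1
    return out_paths
```

For the 1377 s file in this thread, the default max_seconds would produce two files, one of 900 s and one of about 477 s. The transcripts would then have to be joined by the caller, and accuracy near the cut point may suffer.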

ShantanuNair · December 28, 2021, 11:05am

@rleary thank you! I followed up here: Rebuilding the asrset3 citrinet offline pipeline but with larger chunk size
