Local deployment error of SpeechSquad sample

I followed the steps in the SpeechSquad — NVIDIA Jarvis Speech Skills v1.0.0-b.3 documentation, but got the following errors from the server side:

I0508 13:36:26.292239 1 resources.cc:62] jarvis asr connection established to 0.0.0.0:50051
I0508 13:36:26.292706 1 resources.cc:63] jarvis nlp connection established to 0.0.0.0:50051
I0508 13:36:26.292716 1 resources.cc:64] jarvis tts connection established to 0.0.0.0:50051
I0508 13:36:26.315240 1 server.cc:102] grpc server and event loop initialized and accepting connections
E0508 13:38:21.191830 9 context.cc:172] asr error detected - issuing cancellation on squad stream
E0508 13:38:21.195380 10 context.cc:172] asr error detected - issuing cancellation on squad stream
E0508 13:38:21.195479 13 context.cc:172] asr error detected - issuing cancellation on squad stream
E0508 13:38:21.195513 11 context.cc:172] asr error detected - issuing cancellation on squad stream
E0508 13:38:21.195511 12 context.cc:172] asr error detected - issuing cancellation on squad stream

What could be the reason?

Hi @jackhe
Could you please share the log files and system info so we can help better?

Thanks

Hey @SunilJB. I get the same error that @Jackhe saw when following those steps. I was trying to get some idea of Jarvis scaling using a Tesla T4. I saw in the notes that SpeechSquad is not able to get latency info from the Jarvis server; I will keep watching to see when this is supported.

Do you know where I could get some ranges on the number of conversations (2 audio channels each) I can expect using just ASR, ASR/NLP, or ASR/NLP/TTS, without web framework constraints? The plan is to use only native Python, probably with aiortc to support WebRTC audio sources.

I ran nvidia-bug-report.sh and attached a text file with the server and client output:
nvidia-bug-report.log.gz (744.5 KB)
speechsquad_T4.txt (3.6 KB)
If there is anything else I should gather, let me know. So far Jarvis is great!

jarvis-speech:1.1.0-beta-server
speech_squad:1.0.0-b.1

Hi @rmcinnis1 ,

It seems SpeechSquad by default uses “quartznet-asr-trt-ensemble-vad-streaming”.

Could you please check whether the Jarvis server has the QuartzNet model running?
Otherwise, you can pass an argument like -asr_model_name jasper-asr-trt-ensemble-vad-streaming (along with your other arguments, e.g. -tts_service_url) to point SpeechSquad at the Jasper ASR model. For example:
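
A sketch only, not the exact commands: the "jarvis-speech" container name and the "successfully loaded" log text assume the quickstart defaults, and the SpeechSquad server command below is a placeholder for whatever you use to launch it. The -asr_model_name, -tts_service_url, and 0.0.0.0:50051 values are the ones discussed in this thread.

# Check which ASR models the Jarvis server actually loaded
# (container name assumes the quickstart default "jarvis-speech"):
sudo docker logs jarvis-speech 2>&1 | grep -i "successfully loaded"

# If quartznet-asr-trt-ensemble-vad-streaming is not listed, point SpeechSquad
# at the Jasper model by appending the flag to your server command, e.g.:
#   <your speechsquad server command> \
#       -asr_model_name jasper-asr-trt-ensemble-vad-streaming \
#       -tts_service_url 0.0.0.0:50051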

Thanks

Thanks, @SunilJB! I checked config.sh for the speech server and QuartzNet was commented out, so I included the option you provided and now I get results, including latency. This might help me get data points on scaling for the project. I have a lot of learning to do, though.
ubuntu@ip-172-31-27-91:~/nvidia/speechsquad$ sudo docker run -it --net=host -v $(pwd)/speechsquad_sample_public_v1:/work/test_files/speech_squad/ nvcr.io/nvidia/jarvis/speech_squad:1.0.0-b.1 speechsquad_perf_client --squad_questions_json=/work/test_files/speech_squad/recorded_questions.jl --squad_dataset_json=/work/test_files/speech_squad/manifest.json --speech_squad_uri=0.0.0.0:1337 --chunk_duration_ms=800 --executor_count=1 --num_iterations=1 --num_parallel_requests=64 --print_results=false
Loading eval dataset…
Done loading 5 files for process 0
Generating load…
…Waiting for all responses…

Done with measurements
Generating Statistics Report…
================ Process 0================




tracing.speech_squad.asr_latency (ms):
Median 90th 95th 99th Avg
250.15 350.82 350.82 350.82 224.57

tracing.speech_squad.nlp_latency (ms):
Median 90th 95th 99th Avg
9.427 114.33 114.33 114.33 31.705

tracing.speech_squad.tts_latency (ms):
Median 90th 95th 99th Avg
128.33 291.36 291.36 291.36 150.93

Client Latency (ms):
Median 90th 95th 99th Avg
439.66 515.62 515.62 515.62 408.55
================ Final Report ================
Run time: 4.8552 sec.
Total audio processed: 17.811 sec.
Throughput: 3.6683 RTFX
Number of failed audio clips: 0
Average Latencies ====>
Client Latency:408.55 ms
tracing.server_latency.natural_query:0 ms
tracing.server_latency.speech_synthesis:0 ms
tracing.server_latency.streaming_recognition:0 ms
tracing.speech_squad.asr_latency:224.57 ms
tracing.speech_squad.nlp_latency:31.705 ms
tracing.speech_squad.tts_latency:150.93 ms
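
For reference, the Throughput figure is just Total audio processed divided by Run time, i.e. the real-time factor; the small difference from 3.6683 is rounding of the printed values:

# RTFX = seconds of audio processed per second of wall-clock time
echo "scale=4; 17.811 / 4.8552" | bc   # prints 3.6684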
