Hardware:
GPU: T4 (16 GB) on an AWS EC2 instance
CPU: Intel® Xeon® Scalable (Cascade Lake), 4 vCPUs
Operating System:
Ubuntu 18.04 LTS with the NVIDIA DL AMI
Riva Version:
v1.5.0-beta
How to reproduce the issue?
- Set up and start Riva with the Citrinet offline and streaming models enabled in config.sh (in my case via the Quickstart, on an EC2 instance).
- Run the Riva client on the same instance with bash riva_start_client.sh.
- Copy an audio file larger than 4 MiB into the riva-client container and try to process it with the streaming model. This fails with a gRPC error stating that the received message is larger than the maximum message size. Here's some output:
>riva_asr_client --max_alternatives=0 --audio-file=./wav/working_files/longform_test.wav --print_transcripts=true
Loading eval dataset...
filename: /work/wav/working_files/longform_test.wav
Done loading 1 files
RPC failed: Received message larger than max (150952067 vs. 4194304)
Done processing 1 responses
Some requests failed to complete properly, not printing performance stats
150952067 bytes ≈ 144 MiB
4194304 bytes = 4 MiB
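For reference, the unit conversion can be sanity-checked with a throwaway snippet (nothing Riva-specific here):

```python
# Convert the byte counts from the gRPC error message to MiB.
MIB = 1024 * 1024  # 1 MiB = 2**20 bytes

received = 150952067  # size of the rejected message, per the error
limit = 4194304       # gRPC's default maximum message size

print(f"received: {received / MIB:.2f} MiB")  # ~143.96 MiB
print(f"limit:    {limit // MIB} MiB")        # 4 MiB
```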
A few considerations and things that I've tried: it seems you can pass an options list when creating the gRPC channel client-side, and the server can decide whether to respect those message-size settings. I tried this in riva_quickstart/examples/transcribe_file_offline.py by setting the options where the channel to the --server endpoint is created:
channel = grpc.insecure_channel(
    args.server,
    options=[
        ("grpc.max_receive_message_length", 1024 * 1024 * 1024),
        ("grpc.max_send_message_length", 1024 * 1024 * 1024),
    ],
)
However, the server still responds with a gRPC error (Error received from peer ipv6:[::1]:50051), and the following stack trace is printed:
>python3 transcribe_file_offline.py --audio-file=/work/wav/working_files/longform_test.wav
Traceback (most recent call last):
File "transcribe_file_offline.py", line 62, in <module>
response = client.Recognize(request)
File "/usr/local/lib/python3.8/dist-packages/grpc/_channel.py", line 946, in __call__
return _end_unary_response_blocking(state, call, False, None)
File "/usr/local/lib/python3.8/dist-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.RESOURCE_EXHAUSTED
details = "Received message larger than max (150952038 vs. 4194304)"
debug_error_string = "{"created":"@1631874680.490127888","description":"Error received from peer ipv6:[::1]:50051","file":"src/core/lib/surface/call.cc","file_line":1069,"grpc_message":"Received message larger than max (150952038 vs. 4194304)","grpc_status":8}"
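The RESOURCE_EXHAUSTED status comes from the peer, i.e. the receive limit is enforced by whichever side receives the message, so the client's channel options can't raise the server's cap. To illustrate the mechanism in plain Python gRPC (this is a hypothetical standalone server, not the actual Riva server, which as far as I can tell doesn't expose these options in the Quickstart config), the options would have to be set when the server itself is built:

```python
from concurrent import futures

import grpc

ONE_GIB = 1024 * 1024 * 1024

# A large request only gets through if the *server* is built with a
# raised receive cap; the client's channel options don't affect it.
server = grpc.server(
    futures.ThreadPoolExecutor(max_workers=4),
    options=[
        ("grpc.max_receive_message_length", ONE_GIB),
        ("grpc.max_send_message_length", ONE_GIB),
    ],
)
port = server.add_insecure_port("localhost:0")  # port 0: pick any free port
```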
IMO this breaks inference on long-form audio. Can we expect this to be fixed or made configurable?
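For what it's worth, the client-side workaround I'm considering is splitting the audio into pieces that stay under the limit and sending each piece as its own request, at the cost of losing context at the boundaries. A rough sketch using only the stdlib wave module; the chunk size and function name are my own, nothing Riva-specific:

```python
import wave


def split_wav(path, max_bytes=3 * 1024 * 1024):
    """Yield (params, frames) pieces of a WAV file, each holding under
    max_bytes of raw audio, so every request stays below gRPC's default."""
    with wave.open(path, "rb") as wf:
        params = wf.getparams()
        bytes_per_frame = params.nchannels * params.sampwidth
        # Round down to a whole number of frames per chunk.
        frames_per_chunk = max(1, max_bytes // bytes_per_frame)
        while True:
            frames = wf.readframes(frames_per_chunk)
            if not frames:
                break
            yield params, frames
```

Each yielded chunk could then be written back out as a small WAV (or fed to the streaming API directly) and transcribed independently.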