Thank you for all the amazing work with the Jetson containers and the Generative AI Lab. I have been playing around with llamaspeak these days and have come across an issue.
I have successfully installed the latest version of NVIDIA Riva (v2.13.0) and followed your documentation about llamaspeak. I am using an AGX Orin 32GB device flashed with JetPack 5.1.1 and using the model: https://huggingface.co/TheBloke/Llama-2-7b-Chat-GPTQ
I am using a PC on the same network as the Jetson and access the demo through a web browser on the PC. A mic and speaker are connected to the PC.
When I run the demo, it works well for a couple of minutes. However, after asking several questions, it stops getting input from the mic. To fix this, I need to restart the llamaspeak Docker container. However, even in this state, if I enter text in the textbox, it still accepts the input and gives a response.
Thanks @lakshantha.d, did you leave the mic muted for a while, or was this while you were speaking? Recently I found the Riva ASR request to have an idle timeout of 1000 seconds in the Triton server, so in the next version I will send keep-alive audio before that timeout expires.
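For reference, here is a rough sketch of the kind of keep-alive I mean - the function name, the `audio_queue` feeding the Riva streaming request generator, and the interval/chunk constants are all assumptions for illustration, not the actual llamaspeak code:

```python
import time
import numpy as np

SAMPLE_RATE = 16000          # assumed ASR sample rate for the streaming request
CHUNK_SAMPLES = 1600         # 100 ms of audio per keep-alive chunk
KEEPALIVE_INTERVAL = 900.0   # send keep-alive well before the ~1000s Triton idle timeout

last_audio_time = time.time()

def maybe_send_keepalive(audio_queue):
    """Push a short burst of near-silent audio if no real mic audio was sent recently."""
    global last_audio_time
    if time.time() - last_audio_time > KEEPALIVE_INTERVAL:
        # very low-amplitude noise rather than exact zeros, so it reads as normal silence
        noise = (np.random.randn(CHUNK_SAMPLES) * 2).astype(np.int16)
        audio_queue.put(noise.tobytes())
        last_audio_time = time.time()
```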
I have tested again. If I keep it unmuted, there are no issues for a while, but after leaving it on for longer, it enters a state where it doesn't register the voice input.
Also, just muting it for some time and then unmuting brings up the same issue.
OK yes - for now, just restart llamaspeak when that happens - you should not have to restart the whole Riva container. I think the ASR detects a difference when the audio samples are literally all zeros (muted) versus "silence" (close to zero).
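If you want to experiment while a fix lands, one thing to try is streaming very low-level noise while muted instead of exact zeros - roughly like this (the function name and `muted` flag are hypothetical, not llamaspeak's actual internals):

```python
import numpy as np

def outgoing_chunk(samples: np.ndarray, muted: bool) -> bytes:
    """Return the PCM bytes to stream to Riva ASR for this mic chunk.

    While muted, substitute very low-amplitude noise for the all-zero samples,
    so the stream still resembles natural 'silence' to the ASR.
    """
    if muted:
        samples = np.random.randn(len(samples)) * 2
    return samples.astype(np.int16).tobytes()
```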