Integrate voice commands so the robot can reply by voice

Dear All,

I have completed the demo with JetBot as instructed.

How can I integrate voice commands so that the robot can reply back by voice?

I am using the Jetson Nano Developer Kit.

Thanks for your help.

This is complicated. I have actually come pretty close to that, but your first step is to get sound working. See the threads related to sound processing over I2S, which requires a new dtb.

For speaking back:
The easiest way to get a Linux machine like the Nano to speak is to install "espeak":

sudo apt-get install espeak
You can then run a shell command:
"espeak 'hello world'" and it speaks. It is, however, quite robotic.

A better system for speech is Festival, especially if you go through the trouble to download additional high-quality voices from various research centers.

sudo apt-get install festival
You can then:
festival -b '(SayText "hello world")'
or perhaps:
echo "Hello World" | festival --tts

You can then call these commands from a Python script using the "subprocess" module, or use the libraries/servers that come with the packages to call them directly from C. Check the open-source sites and documentation for those packages for more details.
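As a minimal sketch of the subprocess approach (assuming `espeak` and/or `festival` are on your PATH; the helper names here are my own, not from either package):

```python
import shutil
import subprocess

def festival_saytext(text):
    """Build the Scheme expression Festival evaluates to speak the text."""
    # Escape double quotes so the Scheme string stays well-formed.
    return '(SayText "%s")' % text.replace('"', '\\"')

def speak(text):
    """Speak text with Festival if installed, otherwise fall back to espeak."""
    if shutil.which("festival"):
        # Equivalent to: echo "text" | festival --tts
        subprocess.run(["festival", "--tts"], input=text.encode(), check=True)
    elif shutil.which("espeak"):
        subprocess.run(["espeak", text], check=True)
    else:
        raise RuntimeError("neither festival nor espeak is installed")
```

On the robot you would just call `speak("hello world")` from whatever script drives the JetBot.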

Voice recognition is harder, because the good models (Google Assistant, Amazon Alexa, Apple Siri, Microsoft Cortana) aren't publicly available outside their respective ecosystems.
Those models all benefit from previous research, of which the "Sphinx" system is probably the best documented.
You could download the software and an appropriate language model from Carnegie Mellon:
This will let you build tools and libraries that you can, in turn, use from your own program.
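As a hedged sketch of "use it from your own program": the keyword-to-intent routing below is plain Python and runnable anywhere; the commented-out part assumes CMU's `pocketsphinx` Python bindings (package name and `LiveSpeech` API are an assumption about your setup, so verify against the pocketsphinx docs):

```python
def match_intent(phrase, intents):
    """Return the first intent whose keywords all appear in the phrase."""
    words = set(phrase.lower().split())
    for intent, keywords in intents.items():
        if set(keywords) <= words:
            return intent
    return None

INTENTS = {
    "move_forward": ["go", "forward"],
    "stop": ["stop"],
    "greet": ["hello"],
}

# Assumed pocketsphinx usage (needs the package, a language model, and a mic):
# from pocketsphinx import LiveSpeech
# for phrase in LiveSpeech():
#     print(match_intent(str(phrase), INTENTS))
```

The idea is that the recognizer only has to produce rough text; your own code decides what the robot does with it.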

@snarky Have you tested adding high-quality voices to Festival?

I tried it last night on the Nano, and it didn’t work. For some reason festival wouldn’t recognize the voices I installed (all the CMU Arctic ones). Then I tried essentially the same thing on a desktop machine, and it worked fine.

I was going to try again on the Nano (using just one CMU voice, like I did on my PC), but then I came across espeak. For my needs I actually kinda like its British Female voice (the F3 one).

It works great on the Nano, and it seems highly configurable.
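For reference, the configurability I mean is espeak's command-line flags; here is a small sketch that builds the command for the F3 variant (the specific speed/pitch values are just my own taste, not anything official):

```python
# Hedged sketch: build an espeak command line for the British female F3 variant.
# Flag meanings: -v voice, -s speed (words per minute), -p pitch (0-99).
def espeak_cmd(text, voice="en+f3", speed=150, pitch=60):
    return ["espeak", "-v", voice, "-s", str(speed), "-p", str(pitch), text]

print(" ".join(espeak_cmd("Hello from the robot")))
# On the Nano, run it with: subprocess.run(espeak_cmd("..."), check=True)
```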

For the audio I used the Adafruit speaker bonnet. It plugs right onto the Nano, and the dtb from the following instructions worked fine.

I copied the dtb over AFTER the SDK OS image build process finished, right before the image was written. I tested it with JetPack 4.2; I haven't tested the latest, JetPack 4.2.1.

Here is a link to the speaker bonnet I used

I went with the 3W 4 Ohm speaker set, but it's a little quiet even at 100%, so I might have to make some changes. I really want to use a much higher-powered "glass" speaker that would look really cool and is waterproof, but it needs more power.

I haven’t decided if I’ll have voice recognition.

I'm going to play around with Hey-Jetson, but I don't know how well it works on the Nano. It seems like it was mostly tested on the TX2, which is faster, and it doesn't mention the TX1 at all. The Nano is roughly half a TX1.

The Snips NLU platform might be interesting for your purposes; I personally tried the home assistant app.
You can create your own app so that the robot replies according to your intent, etc.
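If you go the Snips route, intents arrive as JSON messages over MQTT; the payload shape below is an assumption from memory (check it against what your Snips console actually publishes), but parsing it is plain standard-library Python:

```python
import json

def parse_intent(payload):
    """Pull the intent name and slot values out of a hermes-style message."""
    msg = json.loads(payload)
    name = msg["intent"]["intentName"]
    slots = {s["slotName"]: s["value"]["value"] for s in msg.get("slots", [])}
    return name, slots

# Example payload (shape assumed; verify against your own Snips output).
example = json.dumps({
    "intent": {"intentName": "MoveRobot"},
    "slots": [{"slotName": "direction", "value": {"value": "forward"}}],
})
print(parse_intent(example))
```

You would hook a function like this up to your MQTT client's message callback and dispatch to robot actions from the returned intent name.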

Install Snips on Jetson:

Let me know if that works out for you!

Festival is sensitive to the path where you put the voices. I’ve used custom voices with Festival in multiple versions, and have had to move the voices around to get them to be recognized. I haven’t used Festival on the Nano, though. I’ve used it with the Xavier on an earlier Jetpack, and on other Ubuntu and even Arch Linux systems.
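Since the path is the usual culprit, here is a small helper to see which voices are actually sitting where Festival can find them; the directories listed are the usual Debian/Ubuntu locations (an assumption, adjust for your install prefix):

```python
import os

# Usual Festival voice locations on Debian/Ubuntu (an assumption).
CANDIDATE_DIRS = [
    "/usr/share/festival/voices",
    "/usr/local/share/festival/voices",
]

def find_voices(extra_dirs=()):
    """Return (voice_name, path) pairs under <dir>/<language>/<voice>."""
    found = []
    for base in list(CANDIDATE_DIRS) + list(extra_dirs):
        if not os.path.isdir(base):
            continue
        for lang in sorted(os.listdir(base)):
            lang_dir = os.path.join(base, lang)
            if not os.path.isdir(lang_dir):
                continue
            for voice in sorted(os.listdir(lang_dir)):
                found.append((voice, os.path.join(lang_dir, voice)))
    return found

for name, path in find_voices():
    print(name, path)
```

Inside Festival itself, I believe `(voice.list)` will show which voices it recognized, which is a good cross-check against what this script finds on disk.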

Thank you so much!

I will try it today.