Jetson AI Lab - Home Assistant Integration

It seems that Piper does in fact use the GPU correctly:

I’ve actually managed to launch an automation.

Edit:
OK, I realise that the automation is triggered by the activation of the assist-microphone and not by the content of my voice, so each time I call Nabu now it turns the light off or on. My sons found it funny, but we’re not there yet :)

Hi @notben,

Thank you for trying this port! I’m happy to hear it worked on your device. Great job! If you have any questions or suggestions, feel free to reach out here. If you are available, you can also join our next Jetson AI Research Lab meeting on May 15th. There, you can stay updated on the progress of this project or simply share your experiences.

Updates

I’m revisiting my previous experiments with the Home Assistant Supervisor container to simplify the onboarding process for Home Assistant and Wyoming add-ons on Jetson devices. This effort is based on insights from the latest Jetson AI Research Group meeting.

You can track progress on this PR; it may take a while, so stay tuned! ;)

Hello @narandill,

Do you have any clue why Piper works from the web interface when testing in the HA config, but when it answers through the whole pipeline with Wyoming, the voice plays back super fast and is unbearable? All I can see in the log is that it does not catch what I’m saying and respond to it. I’ve tried changing the voice, the volume level, etc., but the effect is always the same.

Edit:
Thanks for the invite, I’ll see if I can make it. It seems very interesting indeed.

@notben Let’s debug the issue step by step. Could you please provide the following information to start:

  1. How are you running the Home Assistant and Wyoming containers? – Are you using the docker-compose.yaml file from the jetson-containers repository, or are you using a different setup?
  2. Audio Device configuration – What value did you assign to the SATELLITE_AUDIO_DEVICE environment variable in the wyoming-assist-microphone container?
    The default value from the mentioned docker-compose.yaml file is plughw:CARD=S330,DEV=0, but this may not be suitable for your hardware configuration. If you configured the variable according to this tutorial, the audio distortions should not occur.
  3. Are you running the containers in a virtual machine? There may be untested issues with running them in a virtualized environment.
  1. → yes

root@jetson:/docker/compose# docker compose images
CONTAINER           REPOSITORY                          TAG                IMAGE ID       SIZE
assist-microphone   dustynv/wyoming-assist-microphone   r36.2.0            a8f11b772802   1.08GB
faster-whisper      dustynv/wyoming-whisper             r36.2.0            0869f969c10b   10GB
home-assistant      dustynv/homeassistant-core          2024.4.2-r36.2.0   46533b619629   3.18GB
openwakeword        dustynv/wyoming-openwakeword        r36.2.0            a63c1a827660   993MB
piper-tts           dustynv/wyoming-piper               r36.2.0            619a537fc0bc   17.4GB

2)
I followed the docs for the assist-microphone container, where I could list my audio devices, and then set it like this:

AUDIO_DEVICE: "plughw:CARD=USB,DEV=0"

The value exists only in that container, and it seems to work: when I say “ok nabu”, it is detected, and I get a little beep back.

I’m using a Jabra USB device that looks perfectly supported on the Linux side.

I think the issue lies more with piper-tts, where I have the following config (the default from the compose file):

piper-tts:
  image: dustynv/wyoming-piper:r36.2.0
  restart: unless-stopped
  network_mode: host
  runtime: nvidia
  container_name: piper-tts
  hostname: piper-tts
  init: false
  ports:
    - "10200:10200/tcp"
  devices:
    - /dev/snd:/dev/snd
    - /dev/bus/usb
  volumes:
    - /etc/localtime:/etc/localtime:ro
    - /etc/timezone:/etc/timezone:ro
  stdin_open: true
  tty: true

As I’ll be away for the weekend, there is no need to hurry with the troubleshooting, but I would love to find out what is going on. I can forward whatever logs you need, or even do a quick sharing session if you’re interested.

3. All of this runs on my Jetson Orin 3011, with no VM in the middle.

@notben, please take another look at the Determine Audio Devices section in the docs. Maybe you started with the older version, but about a week ago I updated the docs to the current state, where you configure your audio device using the SATELLITE_AUDIO_DEVICE environment variable, not AUDIO_DEVICE. It’s also worth choosing the device described as “Hardware device with all software conversions” ;)

Edit: I checked, and the compose example was a bit outdated. I created a PR to fix the docker-compose.yaml example.
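
For readers following along, picking the "Hardware device with all software conversions" entry can be sketched in Python. This is a minimal sketch, assuming the usual `arecord -L` listing layout (device name at the start of a line, indented description lines beneath it). The helper names `plughw_devices` and `list_capture_devices` and the sample listing are hypothetical and not taken from the docs.

```python
import subprocess

# Description text that `arecord -L` prints under plughw entries.
PLUG_DESCRIPTION = "Hardware device with all software conversions"


def plughw_devices(listing: str) -> list[str]:
    """Return device names whose description marks them as plug devices."""
    devices = []
    current = None
    for line in listing.splitlines():
        if not line.strip():
            continue
        if line[0].isspace():
            # Indented lines describe the most recently seen device name.
            if current and line.strip() == PLUG_DESCRIPTION and current not in devices:
                devices.append(current)
        else:
            current = line.strip()
    return devices


def list_capture_devices() -> list[str]:
    """Run `arecord -L` (requires alsa-utils) and filter its output."""
    result = subprocess.run(
        ["arecord", "-L"], capture_output=True, text=True, check=True
    )
    return plughw_devices(result.stdout)


# Illustrative listing only; real output depends on the attached hardware.
SAMPLE = """\
null
    Discard all samples (playback) or generate zero samples (capture)
hw:CARD=USB,DEV=0
    Example USB Speakerphone, USB Audio
    Direct hardware device without any conversions
plughw:CARD=USB,DEV=0
    Example USB Speakerphone, USB Audio
    Hardware device with all software conversions
"""

print(plughw_devices(SAMPLE))  # ['plughw:CARD=USB,DEV=0']
```

A name returned this way is the kind of value that would go into SATELLITE_AUDIO_DEVICE. On the Jetson itself, `list_capture_devices()` would run the real command instead of parsing the sample.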

Hello,
This time I got the config right:


assist-microphone:
  image: dustynv/wyoming-assist-microphone:latest-r36.2.0
  restart: unless-stopped
  network_mode: host
  container_name: assist-microphone
  hostname: assist-microphone
  runtime: nvidia
  init: false
  ports:
    - "10700:10700/tcp"
  devices:
    - /dev/snd:/dev/snd
    - /dev/bus/usb
  volumes:
    - assist_microphone_share:/share
    - /etc/localtime:/etc/localtime:ro
    - /etc/timezone:/etc/timezone:ro
  environment:
    SATELLITE_AUDIO_DEVICE: "plughw:CARD=USB,DEV=0"
    SATELLITE_SND_VOLUME_MULTIPLIER: 0.3
    WAKEWORD_NAME: ok nabu
    ASSIST_PIPELINE_NAME: "Home Assistant"


And I can see in the log that it looks OK and is taken into account:


assist-microphone | DEBUG:root:Namespace(mic_uri=None, mic_command='arecord -D plughw:CARD=USB,DEV=0 -r 16000 -c 1 -f S16_LE -t raw', mic_command_rate=16000, mic_command_width=2, mic_command_channels=1, mic_command_samples_per_chunk=1024, mic_volume_multiplier=1.0, mic_noise_suppression=0, mic_auto_gain=0, mic_seconds_to_mute_after_awake_wav=0.5, mic_no_mute_during_awake_wav=False, mic_channel_index=None, snd_uri=None, snd_command='aplay -D plughw:CARD=USB,DEV=0 -r 16000 -c 1 -f S16_LE -t raw', snd_command_rate=16000, snd_command_width=2, snd_command_channels=1, snd_volume_multiplier=0.3, wake_uri='tcp://127.0.0.1:10400', wake_word_name=[['ok nabu', 'Home Assistant']], wake_command=None, wake_command_rate=16000, wake_command_width=2, wake_command_channels=1, wake_refractory_seconds=5.0, vad=False, vad_threshold=0.5, vad_trigger_level=1, vad_buffer_seconds=2, vad_wake_word_timeout=5.0, event_uri=None, startup_command=None, detect_command=None, detection_command=None, transcript_command=None, stt_start_command=None, stt_stop_command=None, synthesize_command=None, tts_start_command=None, tts_stop_command=None, tts_played_command=None, streaming_start_command=None, streaming_stop_command=None, error_command=None, connected_command=None, disconnected_command=None, awake_wav='/usr/src/sounds/awake.wav', done_wav='/usr/src/sounds/done.wav', uri='tcp://0.0.0.0:10700', name='assist microphone', area=None, no_zeroconf=True, zeroconf_name=None, zeroconf_host=None, debug_recording_dir=None, debug=True, log_format='%(levelname)s:%(name)s:%(message)s')
assist-microphone | INFO:root:Ready
assist-microphone | DEBUG:root:Connecting to mic service: ['arecord', '-D', 'plughw:CARD=USB,DEV=0', '-r', '16000', '-c', '1', '-f', 'S16_LE', '-t', 'raw']
assist-microphone | DEBUG:root:Connecting to snd service: ['aplay', '-D', 'plughw:CARD=USB,DEV=0', '-r', '16000', '-c', '1', '-f', 'S16_LE', '-t', 'raw']
assist-microphone | DEBUG:root:Connecting to wake service: tcp://127.0.0.1:10400
assist-microphone | INFO:root:Connected to services
assist-microphone | Recording raw data 'stdin' : Signed 16 bit Little Endian, Rate 16000 Hz, Mono
faster-whisper | s6-rc: info: service legacy-cont-init successfully started
faster-whisper | s6-rc: info: service whisper: starting
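
One way to double-check that the setting was picked up is to pull the `-D` argument out of the `mic_command` and `snd_command` strings shown in the Namespace dump and compare it with the configured value. A minimal sketch: `alsa_device` is a hypothetical helper, and it only handles the separate `-D <name>` form that appears in this log.

```python
import shlex


def alsa_device(command: str):
    """Return the value following -D in an arecord/aplay command, or None."""
    args = shlex.split(command)
    for index, arg in enumerate(args[:-1]):
        if arg == "-D":
            return args[index + 1]
    return None


# Command strings copied from the Namespace dump in the log.
mic_command = "arecord -D plughw:CARD=USB,DEV=0 -r 16000 -c 1 -f S16_LE -t raw"
snd_command = "aplay -D plughw:CARD=USB,DEV=0 -r 16000 -c 1 -f S16_LE -t raw"

# Value assigned to SATELLITE_AUDIO_DEVICE in the compose file.
expected = "plughw:CARD=USB,DEV=0"

print(alsa_device(mic_command) == expected)  # True
print(alsa_device(snd_command) == expected)  # True
```

If either comparison came out False, the container would still be using a different device than the one configured.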

But then there are some errors that I’m not sure are serious:


assist-microphone | ERROR:root:Unexpected error in wake read task
assist-microphone | Traceback (most recent call last):
assist-microphone |   File "/usr/local/lib/python3.11/dist-packages/wyoming_satellite/satellite.py", line 711, in _wake_task_proc
assist-microphone |     await wake_client.connect()
assist-microphone |   File "/usr/local/lib/python3.11/dist-packages/wyoming/client.py", line 73, in connect
assist-microphone |     self._reader, self._writer = await asyncio.open_connection(
assist-microphone |                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
assist-microphone |   File "/usr/lib/python3.11/asyncio/streams.py", line 47, in open_connection
assist-microphone |     transport, _ = await loop.create_connection(
assist-microphone |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
assist-microphone |   File "/usr/lib/python3.11/asyncio/base_events.py", line 1073, in create_connection
assist-microphone |     raise exceptions[0]
assist-microphone |   File "/usr/lib/python3.11/asyncio/base_events.py", line 1058, in create_connection
assist-microphone |     sock = await self._connect_sock(
assist-microphone |            ^^^^^^^^^^^^^^^^^^^^^^^^^
assist-microphone |   File "/usr/lib/python3.11/asyncio/base_events.py", line 964, in _connect_sock
assist-microphone |     await self.sock_connect(sock, address)
assist-microphone |   File "/usr/lib/python3.11/asyncio/selector_events.py", line 633, in sock_connect
assist-microphone |     return await fut
assist-microphone |            ^^^^^^^^^
assist-microphone |   File "/usr/lib/python3.11/asyncio/selector_events.py", line 668, in _sock_connect_cb
assist-microphone |     raise OSError(err, f'Connect call failed {address}')
assist-microphone | ConnectionRefusedError: [Errno 111] Connect call failed ('127.0.0.1', 10400)
assist-microphone | DEBUG:root:Connected to mic service
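
The `ConnectionRefusedError` above means nothing was accepting connections on 127.0.0.1:10400 at the moment the satellite tried to reach the wake word service. One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that the openwakeword container had not finished starting yet. A small sketch for checking when the port becomes reachable follows; `wait_for_port` is a hypothetical helper and not part of Wyoming or the containers.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 1.0) -> bool:
    """Poll a TCP port until it accepts a connection or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # `interval` doubles as the per-attempt connect timeout.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # ConnectionRefusedError (Errno 111) and timeouts are OSError subclasses.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example call for the wake service address from the log:
#   wait_for_port("127.0.0.1", 10400)
```

If the port only becomes reachable a few seconds after the satellite starts, that would fit the startup-order assumption. A port that never opens would point elsewhere.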


Nevertheless, the “ok nabu” wake word is recognised and it triggers the process:


openwakeword | DEBUG:wyoming_openwakeword.handler:Client connected: 3656324960570
openwakeword | DEBUG:wyoming_openwakeword.handler:Sent info to client: 3656324960570
openwakeword | DEBUG:wyoming_openwakeword.handler:Client disconnected: 3656324960570
openwakeword | DEBUG:root:Triggered ok_nabu_v0.1 (client=3597130893518)
assist-microphone | DEBUG:root:Detection(name='ok_nabu_v0.1', timestamp=3681256823246)
assist-microphone | DEBUG:root:Streaming audio
assist-microphone | DEBUG:root:Event(type='run-pipeline', data={'start_stage': 'asr', 'end_stage': 'tts', 'restart_on_end': False, 'name': 'Home Assistant', 'snd_format': {'rate': 16000, 'width': 2, 'channels': 1}}, payload=None)
assist-microphone | DEBUG:root:Muting microphone for 0.8995625 second(s)
assist-microphone | DEBUG:root:Connected to snd service
faster-whisper | DEBUG:wyoming_faster_whisper.handler:Language set to en
assist-microphone | Playing raw data 'stdin' : Signed 16 bit Little Endian, Rate 16000 Hz, Mono
assist-microphone | DEBUG:root:Unmuted microphone
faster-whisper | DEBUG:wyoming_faster_whisper.handler:Audio stopped. Transcribing with initial prompt=
faster-whisper | INFO:faster_whisper:Processing audio with duration 00:02.450
faster-whisper | DEBUG:faster_whisper:Processing segment at 00:00.000
faster-whisper | INFO:wyoming_faster_whisper.handler: Turn the light off.
faster-whisper | DEBUG:wyoming_faster_whisper.handler:Completed request
assist-microphone | DEBUG:root:Event(type='transcript', data={'text': ' Turn the light off.'}, payload=None)
assist-microphone | INFO:root:Waiting for wake word
assist-microphone | DEBUG:root:Connected to snd service
openwakeword | DEBUG:root:Loading ok_nabu_v0.1 from /usr/local/lib/python3.11/dist-packages/wyoming_openwakeword/models/ok_nabu_v0.1.tflite
assist-microphone | Playing raw data 'stdin' : Signed 16 bit Little Endian, Rate 16000 Hz, Mono
openwakeword | DEBUG:wyoming_openwakeword.handler:Started thread for ok_nabu_v0.1
assist-microphone | DEBUG:root:Event(type='synthesize', data={'text': 'Sorry, I am not aware of any light in the living_room area', 'voice': {'name': 'en_GB-alan-low'}}, payload=None)
assist-microphone | DEBUG:root:Connected to snd service
assist-microphone | Playing raw data 'stdin' : Signed 16 bit Little Endian, Rate 16000 Hz, Mono
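
Reading the satellite events in order can help locate the stage that fails. In the log above, faster-whisper produced the transcript "Turn the light off." and the `synthesize` event carries the reply text. That suggests, as an interpretation and not something confirmed in this thread, that speech recognition worked and the "not aware of any light" reply came from a later stage. A minimal sketch for extracting the event order: `event_types` is a hypothetical helper, the sample lines are abbreviated from the log, and the pattern accepts both straight and typographic quotes.

```python
import re

# Matches Event(type='...') with straight or typographic single quotes.
EVENT_RE = re.compile(r"Event\(type=['‘]([^'’]+)['’]")


def event_types(log_text: str) -> list[str]:
    """Return the satellite event types in the order they were logged."""
    return EVENT_RE.findall(log_text)


# Abbreviated copies of three lines from the log above.
LOG = """\
assist-microphone | DEBUG:root:Event(type='run-pipeline', data={'start_stage': 'asr', 'end_stage': 'tts'}, payload=None)
assist-microphone | DEBUG:root:Event(type='transcript', data={'text': ' Turn the light off.'}, payload=None)
assist-microphone | DEBUG:root:Event(type='synthesize', data={'text': 'Sorry, I am not aware of any light in the living_room area'}, payload=None)
"""

print(event_types(LOG))  # ['run-pipeline', 'transcript', 'synthesize']
```

A run where `transcript` is missing would point at the microphone or speech-to-text side, while a run that reaches `synthesize` has at least produced a reply to speak.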


OK, it looks like the assistant could not understand what was said, but I could sometimes see in the logs that it did.


I’m really not sure where I went wrong…

See the short video I made:

@narandill, if you have any clue what I did wrong, let me know. It might be the HA config. I’ll keep looking when I have a bit of time.