Build a simple voice trigger system on your Raspberry Pi

What You'll Need

Hardware

Raspberry Pi 4 or 5 (flashed and on your network)
USB microphone
USB speaker
Power supply
microSD card

Software / Accounts

Viam account
Python 3.9+ with the Viam Python SDK

Before you begin: Set up your Pi

Flash Raspberry Pi OS, get your Pi online, and install viam-server on it so it shows up in your Viam app. Follow the Viam setup guide end to end before continuing.

Step 1: Plug in your USB mic and speaker

Connect the USB microphone and USB speaker to any of the Pi's USB ports.

Step 2: Add the discovery service to find the speaker and mic

In your machine's CONFIGURE tab, click + → Configuration block and search for audio discovery. Add the system-audio/discovery module and name it audio-discovery. Viam will install the supporting module automatically.

Open the service's Test panel. You'll see a list of the audio devices connected to the Pi (both speakers and mics). Click Add component on the entries for your USB mic and USB speaker, and name the new components mic and speaker.

Wake word detection currently works only with mics configured for num_channels=1 and sample_rate=16000. Set both values in the mic component's Attributes section.
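If you keep a copy of your mic's attributes somewhere, a quick check like the one below can catch a mismatch before you start debugging the wake word service. It is only a sketch: the two required values come from the note above, while the helper names and the example input are made up for illustration.

```python
import json

# Requirements quoted from the note above; the helpers are hypothetical.
REQUIRED_MIC_ATTRIBUTES = {"num_channels": 1, "sample_rate": 16000}


def check_mic_attributes(attributes: dict) -> list[str]:
    """Return a list of problems; an empty list means the mic is compatible."""
    problems = []
    for key, expected in REQUIRED_MIC_ATTRIBUTES.items():
        actual = attributes.get(key)
        if actual != expected:
            problems.append(f"{key} should be {expected}, got {actual!r}")
    return problems


def with_required_attributes(attributes: dict) -> dict:
    """Return a copy of the attributes with the two required values applied."""
    return {**attributes, **REQUIRED_MIC_ATTRIBUTES}


if __name__ == "__main__":
    current = {"num_channels": 2, "sample_rate": 44100}  # example values only
    for problem in check_mic_attributes(current):
        print(problem)
    # Print attributes you could paste into the mic's Attributes section.
    print(json.dumps(with_required_attributes(current), indent=2))
```

Any other keys already present in the attributes are passed through unchanged, so the output can be pasted back as a whole.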

Step 3: Add and configure the wake word filter service

Click + → Configuration block again and search for wake word. Pick filtered-audio/wake-word-filter. It's an AUDIO_IN module that wraps your microphone and emits audio only when it detects a wake word.

Name the service wake-word and paste this into its Attributes panel:

{
  "source_microphone": "mic",
  "wake_words": [
    "hello",
    "hola"
  ]
}

source_microphone points at the microphone component from Step 2, and wake_words is the list of phrases that will trigger capture. Add or swap phrases to taste. Our Alexa-style assistant is friendly, so it answers to both English and Spanish greetings.
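If you expect to change the phrase list often, you could generate the attributes instead of editing JSON by hand. The sketch below uses the two keys from the config above. The function name is hypothetical, and trimming whitespace, dropping duplicates, and rejecting an empty list are conveniences of this example rather than documented requirements of the module.

```python
import json


def build_wake_word_attributes(source_microphone: str, phrases: list[str]) -> dict:
    """Build the attributes object for the wake-word service."""
    cleaned = []
    for phrase in phrases:
        phrase = phrase.strip()
        # Skip blanks and repeats so each phrase appears once, in order.
        if phrase and phrase not in cleaned:
            cleaned.append(phrase)
    if not cleaned:
        raise ValueError("at least one wake word is required")
    return {"source_microphone": source_microphone, "wake_words": cleaned}


if __name__ == "__main__":
    attrs = build_wake_word_attributes("mic", ["hello", "hola", " hello "])
    # Paste the printed JSON into the service's Attributes panel.
    print(json.dumps(attrs, indent=2))
```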

Step 4: Write control logic to act on the wake word

Create main.py on your dev machine. The script connects to your Pi, streams from the wake-word audio input, and pipes each detected segment straight into the speaker. While a segment plays, it pauses detection so the Pi doesn't hear its own voice and loop forever.

import asyncio

from viam.robot.client import RobotClient
from viam.components.audio_in import AudioIn
from viam.components.audio_out import AudioOut


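async def play_segment(wake_word: AudioIn, speaker: AudioOut, audio_data: bytes, audio_info):
    """Play one captured segment with wake word detection paused."""
    # Pause detection so the mic doesn't pick up the speaker's own output.
    await wake_word.do_command({"pause_detection": None})
    try:
        await speaker.play(audio_data, audio_info)
    finally:
        # Resume detection even if playback fails.
        await wake_word.do_command({"resume_detection": None})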


async def main():
    opts = RobotClient.Options.with_api_key(
        api_key='API_KEY_HERE',
        api_key_id='API_KEY_ID_HERE',
    )

    async with await RobotClient.at_address("your-machine-address.viam.cloud", opts) as machine:
        wake_word = AudioIn.from_robot(machine, "wake-word")
        speaker = AudioOut.from_robot(machine, "speaker")

        # Hold references to playback tasks so they aren't garbage-collected mid-run.
        playback_tasks = set()

        print("Listening for wake word...")
        while True:
            try:
                audio_stream = await wake_word.get_audio("pcm16", 0, 0)

                segment = bytearray()
                segment_info = None

                async for chunk in audio_stream:
                    audio_data = chunk.audio.audio_data

                    if len(audio_data) == 0:
                        # An empty chunk marks the end of a detected segment.
                        if segment:
                            task = asyncio.create_task(
                                play_segment(wake_word, speaker, bytes(segment), segment_info)
                            )
                            playback_tasks.add(task)
                            task.add_done_callback(playback_tasks.discard)
                            segment.clear()
                            segment_info = None
                            print("Listening for wake word...")
                    else:
                        # Keep the audio info from the first chunk of the segment.
                        if segment_info is None:
                            segment_info = chunk.audio.audio_info
                        segment.extend(audio_data)

            except asyncio.CancelledError:
                return
            except Exception as e:
                print(f"Stream error: {e}")
                # Wait briefly before reopening the stream to avoid a tight retry loop.
                await asyncio.sleep(1)


if __name__ == "__main__":
    asyncio.run(main())

Grab the API key and key ID at CONNECT → API keys and the machine address at CONNECT → Connection Details. Use them to replace the API_KEY_HERE, API_KEY_ID_HERE, and your-machine-address.viam.cloud placeholders in main().
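If you would rather not keep credentials in the script, one option is to read them from environment variables. The sketch below is an illustration only: the variable names VIAM_API_KEY, VIAM_API_KEY_ID, and VIAM_MACHINE_ADDRESS are chosen for this example and are not defined by Viam, and load_credentials is a hypothetical helper.

```python
import os
from typing import Mapping, NamedTuple


class ViamCredentials(NamedTuple):
    api_key: str
    api_key_id: str
    address: str


def load_credentials(env: Mapping[str, str] = os.environ) -> ViamCredentials:
    """Read the three connection values from environment variables.

    The variable names are assumptions made for this sketch.
    """
    names = ("VIAM_API_KEY", "VIAM_API_KEY_ID", "VIAM_MACHINE_ADDRESS")
    # Report every missing or empty variable at once rather than one at a time.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return ViamCredentials(*(env[name] for name in names))
```

In main(), you could then call creds = load_credentials() and pass creds.api_key and creds.api_key_id to RobotClient.Options.with_api_key, and creds.address to RobotClient.at_address.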

Step 5: Run the script and talk to the Pi

From the folder containing main.py, create a virtual environment, install viam-sdk, and run:

python main.py

You'll see:

INFO    viam.rpc.dial (dial.py:336)
Listening for wake word...

Say "hello" (or "hola") into the mic. The wake-word filter captures your phrase, playback kicks in through the speaker, and once the segment finishes the script goes right back to listening.
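If nothing plays back, it can help to test the buffering logic without the Pi. The sketch below mirrors how main.py assembles segments, assuming (as the script does) that an empty chunk marks the end of a detected segment. split_segments and the fake stream are hypothetical stand-ins, not part of the Viam SDK.

```python
from typing import Iterable, Iterator


def split_segments(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield one bytes object per segment, using empty chunks as delimiters."""
    segment = bytearray()
    for chunk in chunks:
        if len(chunk) == 0:
            # Empty chunk: flush the buffer if it holds any audio.
            if segment:
                yield bytes(segment)
                segment.clear()
        else:
            segment.extend(chunk)
    # Like main.py, audio with no closing empty chunk is never emitted.


if __name__ == "__main__":
    fake_stream = [b"he", b"llo", b"", b"", b"ho", b"la", b""]
    for seg in split_segments(fake_stream):
        print(seg)
```

Run on the fake stream, this prints b'hello' and then b'hola'; the repeated empty chunk between them is ignored because the buffer is already empty.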

What's next?

Audio In API — stream and process mic audio in your own code
Audio Out API — play generated or recorded audio through any speaker

Start building at app.viam.com.