Raspberry Pi Pico W Text to Speech Using Wit.ai

by CircuitDigest

This project turns a Raspberry Pi Pico W into a talking device using an online text-to-speech (TTS) service. The Pico sends text over WiFi, gets back audio, and plays it through a small speaker driven by an I2S amplifier. The hardware stays simple and the heavy speech processing happens in the cloud.

I built this as a clean, repeatable setup that actually works on a microcontroller with limited memory. If you follow the steps below, you should end up with a Pico that speaks anything you type into the serial monitor.

No theory dumps here. This is mostly wiring, setup, and getting audio out of the board.

What You’re Building

At a high level, this is what’s happening:

  1. The Pico W connects to WiFi
  2. You type text into the serial monitor
  3. That text is sent to Wit.ai
  4. Wit.ai converts it to speech
  5. Audio is streamed back as MP3
  6. The Pico decodes the MP3 and sends the audio over I2S
  7. A MAX98357A amplifier drives a speaker

The Pico never tries to generate speech itself. It just moves data around and plays audio.

Supplies

Here’s exactly what I used:

  1. Raspberry Pi Pico W
  2. MAX98357A I2S audio amplifier
  3. Small speaker (4 ohm or 8 ohm; both work)
  4. Breadboard
  5. Jumper wires
  6. USB cable

Wiring the Hardware

This is the part that matters most. If the wiring is wrong, you’ll get silence no matter how perfect the code is.

The MAX98357A is an I2S amplifier, which means it uses three signal lines (bit clock, left/right clock, and data) plus power and ground.

Pico W to MAX98357A connections:

  1. GP18 → BCLK
  2. GP19 → LRC
  3. GP20 → DIN
  4. VBUS (5V from USB) → VIN
  5. GND → GND

That’s it. No resistors, no level shifting, no extra parts.
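
If you like having the pin mapping in one place, the list above can be sketched as a few constants. The names here are mine, not from the example sketch. The first check also assumes the I2S driver wants LRC on the GPIO right after BCLK (as far as I know, that is how the Arduino-Pico core's I2S output behaves by default, and GP18/GP19 fit that pattern), so treat it as an assumption rather than a rule.

```cpp
#include <cstdint>

// Hypothetical constant names; the numbers match the wiring list above.
constexpr std::uint8_t kPinBclk = 18;  // GP18 -> MAX98357A BCLK
constexpr std::uint8_t kPinLrc  = 19;  // GP19 -> MAX98357A LRC
constexpr std::uint8_t kPinDin  = 20;  // GP20 -> MAX98357A DIN

// Assumption: the I2S driver expects LRC on the GPIO right after BCLK.
static_assert(kPinLrc == kPinBclk + 1, "LRC should be the pin after BCLK");

// True when the three signal pins are all different.
constexpr bool pinsAreDistinct() {
    return kPinBclk != kPinLrc && kPinLrc != kPinDin && kPinBclk != kPinDin;
}
static_assert(pinsAreDistinct(), "I2S pins must not overlap");
```

Because the checks are static_assert, a typo in a pin number fails at compile time instead of giving you silence at the speaker.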

After wiring:

  1. Double check GND connections
  2. Make sure VIN is actually fed from 5V (the Pico's VBUS pin), not the 3.3V pin
  3. Make sure the speaker is connected to the amplifier, not directly to the Pico

If you hear popping or distortion later, it’s almost always power or speaker wiring.

Creating a Wit.ai Account

You need an online TTS service, and this project uses Wit.ai.

Step 1: Create an account

Go to https://wit.ai and sign up. Email signup is the easiest. Verify your email and log in.

Step 2: Create a new app

Once inside the dashboard:

  1. Click to create a new app
  2. Give it a name
  3. Pick the language you want speech in

The language choice matters. Pick English unless you know you want something else.

Step 3: Get the Server Access Token

Go to:

  1. Management
  2. Settings
  3. HTTP API

Copy the Server Access Token. This is what the Pico uses to authenticate.

Step 4: Save the token

Keep this token safe and don’t share it publicly. If you regenerate it later, you must re-upload the sketch with the new token.

Installing the Arduino Environment

You’ll program the Pico W using Arduino IDE.

Make sure you already have:

  1. Arduino IDE installed
  2. Raspberry Pi Pico board support added

Once that’s done, install the library.

Installing the WitAITTS library

  1. Open Arduino IDE
  2. Open Library Manager
  3. Search for WitAITTS
  4. Click Install

When it finishes, you’re ready to use the example sketch.

Opening the Example Sketch

Instead of writing code from scratch, use the provided example.

Go to:

  1. File
  2. Examples
  3. WitAITTS
  4. PicoW_Basic

This opens a working sketch that already handles WiFi, HTTPS, audio streaming, and I2S playback.

Editing the Code

You only need to change a few things.

WiFi credentials

Replace these with your actual network details:


#define WIFI_SSID "YourWiFiSSID"
#define WIFI_PASSWORD "YourWiFiPassword"

Wit.ai token

Paste your Server Access Token here:


#define WIT_TOKEN "YOUR_WIT_AI_TOKEN_HERE"

That’s all you must change.

What the Important Code Lines Do

I’m not going to walk through every line. These are the ones that matter.

Create the TTS engine


WitAITTS tts;

This object handles everything: WiFi, HTTPS, audio decoding, and pushing sound to the amplifier.

Start WiFi and authenticate


tts.begin(WIFI_SSID, WIFI_PASSWORD, WIT_TOKEN);

If this fails, nothing else works. Watch the serial output carefully here.

Set the voice


tts.setVoice("wit$Remi");

This selects which Wit.ai voice speaks your text. You can experiment with other voice IDs later.

Control speed and pitch


tts.setSpeed(100);
tts.setPitch(100);

Start with these values. Extreme settings make speech hard to understand.
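
If speed or pitch will come from user input or a sensor, it helps to clamp the value before passing it on. This is a minimal sketch: the function name and the 50 to 200 range are illustrative guard rails I picked, not documented limits of Wit.ai or the WitAITTS library.

```cpp
// Illustrative guard rails, NOT documented limits of Wit.ai or WitAITTS.
constexpr int kMinPercent = 50;
constexpr int kMaxPercent = 200;

// Keeps a speed or pitch value inside the illustrative range.
constexpr int clampPercent(int value) {
    return value < kMinPercent ? kMinPercent
         : (value > kMaxPercent ? kMaxPercent : value);
}
```

In the sketch you would then call something like tts.setSpeed(clampPercent(requestedSpeed)), where requestedSpeed is whatever value you read in.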

Speak the text


tts.speak(text);

This sends text to Wit.ai, waits for the audio stream, and plays it immediately.

While audio is playing, the Pico is busy. That’s normal.
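
To make "sends text to Wit.ai" a bit more concrete, here is roughly what a request body could look like. My understanding is that Wit.ai's synthesize endpoint takes a JSON body with the text in q and the voice ID in voice, but treat those field names as assumptions and check the Wit.ai docs. This is not taken from the WitAITTS source, and the function names are hypothetical. The point of the helper is that typed text containing quotes or backslashes must be escaped before it goes into JSON.

```cpp
#include <string>

// Escapes the common characters that would break a JSON string literal.
std::string jsonEscape(const std::string& in) {
    std::string out;
    for (char c : in) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:   out += c;      break;
        }
    }
    return out;
}

// Assumed body shape: {"q":"<text>","voice":"<voice id>"}.
std::string buildTtsBody(const std::string& text, const std::string& voice) {
    return "{\"q\":\"" + jsonEscape(text) +
           "\",\"voice\":\"" + jsonEscape(voice) + "\"}";
}
```

You don’t need this to use the library. It’s only here to show what is probably travelling over the wire when you call speak.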

Uploading the Sketch

Before uploading:

  1. Click Verify
  2. Make sure there are no compile errors

Now:

  1. Plug in the Pico W
  2. Select the correct board and port
  3. Click Upload

When the upload finishes, open the Serial Monitor.

Testing the Speech Output

Set the Serial Monitor to:

  1. Correct COM port
  2. Newline enabled
  3. The baud rate set in the sketch

You should see:

  1. WiFi connection messages
  2. IP address
  3. WitAITTS configuration info

Now type a sentence and press Enter.

If everything is working:

  1. The serial monitor will say it’s requesting TTS
  2. Audio starts playing almost immediately
  3. The speaker speaks your text

The first request sometimes takes a second longer. After that, it feels fast.

How Audio Streaming Works

Audio comes in as an MP3 stream. It’s not downloaded all at once.

That gives you a few advantages:

  1. Lower memory usage
  2. Faster perceived response
  3. No big buffers needed

Things that affect audio quality:

  1. WiFi stability
  2. Power supply
  3. Speaker quality

If audio cuts out, start by checking power and signal wires, then look at WiFi signal strength.

Common Problems and Fixes

No sound at all

Check these first:

  1. Speaker wired to amplifier, not Pico
  2. Amplifier VIN is 5V
  3. GND is common everywhere
  4. GP18, GP19, GP20 are correct

HTTP 401 error

This usually means one of the following:

  1. Token is wrong
  2. Token was regenerated
  3. Token has extra spaces

Fix it by pasting the token again, checking for stray spaces, and re-uploading.
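
Stray whitespace is easy to miss by eye. A small check like the one below catches it. This is a sketch with hypothetical helper names, written as plain C++ so you can run it on a computer. In the sketch the token is a #define, so the real fix is still to clean up the string you paste.

```cpp
#include <cstddef>
#include <string>

// Removes leading and trailing whitespace (spaces, tabs, CR, LF).
std::string trimToken(const std::string& raw) {
    const char* ws = " \t\r\n";
    const std::size_t first = raw.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = raw.find_last_not_of(ws);
    return raw.substr(first, last - first + 1);
}

// Quick sanity check: non-empty and no whitespace anywhere in the token.
bool tokenLooksClean(const std::string& token) {
    return !token.empty() &&
           token.find_first_of(" \t\r\n") == std::string::npos;
}
```

If tokenLooksClean returns false for the string you pasted, that is a likely cause of the 401.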

Distorted or crackling audio

Usually caused by:

  1. Weak USB power
  2. Bad speaker
  3. Loose jumper wires

Short wires help. Breadboards don’t love high-speed audio signals.

Nothing happens after typing text

Check that:

  1. Serial Monitor is set to newline
  2. WiFi actually connected
  3. Text isn’t empty
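
The newline setting matters because line-based input code only acts once it sees the end of a line. I’m assuming the example sketch works this way, since it needs Newline enabled. The class below is my own illustration of that pattern, not code from the library: characters pile up until a newline arrives, and empty lines are ignored.

```cpp
#include <string>

// Collects characters until a newline arrives, the way a sketch
// reading serial input typically does.
class LineReader {
public:
    // Feeds one character. Returns true when a complete,
    // non-empty line is ready to be read with line().
    bool feed(char c) {
        if (c == '\r') return false;   // ignore carriage returns
        if (c != '\n') {               // still in the middle of a line
            pending_ += c;
            return false;
        }
        line_ = pending_;              // newline: the line is complete
        pending_.clear();
        return !line_.empty();         // empty lines are not spoken
    }

    const std::string& line() const { return line_; }

private:
    std::string pending_;
    std::string line_;
};
```

With "No line ending" selected in the Serial Monitor, feed never sees a newline, so nothing is ever ready to speak. That matches the symptom above.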

Final Thoughts

Once it works, you can experiment:

  1. Change voices
  2. Adjust pitch and speed
  3. Cache common phrases
  4. Add buttons instead of serial input
  5. Trigger speech from sensors

You can also store a few important audio files locally as a fallback when WiFi is down.
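
One way to organize that fallback is a small lookup table from phrase to local file. This is only a sketch of the bookkeeping: the class name and file paths are hypothetical, and actually playing a local file needs storage and playback code that this project doesn’t cover.

```cpp
#include <map>
#include <string>

// Hypothetical table of phrases that have a pre-recorded local audio file.
class PhraseCache {
public:
    void add(const std::string& phrase, const std::string& path) {
        files_[phrase] = path;
    }

    // Returns the local file path, or an empty string when the
    // phrase is not cached.
    std::string lookup(const std::string& phrase) const {
        const auto it = files_.find(phrase);
        return it == files_.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> files_;
};
```

The idea is to check the cache first, and only call tts.speak(text) when lookup comes back empty and WiFi is up.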

This setup gives the Pico W a real voice without pushing it beyond what it can handle. All the heavy speech processing stays in the cloud, and the Pico just focuses on control and playback.

If you wire it cleanly and keep your token correct, it’s very reliable. Once you hear it speak for the first time, it’s honestly hard not to start adding it to other projects.


The above is entirely based on: Raspberry Pi Pico Text to Speech using AI