An ESP32 has just enough horsepower to decode an MP3 stream in real time, and its I2S peripheral can drive an audio DAC at 44.1 kHz / 16-bit. The result sounds genuinely good, better than most kitchen-radio Bluetooth speakers, and you control exactly which stations show up. No subscription, no app, no "sign in to your account" on every reboot.
How it works
[Diagram: streaming flow]
MP3 stream URL --(HTTP)--> ESP32 --(decoded PCM, I2S 44.1 kHz)--> MAX98357 DAC + amp combo --> 3W full-range speaker
Rotary encoder + pushbutton --> ESP32 --> SSD1306 OLED (now playing)
Streaming flow. The MAX98357 is a combined DAC + class-D amp that drives a small speaker directly without a separate amplifier board.
Bill of materials
- ESP32 dev board (any with PSRAM ideally; ESP32-WROVER is best) — $9
- MAX98357 I2S DAC + amp module — $5
- 3 W / 4 Ω full-range speaker — $4
- Rotary encoder with pushbutton — $2
- 0.96" SSD1306 OLED — $4
- 5 V / 1 A wall adapter — $4
- Wood or 3D-printed enclosure — $5
Total around $33. PSRAM helps because it lets the firmware keep a much larger buffer of undecoded MP3 data, which rides out Wi-Fi stalls and bursty delivery; without it you may get audio dropouts.
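To put a rough number on the PSRAM advantage, the arithmetic below converts a buffer size into seconds of audio at a given stream bitrate. It is only a back-of-the-envelope sketch: the two buffer sizes and the 128 kbps bitrate are illustrative assumptions, not values taken from the working sketch or from the audio library.

```cpp
#include <cstdint>

// Seconds of audio that a buffer of compressed stream data can hold.
// bitrateKbps is the stream's nominal bitrate in kilobits per second
// (1 kbit = 1000 bits, the convention used for MP3 bitrates).
constexpr double bufferSeconds(uint32_t bufferBytes, uint32_t bitrateKbps) {
    return (bufferBytes * 8.0) / (bitrateKbps * 1000.0);
}

// Illustrative sizes only (assumptions, not measurements):
constexpr uint32_t kInternalRamBuffer = 16 * 1024;   // a buffer squeezed into internal RAM
constexpr uint32_t kPsramBuffer       = 512 * 1024;  // a slice of external PSRAM

// At 128 kbps the small buffer covers roughly one second of audio,
// while the PSRAM-sized one covers more than half a minute.
static_assert(bufferSeconds(kInternalRamBuffer, 128) < 1.1, "about 1 s at 128 kbps");
static_assert(bufferSeconds(kPsramBuffer, 128) > 30.0, "over 30 s at 128 kbps");
```

With about a second of headroom, a single slow Wi-Fi retransmission can be enough to empty the buffer; with tens of seconds, it usually is not.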
Wiring
MAX98357 ESP32
LRC --- GPIO 25
BCLK --- GPIO 27
DIN --- GPIO 26
GND --- GND
VIN --- 5V (Vin)
GAIN --- (leave floating for 9 dB)
SD --- (leave unconnected; on breakout modules with an on-board pull-up this selects the (L+R)/2 stereo mix)
+ --- speaker +
- --- speaker -
Rotary ESP32
A --- GPIO 32
B --- GPIO 33
SW --- GPIO 34 (input-only pin with no internal pull-up; add an external 10 kΩ pull-up to 3.3V)
GND --- GND
OLED ESP32
SDA --- GPIO 21
SCL --- GPIO 22
VCC --- 3.3V
GND --- GND
The MAX98357 is robust: wires up to 30 cm work fine. Use thicker speaker wire (22 AWG) so the amp output isn't restricted by thin jumpers.
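The pin assignments from the wiring tables can be collected into one block of constants, with a compile-time check that no output signal has landed on an input-only pin. This is a sketch, not the project's actual header: ENC_SW is the name used in the main loop, while the other constant names and the isInputOnly helper are hypothetical.

```cpp
#include <cstdint>

// Pin assignments copied from the wiring tables.
constexpr uint8_t I2S_LRC  = 25;
constexpr uint8_t I2S_BCLK = 27;
constexpr uint8_t I2S_DOUT = 26;  // ESP32 data out, wired to DIN on the MAX98357
constexpr uint8_t ENC_A    = 32;
constexpr uint8_t ENC_B    = 33;
constexpr uint8_t ENC_SW   = 34;
constexpr uint8_t OLED_SDA = 21;
constexpr uint8_t OLED_SCL = 22;

// On the original ESP32, GPIO 34-39 are input-only and have no internal
// pull-up or pull-down resistors.
constexpr bool isInputOnly(uint8_t gpio) { return gpio >= 34 && gpio <= 39; }

// I2S and I2C lines must be driven by the ESP32, so they cannot sit on
// input-only pins. The build fails here if the pin map is edited carelessly.
static_assert(!isInputOnly(I2S_LRC) && !isInputOnly(I2S_BCLK) && !isInputOnly(I2S_DOUT),
              "I2S pins must be output-capable");
static_assert(!isInputOnly(OLED_SDA) && !isInputOnly(OLED_SCL),
              "I2C pins must be output-capable");
```

The same helper flags ENC_SW as input-only, which is fine for a switch input but is the reason that line needs an external pull-up resistor.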
Firmware design
schreibfaul1's ESP32-audioI2S library handles the heavy lifting: the HTTP GET on the stream URL, MP3 decoding, and PCM output over I2S. Our code mostly just calls audio.connecttohost("http://stream-url.example/listen.mp3"). The library invokes a callback with the stream's metadata string (now-playing artist + track), which we display on the OLED.
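The metadata string usually needs a little shaping before it fits a 0.96" display. The helper below is a minimal sketch that splits a title of the common "Artist - Title" form into two display lines. Two assumptions to flag: that format is only a convention and stations are free to send anything, and splitStreamTitle is a hypothetical helper, not part of the library. In the library versions I am aware of, the string arrives through a sketch-defined callback such as audio_showstreamtitle(const char* info); the callback mechanism has changed between releases, so check the one you have installed.

```cpp
#include <string>
#include <utility>

// Split a stream title of the conventional "Artist - Title" form into two
// display lines. Only the first " - " is treated as the separator. If there
// is no separator, the whole string goes on the first line and the second
// line is left empty.
std::pair<std::string, std::string> splitStreamTitle(const std::string& raw) {
    const std::string sep = " - ";
    const std::size_t pos = raw.find(sep);
    if (pos == std::string::npos) {
        return std::make_pair(raw, std::string());
    }
    return std::make_pair(raw.substr(0, pos), raw.substr(pos + sep.size()));
}
```

Inside the metadata callback, the two halves could then be drawn on separate OLED rows, with the artist on top and the track underneath.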
The rotary encoder uses interrupt-on-change for both A and B pins, decoding quadrature in software. Each detent advances the station index by 1 (or back); pressing the encoder switch acts as a play/pause toggle.
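One common way to decode quadrature in software is a 16-entry transition table, sketched below together with the wrap-around station step. This is not necessarily how the working sketch does it. The names (kQuadTable, QuadratureDecoder, stepStation) are hypothetical, and two things are assumed: both encoder lines rest high at a detent, and one detent spans four transitions. Some encoders produce only two transitions per detent. Which rotation counts as positive depends on how A and B are wired.

```cpp
#include <cassert>
#include <cstdint>

// Transition table indexed by (previousState << 2) | currentState, where a
// state is (A << 1) | B. Valid single-step transitions yield +1 or -1;
// "no change" and impossible two-bit jumps (usually contact bounce) yield 0.
static const int8_t kQuadTable[16] = {
     0, +1, -1,  0,
    -1,  0,  0, +1,
    +1,  0,  0, -1,
     0, -1, +1,  0,
};

struct QuadratureDecoder {
    uint8_t prev  = 3;  // binary 11: both lines idle high at a detent (assumed)
    int8_t  accum = 0;  // quarter-steps accumulated since the last detent

    // Feed the current pin levels. Returns +1 or -1 once a full detent
    // (assumed to be 4 transitions) has been traversed, otherwise 0.
    int8_t update(bool a, bool b) {
        const uint8_t curr = static_cast<uint8_t>((a ? 2 : 0) | (b ? 1 : 0));
        accum = static_cast<int8_t>(accum + kQuadTable[(prev << 2) | curr]);
        prev = curr;
        if (accum >= 4)  { accum = 0; return +1; }
        if (accum <= -4) { accum = 0; return -1; }
        return 0;
    }
};

// Move the station index by one detent in either direction, wrapping at
// both ends. A direction of 0 leaves the index unchanged.
int stepStation(int current, int direction, int stationCount) {
    assert(stationCount > 0);
    const int step = (direction > 0) ? 1 : (direction < 0 ? stationCount - 1 : 0);
    return (current + step) % stationCount;
}
```

In a sketch, update() would be called from the change interrupts on both encoder pins with the freshly read levels, and any non-zero result passed on to the main loop. Because a bounce produces a step followed by its opposite, the accumulator returns to where it was and no spurious detent is reported.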
Going further
- Multiple speakers in different rooms, all playing the same station — multicast UDP audio over the LAN.
- A tiny USB rechargeable battery + boost converter makes it portable.
- Add Spotify Connect via librespot-esp32 (much harder; needs serious flash space).
Key code: main loop
This is the heart of the firmware, taken from the working sketch. The complete file (with config template, library list, and the rest of the helpers) is around 102 lines and is included in the downloadable project package — request it via the form below.
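void loop() {
  audio.loop();  // feed the decoder; must run often or playback stutters

  // Encoder turned: step to the next or previous station, wrapping around.
  if (encDelta != 0) {
    currentStation = (currentStation + (encDelta > 0 ? 1 : N_STATIONS - 1)) % N_STATIONS;
    encDelta = 0;
    nowPlaying = "";  // clear stale metadata until the new stream reports its own
    if (isPlaying) connectStation();
    drawOled();
  }

  // Encoder pressed (active low): toggle play/pause.
  if (digitalRead(ENC_SW) == LOW) {
    delay(50);  // debounce
    if (digitalRead(ENC_SW) == LOW) {
      isPlaying = !isPlaying;
      if (isPlaying) connectStation();
      else audio.stopSong();
      drawOled();  // redraw only after a confirmed press
      // Wait for release, but keep servicing the decoder so audio doesn't drop out.
      while (digitalRead(ENC_SW) == LOW) { audio.loop(); delay(1); }
    }
  }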
}
Get the complete project package
The article above shows the core firmware and the principles behind it. The complete project package — assembled, tested, and ready to flash — is available by email request. We send it manually, and we read every request.
- Complete Arduino sketch (.ino) with full error handling
- List of required libraries with version numbers
- Printable wiring diagram (PDF)
- Bill of materials with current part numbers
- Build guide and troubleshooting tips
- Configuration template (WiFi, MQTT, etc.)