A mid-range wireless security camera costs $50–200 and locks you into a cloud subscription where footage lives on someone else's server. An ESP32-CAM costs about $12, fits in your palm, and with a PIR sensor and a Telegram bot becomes a motion-triggered camera that sends photos directly to your phone. No monthly fee, no footage archived on a camera vendor's cloud, no lock-in.
The catch is that the ESP32-CAM is a slightly cursed piece of hardware. It has no USB port, the camera pins collide with SPI and SD card pins, the boot mode is finicky, and the GPIOs that are left free are a narrow subset of the chip's full count. This article walks through the whole build, including the gotchas that cost first-time users an afternoon.
Bill of materials
- ESP32-CAM module (AI Thinker variant, most common) — $8–12
- FTDI USB-to-TTL adapter (3.3V/5V switchable) for initial flashing — $5
- PIR motion sensor HC-SR501 — $2
- 18650 lithium-ion cell (2500–3400 mAh) — $6
- 18650 holder with TP4056 charger board — $3
- Jumper wires, small enclosure — $3
About $30 total. Apart from the module itself, the priciest part is the 18650 cell, and a recycled one from a dead laptop battery works fine.
Creating the Telegram bot
Before wiring anything, create a bot. Telegram's bot interface is the simplest push-notification system we know of: no developer account to register, no server infrastructure, fifteen minutes from zero to working.
- Open Telegram, search for @BotFather, start a chat.
- Send /newbot and follow the prompts to name your bot and choose a username ending in _bot.
- BotFather replies with an HTTP API token: a long string like 1234567890:ABC.... Copy it.
- Start a chat with your new bot (search for its username), send any message to it.
- Open https://api.telegram.org/bot<TOKEN>/getUpdates in a browser. Look for chat.id in the JSON response; that is your chat ID.
You now have a bot token and a chat ID. The firmware will use these to send photos to your chat.
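If you would rather pull the chat ID out of the getUpdates response programmatically than read it by eye, a rough sketch follows. It assumes the compact JSON layout Telegram normally returns ("chat":{"id":<number>,...}) and does a naive string search; the function name first_chat_id is ours, and a real JSON parser is the sturdier choice.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Naive extraction of the first chat ID from a getUpdates response body.
// Assumption: the body contains the compact fragment "chat":{"id":<number>.
// Returns false if no such fragment is found (for example, no messages yet).
bool first_chat_id(const std::string& body, long long& chat_id) {
    const std::string key = "\"chat\":{\"id\":";
    const std::size_t start = body.find(key);
    if (start == std::string::npos) return false;

    std::size_t pos = start + key.size();
    std::string digits;
    // Group chats have negative IDs, so accept a leading minus sign.
    if (pos < body.size() && body[pos] == '-') digits += body[pos++];
    while (pos < body.size() &&
           std::isdigit(static_cast<unsigned char>(body[pos]))) {
        digits += body[pos++];
    }
    if (digits.empty() || digits == "-") return false;

    chat_id = std::stoll(digits);
    return true;
}
```

An empty result array means the bot has not received a message yet; send it one and reload.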
Wiring, with the required warnings
ESP32-CAM     Destination
---------     -----------
GND       --- GND (FTDI during flash, then battery GND)
5V        --- 5V (FTDI during flash, then battery 5V via TP4056)
U0R (RX)  --- FTDI TX
U0T (TX)  --- FTDI RX
GPIO 0    --- GND during flashing, DISCONNECT before running
GPIO 13   --- PIR OUT
PIR VCC   --- 5V
PIR GND   --- GND

The four things that break a first build:
- GPIO 0 to GND puts the chip in flash mode. Keep it grounded while uploading firmware. Remove the connection before running or the chip just sits in the bootloader.
- The ESP32-CAM has no USB. You need an external USB-to-serial adapter (FTDI FT232RL or CP2102). Make sure it is set to 5V output, not 3.3V — the camera browns out on 3.3V.
- Many GPIOs collide with the camera. The pins left free are essentially GPIO 12, 13, 14, 15, and 16, and some of those (GPIO 12 and 15) are boot strapping pins that must be at specific levels at boot. GPIO 13 is a safe choice for PIR input.
- Power matters. Peak current during WiFi + camera init is 400–600 mA. A weak USB port or a worn battery will brown out mid-capture. Use a fresh 18650 or a power supply rated for at least 1 A.
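The pin reasoning above can be written down as a small check. This is a sketch, assuming the commonly published AI Thinker camera pin map and the ESP32's standard strapping and RTC-capable pins; verify the lists against your own module. The RTC test matters because waking from deep sleep on an external pin only works on RTC-capable GPIOs. The function and list names are ours.

```cpp
#include <algorithm>
#include <vector>

// Pins wired to the OV2640 on the AI Thinker board (assumed pin map).
const std::vector<int> kCameraPins = {0, 5, 18, 19, 21, 22, 23, 25,
                                      26, 27, 32, 34, 35, 36, 39};
// ESP32 boot strapping pins: their level at reset changes boot behaviour.
const std::vector<int> kStrappingPins = {0, 2, 5, 12, 15};
// GPIOs routed to the RTC domain, usable as deep-sleep wake sources.
const std::vector<int> kRtcPins = {0,  2,  4,  12, 13, 14, 15, 25, 26,
                                   27, 32, 33, 34, 35, 36, 37, 38, 39};

bool contains(const std::vector<int>& pins, int gpio) {
    return std::find(pins.begin(), pins.end(), gpio) != pins.end();
}

// True if the pin is free of the camera, is not a strapping pin,
// and can wake the chip from deep sleep.
bool usable_as_pir_wake_pin(int gpio) {
    return !contains(kCameraPins, gpio) &&
           !contains(kStrappingPins, gpio) &&
           contains(kRtcPins, gpio);
}
```

Run over the free pins listed above (12 to 16), only GPIO 13 and 14 pass. The check knows nothing about other on-board wiring, so treat a pass as "worth trying", not as a guarantee.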
How the firmware flows
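The device spends almost all of its time in deep sleep. A bare ESP32 draws ~10 µA there, although the AI Thinker module measures far higher (see the power budget below). The PIR wakes it; it does the work and returns to sleep. The less the board draws asleep, the more battery life depends on the number of motion events rather than the calendar.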
The firmware
Abbreviated from a full working sketch. Uses the Arduino ESP32 core with the esp_camera driver and the UniversalTelegramBot library.
The full sketch (about 180 lines with camera pin config) is on the Espressif examples repository. This skeleton shows the structure.
Power budget in practice
A 2500 mAh 18650 at 3.7V nominal is 9.25 Wh of energy. The ESP32-CAM draws roughly:
- Deep sleep: 6 mA. This is unusually high for an ESP32 because the camera regulator leaks, so the AI Thinker module is worse than a bare ESP32 in sleep. It can be reduced to 2 mA with a power-switching MOSFET on the camera's 3.3V rail.
- Active WiFi + camera for ~8 seconds per trigger: 400 mA average = 0.89 mAh per trigger.
With 50 triggers per day, that is about 45 mAh per day from captures and 144 mAh per day from sleep (6 mA × 24 h), roughly 190 mAh per day in total. A 2500 mAh battery therefore lasts about 13 days. In practice our test builds got 10–14 days between charges with moderate traffic.
Moving the PIR wake to an external power-gate circuit (the PIR cuts power to the ESP32 entirely between events) pushes battery life to a month or more, but adds board complexity.
Gotchas we hit
- Camera init fails misleadingly on low voltage. If your battery is below 3.5 V, the camera returns a generic init error that looks like a wiring problem. Charge first.
- PIR false triggers during sunrise. Changing light temperature fools cheap PIRs. Point the sensor away from windows or add a brief "sanity" delay: on wake, wait 500 ms and confirm PIR is still high before capturing.
- Telegram rate limits. Telegram bots can send about 30 messages per second globally and 1 per second per chat. Fine for a motion camera; problematic if multiple cameras share a chat ID during a busy event. Stagger the sends so the cameras do not all post in the same second.
- Images look bad in low light. The OV2640 sensor on the AI Thinker board is adequate in daylight and poor at dusk. For genuine night vision, the ESP32-CAM-MB with IR LEDs or the newer Seeed XIAO ESP32-S3 Sense are better starting points.
- WiFi connection takes 3–5 seconds. If you connect first, the subject may have moved out of frame before the photo is taken. Capture first, connect WiFi second (as in the sketch above) rather than the reverse.
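For the rate-limit gotcha, the simplest form of staggering is a fixed per-camera offset before sending. The sketch below assumes each camera is flashed with its own small index (a hypothetical per-device constant) and that the cameras wake at roughly the same moment; the 1100 ms default spacing is our choice to stay just under one message per second per chat.

```cpp
#include <cstdint>

// Delay, in milliseconds, that camera number `camera_index` (0-based)
// waits before sending, so cameras sharing a chat are spaced apart.
std::uint32_t send_offset_ms(std::uint32_t camera_index,
                             std::uint32_t spacing_ms = 1100) {
    return camera_index * spacing_ms;
}
```

Camera 0 sends immediately, camera 1 waits 1.1 s, camera 2 waits 2.2 s, and so on. Every millisecond of waiting is spent awake, so keep the number of cameras per chat small.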
Frequently Asked Questions
Can this record video?
Yes, but not on battery. Video requires continuous CPU and camera operation at 100+ mA, which drains a 2500 mAh cell in a day. For continuous recording, power the camera from USB and stream MJPEG to a local NAS or RTSP server.
Why not use a Raspberry Pi with a camera module?
A Pi is more capable (real OS, full Python, better camera) but draws 10× the power, costs 5× as much, and takes 30 seconds to boot. For motion-triggered stills, the ESP32-CAM is the right tool.
Can I do on-device person detection?
Marginally on the AI Thinker board: it technically can, but the frame rates are not worth the added firmware complexity. ESP32-S3 boards (Seeed XIAO ESP32-S3 Sense, ESP32-S3-EYE) have enough performance to run TensorFlow Lite Micro with a MobileNet-based person detector at 1–2 FPS.
Is Telegram reliable enough to depend on?
For personal use, yes. Messages arrive within a second or two. For mission-critical applications, layer a secondary channel (email, SMS via Twilio) so you do not depend on a single service. Telegram does occasionally have regional outages.
What about privacy concerns with Telegram?
Images go through Telegram's servers. If that matters, replace the Telegram send with an HTTP POST to your own server, or write to SD card for local-only storage. The architecture is otherwise identical.