Bare Metal vs RTOS: When to Use Each

There is a moment in most embedded projects when someone asks: should we use an RTOS? The answer often defaults to yes, because bringing in FreeRTOS feels like the grown-up choice and writing a super-loop feels like what you did in college. This is usually backwards.

The right question is not "should we use an RTOS" but "what problem would an RTOS actually solve for us, and is that problem worth the cost we are about to take on?" The answer, for a large number of embedded products, is that bare metal is genuinely the better choice, and people who have been burned by unnecessary RTOS adoption tend to say so loudly for the rest of their careers.

What each approach actually is

Bare metal

Bare metal firmware runs directly on the hardware with no operating system in between. The CPU boots, your reset handler runs, you initialise the hardware, and then your code does its thing — typically in a main loop that polls things, services interrupt flags, and calls handlers. There is no scheduler. If the main loop hangs, nothing else runs.

The two common structural patterns are the super-loop (a big while(1) with everything inside) and the interrupt-driven state machine (interrupts do the urgent work, the main loop polls for state changes). In practice most real firmware is a hybrid.
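The hybrid pattern described above can be sketched in a few lines. This is a minimal illustration, not code from any particular project: the ISR names, flags, and counters are all hypothetical, and the counters stand in for real handler work so the flow can be followed on a host machine.

```c
#include <assert.h>
#include <stdbool.h>

/* Set by interrupt handlers, cleared by the main loop. */
volatile bool rx_ready = false;
volatile bool tick_elapsed = false;

/* Counters stand in for real work so the flow is observable. */
unsigned rx_handled = 0;
unsigned ticks_handled = 0;

/* Hypothetical ISRs: do the minimum, leave the real work to the loop. */
void uart_rx_isr(void) { rx_ready = true; }
void timer_isr(void)   { tick_elapsed = true; }

/* One pass of the super-loop: poll each flag and run its handler. */
void superloop_step(void)
{
    if (rx_ready) {
        rx_ready = false;  /* clear first, so an event arriving during handling is kept */
        rx_handled++;
    }
    if (tick_elapsed) {
        tick_elapsed = false;
        ticks_handled++;
    }
}

/* Host-side walk-through. On target, main() would instead initialise the
 * hardware and then call superloop_step() inside for (;;). */
void superloop_walkthrough(void)
{
    uart_rx_isr();  /* pretend a byte arrived */
    superloop_step();
    assert(rx_handled == 1 && ticks_handled == 0);
}
```

The interrupt handlers only record that something happened. Everything slow lives in the loop, which is what keeps interrupt latency short.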

RTOS

A real-time operating system provides a scheduler, task abstractions, and inter-task communication primitives (queues, semaphores, mutexes). You divide your work into tasks, assign each a priority, and let the kernel decide which runs when.

"Real-time" here means the scheduler has bounded, predictable latency — not that it is fast. A good RTOS can guarantee that a high-priority task will start executing within microseconds of its trigger. Popular choices: FreeRTOS (the most widely deployed), Zephyr (growing fast, Linux Foundation project), ChibiOS, ThreadX (now part of Microsoft).

The case for bare metal

Predictability

When there is no scheduler, there is no scheduler to misconfigure. The code runs in exactly the order you wrote it, interrupts fire where you enabled them, and the entire system behaviour fits in one person's head. For safety-critical code, this simplicity is a virtue, not a limitation.

Minimal overhead

An RTOS needs a separate stack for every task, typically several hundred bytes to a kilobyte or more each, plus a control block per task and a fixed kernel footprint. A bare metal application has a single stack and pays none of the per-task cost. On an 8 KB RAM part, this matters. On a 2 KB RAM part, it is often the difference between feasible and not.

Simpler debugging

A crashed bare metal program has one thing to debug: your code. A crashed RTOS program might be a deadlock, a priority inversion, a stack overflow in the wrong task, a race between two tasks accessing shared state, or a kernel bug. The debugging surface area is multiplicatively larger.

No concurrency taxes

In a bare metal program, data structures are not shared across tasks because there are no tasks. You do not need mutexes, and you do not need to reason about one task seeing another's partial update. The only concurrency issue is interrupts versus the main loop, which you manage with volatile (so the compiler re-reads shared variables) and by briefly disabling interrupts around any access that is not atomic.
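As a sketch of the interrupt-versus-main-loop case, here is a 64-bit counter updated in an ISR and read from the main loop. The functions `irq_save` and `irq_restore` are hypothetical host stand-ins. On a real part they would map to whatever the toolchain provides for saving and restoring the interrupt mask.

```c
#include <assert.h>
#include <stdint.h>

/* Host stand-ins for the target's interrupt-mask operations (hypothetical
 * names). On target these would save the current mask state, disable
 * interrupts, and later restore the saved state. */
unsigned irq_lock_depth = 0;

unsigned irq_save(void)
{
    unsigned previous = irq_lock_depth;
    irq_lock_depth++;
    return previous;
}

void irq_restore(unsigned previous) { irq_lock_depth = previous; }

/* Written by the timer ISR, read by the main loop. */
volatile uint64_t uptime_ms = 0;

void timer_isr(void) { uptime_ms += 1; }

/* A 64-bit read takes several instructions on an 8-bit or 32-bit core, so
 * an interrupt landing mid-read could produce a torn value. Mask interrupts
 * around the copy and keep the masked region as short as possible. */
uint64_t uptime_read(void)
{
    unsigned saved = irq_save();
    assert(irq_lock_depth > 0);  /* host-only check: inside the critical section */
    uint64_t snapshot = uptime_ms;
    irq_restore(saved);
    return snapshot;
}
```

Saving and restoring the previous state, rather than unconditionally re-enabling interrupts, keeps the function safe to call from code that already has interrupts masked.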

The case for an RTOS

Genuine concurrency

If your product has five things happening simultaneously — reading a sensor, driving a display, handling a BLE connection, managing a charging circuit, responding to a button — structuring that as five tasks is genuinely cleaner than structuring it as a single state machine. The state-machine approach is possible, but past a certain complexity it becomes unreadable.

Mature inter-task communication

RTOS queues, semaphores, event groups, and mutexes are battle-tested primitives. Rolling your own ring buffer with interrupt-safe enqueue is something every embedded engineer eventually does, and every one of us has at least one subtle bug to our name. The RTOS version has been debugged by tens of thousands of users.
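For reference, this is roughly what the hand-rolled version looks like: a single-producer, single-consumer ring buffer, with the ISR as producer and the main loop as consumer. It is a sketch under stated assumptions, not a drop-in replacement for an RTOS queue. The names are hypothetical, it uses C11 atomics, and it is only safe with exactly one producer and one consumer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RB_CAPACITY 8u  /* must be a power of two so index masking works */
static_assert((RB_CAPACITY & (RB_CAPACITY - 1u)) == 0u,
              "RB_CAPACITY must be a power of two");

typedef struct {
    uint8_t     data[RB_CAPACITY];
    atomic_uint head;  /* free-running write index; only the producer stores it */
    atomic_uint tail;  /* free-running read index; only the consumer stores it */
} ring_buffer;

void rb_init(ring_buffer *rb)
{
    atomic_init(&rb->head, 0u);
    atomic_init(&rb->tail, 0u);
}

/* Producer side (for example a UART receive ISR). Returns false when full. */
bool rb_push(ring_buffer *rb, uint8_t byte)
{
    unsigned head = atomic_load_explicit(&rb->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&rb->tail, memory_order_acquire);
    if (head - tail == RB_CAPACITY) {
        return false;  /* full: drop the byte rather than overwrite unread data */
    }
    rb->data[head & (RB_CAPACITY - 1u)] = byte;
    /* Release: the data write above is visible before the new head is. */
    atomic_store_explicit(&rb->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side (the main loop). Returns false when empty. */
bool rb_pop(ring_buffer *rb, uint8_t *out)
{
    unsigned tail = atomic_load_explicit(&rb->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&rb->head, memory_order_acquire);
    if (head == tail) {
        return false;
    }
    *out = rb->data[tail & (RB_CAPACITY - 1u)];
    /* Release: the slot is marked free only after the read above. */
    atomic_store_explicit(&rb->tail, tail + 1u, memory_order_release);
    return true;
}
```

The subtle bugs tend to hide in the details shown here: the full-versus-empty distinction, index wraparound, and the ordering between writing the data and publishing the index.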

Pre-emptive priorities

The killer feature of an RTOS is preemption: a high-priority task interrupts a low-priority one. In a bare metal super-loop, a long-running operation in the main loop delays every other piece of loop-level work until it finishes; only interrupt handlers can cut in. In an RTOS, the critical task preempts whatever was running and responds within a short, bounded time.

Team scaling

Independent tasks can be owned by independent engineers. In a large firmware codebase with multiple subsystems (comms, UI, storage, sensors) each developed in parallel, an RTOS gives a natural boundary between them. The bare metal alternative tends to produce a main loop that touches every subsystem and becomes a coordination bottleneck.

Where each goes wrong

The failure modes are instructive, because they are usually the opposite of the reason the choice was made.

Bare metal failing

A bare metal project usually fails when the main loop gets so tangled with timing dependencies that nobody can add a feature without breaking something. The symptoms: a 500-line main function, comments like "// DO NOT TOUCH, breaks Bluetooth", and engineers afraid to refactor.

The fix is rarely "add an RTOS" — by this point the codebase has no task boundaries to migrate onto. The fix is usually "refactor into state machines with clear inputs and outputs". Do that cleanly and you may not need an RTOS at all.

RTOS failing

An RTOS project usually fails in one of three ways. First, priority inversion: a low-priority task holds a resource a high-priority task needs, and a medium-priority task preempts the low-priority one, blocking the high-priority task indefinitely. Second, stack overflow: a task's stack is sized too small, it silently corrupts another task's stack, and the system hangs hours later with no apparent cause. Third, deadlock: task A holds mutex X waiting for Y, task B holds Y waiting for X.

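Priority inversion and mutex deadlock cannot happen in bare metal firmware that has no tasks and no locks, and with a single stack there is only one size to get right. All three are routinely debugged in RTOS projects, and they are why RTOS firmware is harder to reason about than it looks.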
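Of the three, stack overflow is the easiest to catch early. One common technique is to fill each stack with a known pattern at start-up and later count how much of the pattern survives. The sketch below shows the idea on a plain byte array. The function names and fill value are hypothetical, and it assumes a stack that grows towards lower addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL 0xA5u  /* arbitrary pattern; any recognisable byte works */

/* Fill a stack region with the pattern before the task starts running. */
void stack_paint(uint8_t *stack, size_t size)
{
    assert(stack != NULL);
    memset(stack, STACK_FILL, size);
}

/* Count bytes that still hold the pattern, scanning up from the lowest
 * address. Assumes the stack grows downward, so never-used bytes sit at
 * the low end of the region. */
size_t stack_unused_bytes(const uint8_t *stack, size_t size)
{
    size_t unused = 0;
    while (unused < size && stack[unused] == STACK_FILL) {
        unused++;
    }
    return unused;
}
```

A result near zero means the stack is close to overflowing. The count can overestimate slightly if the task happens to write the fill value at its deepest point. FreeRTOS ships its own versions of this idea in `uxTaskGetStackHighWaterMark()` and the `configCHECK_FOR_STACK_OVERFLOW` option.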

A decision framework

We make this call by asking four questions:

1. How much RAM do you have?

Under 8 KB: bare metal, almost certainly. Between 8 and 32 KB: borderline, depends on concurrency needs. Over 32 KB: an RTOS is comfortable.

2. How many genuinely independent activities are running?

One or two: bare metal wins on simplicity. Three to five: either can work, RTOS starts to pay off. Six or more: an RTOS is probably the right answer.

3. What are the timing requirements?

If a single task has a hard real-time deadline and nothing else matters much, bare metal with an interrupt handler beats an RTOS. If multiple tasks have soft real-time deadlines and you need to meet all of them most of the time, an RTOS scheduler is exactly what you want.

4. What is the team size?

Solo engineer or pair: bare metal keeps the cognitive load low. Team of five or more: an RTOS gives you task boundaries that reduce coordination overhead.

flowchart TD
    Start([Starting a new firmware project]) --> RAM{RAM budget?}
    RAM -->|< 8 KB| Bare[Bare metal]
    RAM -->|8-32 KB| Tasks{Independent activities?}
    RAM -->|> 32 KB| Timing{Hard real-time needed?}
    Tasks -->|1-2| Bare
    Tasks -->|3-5| Coop[Cooperative scheduler or RTOS]
    Tasks -->|6+| RTOS[RTOS]
    Timing -->|Single critical task| Hybrid[RTOS + interrupt handlers for critical work]
    Timing -->|Multiple soft deadlines| RTOS
    Timing -->|None| RTOS
    style Bare fill:#fef3c7,stroke:#92400e,color:#451a03
    style Coop fill:#e0e7ff,stroke:#3730a3,color:#1e1b4b
    style RTOS fill:#dbeafe,stroke:#1e40af,color:#0c1e3b
    style Hybrid fill:#dbeafe,stroke:#1e40af,color:#0c1e3b

Decision flow for choosing between bare metal, a cooperative scheduler, and a full RTOS.
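The same flow can be written as a small function, which makes the thresholds explicit and easy to argue about. This is only a transcription of the flowchart. The type and function names are invented for illustration, and team size, the fourth question, is not part of the flow, so it is left out here too.

```c
#include <assert.h>

typedef enum {
    CHOICE_BARE_METAL,
    CHOICE_COOPERATIVE_OR_RTOS,
    CHOICE_RTOS,
    CHOICE_RTOS_PLUS_ISR  /* RTOS with interrupt handlers for critical work */
} firmware_choice;

typedef enum {
    TIMING_NONE,
    TIMING_SINGLE_CRITICAL,
    TIMING_MULTIPLE_SOFT
} timing_need;

firmware_choice recommend(unsigned ram_kb, unsigned activities, timing_need timing)
{
    assert(activities >= 1);

    if (ram_kb < 8) {
        return CHOICE_BARE_METAL;
    }
    if (ram_kb <= 32) {
        /* Borderline RAM: the number of independent activities decides. */
        if (activities <= 2) return CHOICE_BARE_METAL;
        if (activities <= 5) return CHOICE_COOPERATIVE_OR_RTOS;
        return CHOICE_RTOS;
    }
    /* Plenty of RAM: an RTOS either way, with critical work kept in ISRs
     * when a single task has a hard deadline. */
    return (timing == TIMING_SINGLE_CRITICAL) ? CHOICE_RTOS_PLUS_ISR
                                              : CHOICE_RTOS;
}
```

Treat the output as a starting point for discussion rather than a verdict. The thresholds are rules of thumb.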

The third option: a simple cooperative scheduler

If you want task-like structure without the full weight of an RTOS, a cooperative scheduler is a middle ground worth knowing about. You define a set of functions that each run to completion, and a scheduler loops through them in priority order. No preemption, no stacks per task, no kernel — but the mental model of "independent tasks" is preserved.
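A minimal sketch of such a scheduler follows, assuming each task runs periodically and that a millisecond tick is available from somewhere (a timer interrupt, typically). The struct layout, function names, and the two example tasks are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*task_fn)(void);

typedef struct {
    task_fn  run;          /* must return quickly; nothing can preempt it */
    uint32_t period_ms;    /* how often the task should run */
    uint32_t next_due_ms;  /* tick at which it should run next */
} coop_task;

/* Run every task that is due, in table order (index 0 = highest priority). */
void coop_run_once(coop_task *tasks, size_t count, uint32_t now_ms)
{
    for (size_t i = 0; i < count; i++) {
        assert(tasks[i].run != NULL);
        /* The signed difference stays correct when the tick counter wraps. */
        if ((int32_t)(now_ms - tasks[i].next_due_ms) >= 0) {
            tasks[i].run();
            tasks[i].next_due_ms += tasks[i].period_ms;
        }
    }
}

/* Example tasks: counters stand in for real work. */
unsigned sensor_runs = 0;
unsigned display_runs = 0;
void sensor_task(void)  { sensor_runs++; }
void display_task(void) { display_runs++; }
```

On target, the main loop would build a table such as `{ sensor_task, 10, 0 }, { display_task, 50, 0 }` and call `coop_run_once()` with the current tick on every pass. The catch is that one slow task delays all the others, so each function has to be written to return quickly.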

Protothreads (by Adam Dunkels) is the classic implementation: lightweight, stackless, written in ~50 lines of C macros. Contiki OS is built on this idea. Many shipping products use something similar without calling it a scheduler.
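The trick behind this style of stackless task is a switch statement that jumps back to where the function last gave up control. The sketch below illustrates the idea with made-up `CO_` macros. It is not the Protothreads library or its API, and the example task and its flags are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The only state kept between calls: the source line to resume at. */
typedef struct { int resume_line; } co_state;

#define CO_BEGIN(co) switch ((co)->resume_line) { case 0:
#define CO_WAIT_UNTIL(co, cond)            \
    do {                                   \
        (co)->resume_line = __LINE__;      \
        case __LINE__:                     \
        if (!(cond)) return false;         \
    } while (0)
#define CO_END(co) } (co)->resume_line = 0; return true

bool button_pressed = false;
bool ack_received = false;
int  messages_sent = 0;

/* Reads top to bottom like a blocking function, but returns to the caller
 * at each wait. Returns true once the whole sequence has finished. */
bool send_on_press(co_state *co)
{
    assert(co != NULL);
    CO_BEGIN(co);
    CO_WAIT_UNTIL(co, button_pressed);
    messages_sent++;
    CO_WAIT_UNTIL(co, ack_received);
    CO_END(co);
}
```

The main loop calls `send_on_press()` on every pass with the same `co_state`. There are two limits to know about: local variables do not keep their values across a wait, because the function really does return, and two waits cannot share a source line.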

This option fits the surprisingly large space between "super-loop is getting messy" and "we need FreeRTOS". Before committing to an RTOS, try refactoring into cooperative tasks. Many projects stop there.

Frequently Asked Questions

Is FreeRTOS really free for commercial products?

Yes. The FreeRTOS kernel is MIT-licensed, which is about as permissive as it gets. You can ship it in a commercial product without paying and without open-sourcing your application code. The one obligation is the MIT requirement to retain the copyright and permission notice in copies of the software.

How much RAM does FreeRTOS need?

The kernel code itself takes around 5-10 KB of flash. The RAM cost is dominated by task stacks: each task needs its own, typically 256 bytes at minimum and more commonly 512-1024 bytes, plus a small control block per task. A minimal system with three tasks can fit in 8 KB of RAM, but you will be counting bytes. 16 KB is a more realistic floor.
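A back-of-envelope budget makes the byte counting concrete. The helper below is a sketch. The kernel and control block sizes passed to it are placeholder assumptions, not measured FreeRTOS figures, and the real numbers depend on the port and configuration.

```c
#include <assert.h>
#include <stddef.h>

/* Rough RAM estimate: fixed kernel state plus a stack and a control block
 * for every task. Queues, timers, and heap overhead are left out. */
size_t rtos_ram_bytes(size_t kernel_bytes, size_t task_count,
                      size_t stack_bytes_per_task, size_t tcb_bytes_per_task)
{
    assert(task_count > 0);
    return kernel_bytes + task_count * (stack_bytes_per_task + tcb_bytes_per_task);
}
```

With three tasks at 512 bytes of stack each, and assuming 100 bytes per control block and 500 bytes of kernel state, `rtos_ram_bytes(500, 3, 512, 100)` gives 2336 bytes. On an 8 KB part that leaves under 6 KB for everything else, including queues and the application's own data. One trap when turning the budget into code: FreeRTOS takes the stack depth passed to `xTaskCreate()` in words, not bytes.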

Should I use Zephyr instead of FreeRTOS for new projects?

If you are starting fresh on a 32-bit part with reasonable resources, Zephyr is worth a serious look. It ships with drivers, networking, and Bluetooth stacks out of the box. The cost is a steeper learning curve and a heavier footprint. FreeRTOS is simpler, smaller, and still the default for many product teams.

Can I mix bare metal with an RTOS?

Yes, and it is common. Time-critical work (motor control, DSP) runs in interrupt handlers that the RTOS does not schedule. Everything else runs as RTOS tasks. The hybrid is often the best of both worlds.

Share your thoughts

Worked with this in production and have a story to share, or disagree with a tradeoff? Email us at support@mybytenest.com — we read everything.