SuperTinyKernel™ RTOS 1.06.x
Lightweight, high-performance, deterministic, bare-metal C++ RTOS for resource-constrained embedded systems. MIT Open Source License.
Lightweight High-Performance Deterministic C++ RTOS for Embedded Systems
SuperTinyKernel™ RTOS (STK) is a lightweight, deterministic real-time operating system for resource-constrained embedded systems. Instead of providing large peripheral abstraction layers (HAL), STK focuses on a highly optimized preemptive scheduler with minimal runtime overhead and a very small memory footprint.
STK combines the control and transparency of bare-metal development with the structure and maintainability of modern, type-safe C++.
You get:
Like FreeRTOS and CMSIS-RTOS2, STK does not attempt to abstract or manage MCU peripherals.
STK is an open-source project developed at https://github.com/SuperTinyKernel-RTOS.
| Feature | Description |
|---|---|
| Soft real-time | No strict time slots, mixed cooperative (by tasks) and preemptive (by kernel) scheduling |
| Hard real-time (KERNEL_HRT) | Guaranteed execution window, deadline monitoring by the kernel |
| Static task model (KERNEL_STATIC) | Tasks created once at startup |
| Dynamic task model (KERNEL_DYNAMIC) | Tasks can be created and exit at runtime |
| Rich scheduling capabilities | All major scheduling strategies are supported: priority-less (Round-Robin), fixed-priority, weighted (SWRR), earliest-deadline-first (EDF), and mixed-criticality adaptive (MCAS/MCAS4) |
| Mixed-criticality | MCAS (2-level) and MCAS4 (4-level) adaptive strategies featuring SWRR-based group scheduling, automatic cascade escalation/recovery, and elastic CPU share adaptation driven by per-group EWMA execution-pressure estimation |
| Tick or Tickless modes | Fixed-interval periodic interrupts (Tick) for simplicity, or dynamic timer-based wakeups (Tickless, KERNEL_TICKLESS) to maximize CPU sleep duration and power efficiency |
| Extensible via C++ interfaces | Kernel functionality can be extended by implementing available C++ interfaces |
| Multi-core support (AMP) | One STK instance per physical core for optimal, lock-free performance |
| Memory Protection Unit (MPU) support | Privileged (ACCESS_PRIVILEGED) and unprivileged (ACCESS_USER) tasks |
| Low-power aware | MCU enters sleep when no task is runnable (all tasks are sleeping) |
| Synchronization API | Rich set of primitives in stk::sync namespace |
| Memory API | Deterministic, fragmentation-free allocator in stk::memory namespace |
| Thread-Local Storage (TLS) | Per-task TLS held in a dedicated CPU register and accessed through inline zero-overhead helpers |
| Tiny footprint | Minimal code unrelated to scheduling |
| Safety-critical systems ready | No dynamic heap memory allocation — a required baseline for IEC 61508 / ISO 26262 / DO-178C certification. See Professional Services for certification support. |
| C++ and C API | Can be used easily in C++ and C projects |
| CMSIS-RTOS2 compatible | Full CMSIS-RTOS2 wrapper (cmsis_os2_stk.cpp) maps the standard ARM CMSIS-RTOS2 C API onto STK, enabling drop-in compatibility with STM32CubeMX, MCUXpresso, and other CMSIS-aware middleware |
| FreeRTOS compatible | Full FreeRTOS wrapper (freertos_stk.cpp) maps the standard FreeRTOS C API onto STK, enabling drop-in migration of existing FreeRTOS codebases with minimal or no application changes |
| Easy porting | Requires little to no BSP surface |
| Traceable | Scheduling is fully traceable with SEGGER SystemView |
| Development mode (x86) | Run the same threaded application on Windows |
| 100% test coverage | Every source-code line of scheduler logic is covered by unit tests |
| QEMU test coverage | All repository commits are automatically covered by unit tests executed on QEMU for Cortex-M and RISC-V |
There are several tickless examples:
STK is one of the few lightweight RTOSes that offer all popular switching strategies to match any usage scenario. See stk/strategy for more details.
| Strategy Name | Mode | Description |
|---|---|---|
| SwitchStrategyRoundRobin / SwitchStrategyRR | Soft / HRT | Round-Robin scheduling strategy (Default). Each runnable task receives one time slice per tick in turn. Allows 100% CPU utilization. |
| SwitchStrategySmoothWeightedRoundRobin / SwitchStrategySWRR | Soft / HRT | Smooth Weighted Round-Robin (SWRR). Distributes CPU time proportionally to per-task weights with burst-free interleaving. On each tick: every task's current weight is incremented by its static weight; the task with the highest current weight runs and then has the total weight sum deducted. Includes a wake-up priority boost to prevent I/O-bound task starvation. |
| SwitchStrategyFixedPriority | Soft / HRT | Fixed-Priority Round-Robin. Tasks have fixed priorities (up to 32 levels); same-priority tasks are scheduled in Round-Robin order. Kernel supports Priority Inheritance automatically. Behavior is similar to FreeRTOS's scheduler. |
| SwitchStrategyRM | HRT | Rate-Monotonic (RM). Assigns fixed priorities based on task periodicity — shorter period means higher priority. Optimal among all fixed-priority policies for independent periodic tasks. Includes WCRT schedulability analysis. |
| SwitchStrategyDM | HRT | Deadline-Monotonic (DM). Assigns fixed priorities based on task deadlines — shorter deadline means higher priority. Generalizes RM; optimal when deadlines ≤ periods. Includes WCRT schedulability analysis. |
| SwitchStrategyEDF | HRT | Earliest Deadline First (EDF). Selects the runnable task with the smallest relative deadline (deadline − elapsed_duration) via an O(n) linear scan each tick. Provably optimal for single-processor systems — if a feasible schedule exists, EDF will find it. |
| SwitchStrategyMCAS 🔒 | HRT | Mixed-Criticality Adaptive Scheduler (2-level). SWRR within each criticality group (LO / HI) with automatic escalation to a protected HI-only mode on budget overrun. Commercial License |
| SwitchStrategyMCAS4 🔒 | HRT | Mixed-Criticality Adaptive Scheduler (4-level). Extends MCAS with four criticality levels, cascade escalation/recovery, and elastic CPU share adaptation via per-group EWMA pressure estimation. Commercial License |
| Custom | Soft / HRT | Custom algorithm implemented via the ITaskSwitchStrategy interface. By implementing the ITaskSwitchStrategy interface you can provide your own unique scheduling strategy without changing anything inside the kernel. |
🔒 Commercial strategies are available to commercial licensees. See the bottom of README.md for contact details.
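The SWRR selection rule described in the table above (credit every task with its static weight, run the task with the highest current weight, then deduct the total weight sum from it) can be illustrated with a minimal, self-contained sketch. This is not STK's implementation: `WeightedTask` and `SwrrPickNext` are hypothetical names, and the wake-up priority boost is omitted.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical task record for illustration; not an STK type.
struct WeightedTask {
    int32_t static_weight;   // configured CPU share
    int32_t current_weight;  // running credit, starts at 0
};

// One SWRR step: credit every task, pick the largest credit, charge the winner.
// Returns the index of the task to run, or tasks.size() if the list is empty.
// Ties are resolved in favor of the lowest index.
std::size_t SwrrPickNext(std::vector<WeightedTask>& tasks) {
    int32_t total = 0;
    std::size_t best = tasks.size();
    for (std::size_t i = 0; i < tasks.size(); ++i) {
        tasks[i].current_weight += tasks[i].static_weight;
        total += tasks[i].static_weight;
        if (best == tasks.size() ||
            tasks[i].current_weight > tasks[best].current_weight) {
            best = i;
        }
    }
    if (best != tasks.size()) {
        tasks[best].current_weight -= total;  // deduct the total weight sum
    }
    return best;
}
```

With weights 5, 1 and 1, seven consecutive calls select tasks 0, 0, 1, 0, 2, 0, 0: the heavy task gets five of seven slots, but the light tasks are interleaved rather than pushed to the end of the cycle.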
On ARM Cortex-M3 and newer cores (M3/M4/M7/M33/M55/...) that implement the Armv7-M or Armv8-M architecture with a Memory Protection Unit (MPU), STK supports explicit privilege separation between tasks.
| Access Mode | Privileged (ACCESS_PRIVILEGED) | Unprivileged (ACCESS_USER) |
|---|---|---|
| CPU privilege level | Runs in Privileged Thread Mode | Runs in Unprivileged Thread Mode |
| Direct peripheral access | Allowed (normal register/bit-band access) | Blocked by the hardware (BusFault on any peripheral access) |
| Ability to call SVC / trigger PendSV | Yes | No (but STK services allow Sleep, Delay, Yield, CS, ...) |
| Ability to execute privileged instructions (CPS, MRS/MSR for control regs, etc.) | Yes | No |
| Typical use case | Drivers, hardware abstraction, critical infrastructure code | Application logic, protocol parsers, third-party or untrusted code |
Modern embedded systems increasingly process untrusted or complex data (network/USB packets, sensor data, firmware updates, etc.). By marking tasks that parse potentially attacker-controlled data as ACCESS_USER, you get hardware-enforced isolation:
STK supports multicore embedded microcontrollers (e.g., ARM Cortex-M55, dual-core Cortex-M33/M7/M0, or multicore RISC-V devices) through a per-core instance model (Asymmetric Multi-Processing).
The AMP design delivers maximum performance while keeping the STK kernel extremely lightweight:
| Feature | Description |
|---|---|
| Zero intercore overhead | No cross-core communication inside STK itself |
| Minimal latency | Scheduling decisions are local to the core |
| Full cache efficiency | All kernel data structures stay in the local core’s L1 cache |
| Independent timing domains | One core can run hard real-time tasks while another runs soft real-time or dynamic tasks |
| Simple and predictable | No complex SMP synchronization logic required in the kernel |
| No core congestion | Highest possible performance and deterministic timing on each individual core |
STK provides a rich set of synchronization primitives (see stk/sync) which are suitable for multicore synchronization with multiple STK instances.
There is a dual-core example for the Raspberry Pi Pico 2 W board with the RP2350 MCU in the build/example/project/eclipse/rpi/blinky-smp-rp2350w directory.
STK provides a first-class ultra-low power scheduling mode that goes beyond simple tickless idle. By combining KERNEL_TICKLESS with a custom stk::IPlatform::IEventOverrider, the application can select the deepest appropriate hardware sleep state on each idle entry with full kernel suspension and graceful resume support.
When all tasks are sleeping, the kernel calls IEventOverrider::OnSleep(sleep_ticks) instead of spinning. The application overrides this hook to drive the MCU into the most power-efficient state that still guarantees a wake-up within the required deadline.
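The decision such a hook has to make can be sketched without any STK headers. Everything below is hypothetical: `SleepState`, `SleepStateInfo`, `kStates`, `SelectSleepState` and the latency numbers are illustrative only, and the actual `IEventOverrider::OnSleep` signature should be taken from the STK sources.

```cpp
#include <cstdint>

// Hypothetical sleep states, ordered from shallowest to deepest.
enum class SleepState : uint8_t { Idle, LightSleep, DeepSleep };

struct SleepStateInfo {
    SleepState state;
    uint32_t   wakeup_latency_ticks;  // assumed worst-case resume time, in kernel ticks
};

// Illustrative numbers only; real latencies come from the MCU datasheet.
static const SleepStateInfo kStates[] = {
    { SleepState::Idle,       0 },
    { SleepState::LightSleep, 2 },
    { SleepState::DeepSleep,  50 },
};

// Pick the deepest state whose wake-up latency is strictly shorter than the
// requested sleep duration, so the wake-up deadline is still met.
// Falls back to Idle when nothing else qualifies.
inline SleepState SelectSleepState(uint32_t sleep_ticks) {
    SleepState chosen = SleepState::Idle;
    for (const SleepStateInfo& s : kStates) {
        if (s.wakeup_latency_ticks < sleep_ticks) {
            chosen = s.state;  // table is ordered, so later matches are deeper
        }
    }
    return chosen;
}
```

An application hook would call something like `SelectSleepState(sleep_ticks)` and then program the corresponding low-power mode and wake-up timer for its specific MCU.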
STK supports ISR-safe kernel suspension, allowing an interrupt handler to pause and resume all task scheduling without race conditions. This is useful for entering a low-activity state (e.g. user-triggered standby) where tasks should be fully frozen until an external event occurs.
A complete ultra-low power demo targeting the STM32F407G-DISC1 board is available:
Note: A minimal set of CMSIS/BSP API is used by STK.
For seamless integration with C projects, STK provides a dedicated, fully-featured C API. See interop/c for the full reference and examples.
Quick example:
STK ships two ready-made compatibility wrappers that let you swap out your existing RTOS backend and replace it with STK without rewriting application code. Both wrappers live under interop/ and share the same design principles: thin translation of the source API onto STK primitives, ISR-safe where the original API requires it, and zero heap usage when the caller supplies static memory.
STK provides a complete CMSIS-RTOS2 compatibility layer (cmsis_os2_stk.cpp) that maps the standard ARM CMSIS-RTOS2 C API (cmsis_os2.h v2.3.0) onto the STK C++ kernel. This allows you to use STK as a drop-in RTOS backend in any project that targets the CMSIS-RTOS2 interface, including code generated by STM32CubeMX, MCUXpresso, or any other CMSIS-aware IDE or middleware stack.
Covered API groups: Kernel Management, Thread Management, Thread Flags, Event Flags, Mutex, Semaphore, Timer, Message Queue, Memory Pool.
Quick integration:
See interop/cmsis/rtos2 for the full API coverage table, design notes, and configuration macros.
STK provides a complete FreeRTOS compatibility layer (freertos_stk.cpp) that maps the standard FreeRTOS C API onto the STK C++ kernel. Existing FreeRTOS projects can migrate to STK with minimal or no changes to application code, while immediately gaining STK's lower scheduling overhead, reduced jitter, and smaller RAM footprint (see the Benchmark section).
Covered API groups: Kernel Control, Task Management, Queue, Semaphore / Mutex, Software Timers, Event Groups, Task Notifications.
Quick integration:
See interop/freertos for the full API coverage table, design notes, known limitations, and FreeRTOSConfig.h requirements.
| | CMSIS-RTOS2 | FreeRTOS |
|---|---|---|
| Source file | cmsis_os2_stk.cpp | freertos_stk.cpp |
| Header to include | cmsis_os2.h | FreeRTOS.h, task.h |
| Priority model | Linear map: osPriority 1–56 → STK levels 0–31 | Direct clamp: FreeRTOS 0–N → STK levels 0–31 |
| Max tasks macro | CMSIS_STK_MAX_THREADS (default: 16) | FREERTOS_STK_MAX_TASKS (default: 16) |
| Default stack macro | CMSIS_STK_DEFAULT_STACK_WORDS (default: 256) | FREERTOS_STK_DEFAULT_STACK_WORDS (default: 256) |
| Static allocation | cb_mem / cb_size attributes | xTaskCreateStatic / xQueueCreateStatic |
| Scheduler backend | SwitchStrategyFP32 | SwitchStrategyFP32 |
| Tickless support | STK_TICKLESS_IDLE=1 | STK_TICKLESS_IDLE=1 |
Note: Is a wrapper missing an important API that your project uses? Send an inquiry to contact@supertinykernel.org
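The two priority models in the comparison table can be illustrated with a small sketch. The function names are hypothetical, and the exact rounding used by the wrappers is an assumption: the table states only that osPriority 1–56 is mapped linearly onto STK levels 0–31 and that FreeRTOS priorities are clamped to the same range.

```cpp
#include <algorithm>
#include <cstdint>

constexpr uint32_t kStkMaxLevel = 31;  // STK levels 0..31, per the table

// Linear map of a CMSIS-RTOS2 priority (1..56) onto 0..31.
// Out-of-range inputs are clamped first; integer division rounds down
// (the rounding direction is an assumption of this sketch).
inline uint32_t MapCmsisPriority(uint32_t os_priority) {
    const uint32_t p = std::min<uint32_t>(std::max<uint32_t>(os_priority, 1u), 56u);
    return (p - 1u) * kStkMaxLevel / 55u;
}

// Direct clamp of a FreeRTOS priority (0..N) onto 0..31.
inline uint32_t ClampFreeRtosPriority(uint32_t prio) {
    return std::min(prio, kStkMaxLevel);
}
```

Under these assumptions the endpoints map exactly (1 to 0, 56 to 31), while FreeRTOS priorities above 31 all collapse onto level 31, which matters for projects that rely on more than 32 distinct priorities.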
STK provides a feature-rich synchronization API which is located in stk/sync and resides in a dedicated namespace stk::sync. It is a high-performance framework designed for both single-core and multicore embedded systems and provides a robust mechanism for inter-task and inter-core communication.
| Primitive | Description |
|---|---|
| hw::CriticalSection | Low-level primitive (including RAII version hw::CriticalSection::ScopedLock) that ensures atomicity by preventing preemption. Always available and independent of KERNEL_SYNC mode. |
| hw::SpinLock | High-performance non-recursive primitive for short critical sections. A key primitive for inter-core synchronization. Always available and independent of KERNEL_SYNC mode. |
| sync::ScopedCriticalSection | RAII wrapper around hw::CriticalSection (disables interrupts + multicore guard). Used as a building block by all other stk::sync primitives. Always available, independent of KERNEL_SYNC mode. |
| sync::ConditionVariable | Monitor-pattern signaling used with an IMutex-compatible lock. Atomically releases the lock and blocks the caller; re-acquires it on wake. Supports Wait(), NotifyOne(), and NotifyAll(). |
| sync::Event | Binary state-based signaling object. Supports manual-reset (wake all) and auto-reset (wake one) modes. Also provides Pulse() (Win32-compatible semantics) and non-blocking TryWait(). |
| sync::EventFlags | 32-bit multi-flag synchronization group. Each bit is an independent event; tasks can wait for any one (OPT_WAIT_ANY) or all (OPT_WAIT_ALL) requested bits. Matched bits are auto-cleared on wake; pass OPT_NO_CLEAR to keep them set. ISR-safe Set(), Clear(), and Get(). |
| sync::Mutex | Recursive mutual exclusion primitive. Tracks ownership and recursion depth; the same task may lock multiple times. Ownership transfers directly to the first waiter (FIFO) on Unlock(). |
| sync::RWMutex | Reader-Writer Lock for shared (read) and exclusive (write) access. Implements a Writer Preference policy to prevent writer starvation. Provides RAII guards ScopedTimedLock and ScopedTimedReadMutex. |
| sync::SpinLock | High-performance recursive spinlock for very short critical sections where context-switch overhead is unacceptable. Busy-waits until the lock is free; ISR-unsafe (use hw::CriticalSection from ISR context). |
| sync::Semaphore | Counting semaphore for resource throttling and signaling. Features a Direct Handover policy: Signal() passes the token directly to the first waiting task without touching the internal counter. |
| sync::Pipe | Thread-safe typed FIFO ring buffer for inter-task data passing. Parameterised on element type T and capacity N. Supports blocking and non-blocking single-element and bulk (WriteBulk / ReadBulk) I/O with zero dynamic memory allocation. |
| sync::MessageQueue | Fixed-capacity, fixed-message-size FIFO queue backed by a caller-supplied byte buffer. Parameterised on a byte count rather than a type, making it suitable for C-ABI structs or heterogeneous payloads. |
| Custom | Extensible: any class inheriting from ISyncObject can implement custom synchronization logic integrated with the kernel scheduler. |
Note: Synchronization support is enabled selectively by adding the KERNEL_SYNC flag to the kernel. If the application does not need sync primitives and KERNEL_SYNC is not set, the synchronization-related implementation is stripped by the compiler, saving FLASH and RAM.
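The wait semantics described for sync::EventFlags (wait for any or all requested bits, auto-clear matched bits unless OPT_NO_CLEAR is passed) can be illustrated with a standalone sketch. It models only the bit matching, not blocking or the kernel integration. The option names follow the table, but their numeric values, `FlagsResult` and `TryMatchFlags` are assumptions made for this example.

```cpp
#include <cstdint>

// Option bits: names mirror the table above, values are assumptions.
enum : uint32_t {
    OPT_WAIT_ANY = 0u,
    OPT_WAIT_ALL = 1u << 0,
    OPT_NO_CLEAR = 1u << 1,
};

struct FlagsResult {
    bool     satisfied;  // true if the wait condition is met
    uint32_t matched;    // requested bits that were set (0 if not satisfied)
};

// Checks a wait request against the current flag word.
// On success the matched bits are cleared unless OPT_NO_CLEAR is given.
inline FlagsResult TryMatchFlags(uint32_t& flags, uint32_t requested, uint32_t options) {
    const uint32_t hits = flags & requested;
    const bool ok = (options & OPT_WAIT_ALL) ? (hits == requested) : (hits != 0u);
    if (!ok) {
        return { false, 0u };  // flag word is left untouched
    }
    if ((options & OPT_NO_CLEAR) == 0u) {
        flags &= ~hits;        // auto-clear only the bits that matched
    }
    return { true, hits };
}
```

In a real primitive this check would run inside a critical section, and a task whose request is not satisfied would be blocked until a later Set() makes it succeed.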
STK provides a deterministic, fragmentation-free memory allocation module located in stk/memory under the stk::memory namespace. It is designed for embedded systems where dynamic heap allocation is undesirable or prohibited by coding standards (e.g. MISRA C++:2008 Rule 18-4-1).
A fixed-size block allocator for scenarios where the same block size is repeatedly allocated and released, such as: packet buffers, sensor records, or message payloads.
Internally, the pool maintains an intrusive singly-linked free-list inside the storage array itself. No separate metadata array is required, and alloc and free are O(1) with a minimal critical section.
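The intrusive free-list technique can be sketched as follows. This is a single-threaded illustration under stated assumptions, not the stk::memory implementation: `BlockPoolSketch` and its members are hypothetical names, and the critical section a real pool would take around each operation is omitted.

```cpp
#include <cstddef>
#include <new>

template <std::size_t BlockSize, std::size_t BlockCount>
class BlockPoolSketch {
    static_assert(BlockSize >= sizeof(void*), "a free block must hold the next pointer");
    static_assert(BlockCount > 0, "pool must contain at least one block");

public:
    BlockPoolSketch() : m_free(nullptr) {
        // Thread every block onto the free-list, last to first,
        // so that block 0 is handed out first.
        for (std::size_t i = BlockCount; i-- > 0; ) {
            m_free = ::new (static_cast<void*>(&m_storage[i * kStride])) Node{m_free};
        }
    }

    // O(1): pop the head of the free-list; nullptr when the pool is exhausted.
    void* Alloc() {
        if (m_free == nullptr) { return nullptr; }
        Node* n = m_free;
        m_free = n->next;
        return n;
    }

    // O(1): push the block back; the link is stored inside the block itself.
    void Free(void* p) {
        if (p == nullptr) { return; }
        m_free = ::new (p) Node{m_free};
    }

private:
    struct Node { Node* next; };

    static constexpr std::size_t kAlign  = alignof(std::max_align_t);
    static constexpr std::size_t kStride = (BlockSize + kAlign - 1) / kAlign * kAlign;

    alignas(std::max_align_t) unsigned char m_storage[kStride * BlockCount];
    Node* m_free;
};
```

Because a free block stores the link to the next free block in its own bytes, the only bookkeeping outside the storage array is the single head pointer.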
Key properties:
Scheduling can be analyzed with SEGGER SystemView.
A ready-to-try Blinky example with SEGGER SystemView tracing enabled is available in build/example/project/eclipse/stm/blinky-stm32f407g-disc1-segger
STK includes a full scheduling emulator for Windows to speed up prototype development:
STK has been tested on the following development boards:
Note: The list of tested boards does not limit STK’s compatibility. STK does not depend on a specific board and relies only on the underlying CPU architecture. As long as the target CPU is supported, STK can be integrated with your hardware platform.
| Coverage | Description |
|---|---|
| Platform-independent code | 100% unit test coverage |
| Platform-dependent code | tested under QEMU for each supported architecture |
Board: STM32F407G-DISC1, MCU: STM32F407VG (Cortex-M4 168MHz)
Update: April 2026
Compiler: GCC 14.2.1 (arm-none-eabi-gcc)
This table compares SuperTinyKernel RTOS v.1.06.0 and FreeRTOS V11.2.0 across two compiler optimization levels: -Os and -Ofast. The workload consists of a CRC32-based synthetic task running across multiple tasks/threads to measure scheduling overhead and timing determinism. Benchmark projects are located in build/benchmark/eclipse and the benchmark suite is located in build/benchmark/perf.
The benchmark suite uses CRC32 hash calculations as the task payload. The score represents the number of CRC32 calculations performed by the task within a fixed time window. A higher score indicates a more efficient scheduler, meaning the tasks have more available CPU time.
| Kernel | Tasks | Opt | Throughput | Average (per task) | Jitter | Flash Size | RAM Used |
|---|---|---|---|---|---|---|---|
| STK | 16 | -Ofast | 993,008 | 62,063 | 754 | 24.1 KB | 20.3 KB |
| FreeRTOS | 16 | -Ofast | 966,017 | 60,376 | 909 | 15.0 KB | 22.2 KB |
| STK | 16 | -Os | 752,136 | 47,008 | 425 | 14.9 KB | 20.3 KB |
| FreeRTOS | 16 | -Os | 735,342 | 45,958 | 472 | 12.6 KB | 22.2 KB |
| — | — | — | — | — | — | — | — |
| STK | 8 | -Ofast | 988,862 | 123,607 | 866 | 23.0 KB | 11.4 KB |
| FreeRTOS | 8 | -Ofast | 932,654 | 116,581 | 613 | 14.4 KB | 13.3 KB |
| STK | 8 | -Os | 753,013 | 94,126 | 659 | 14.9 KB | 11.4 KB |
| FreeRTOS | 8 | -Os | 713,292 | 89,161 | 468 | 12.6 KB | 13.2 KB |
| — | — | — | — | — | — | — | — |
| STK | 4 | -Ofast | 989,465 | 247,366 | 742 | 22.1 KB | 6.9 KB |
| FreeRTOS | 4 | -Ofast | 881,082 | 220,270 | 671 | 14.5 KB | 8.8 KB |
| STK | 4 | -Os | 753,459 | 188,364 | 564 | 14.9 KB | 6.9 KB |
| FreeRTOS | 4 | -Os | 673,845 | 168,461 | 510 | 12.6 KB | 8.8 KB |
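The figures in the table can be cross-checked with a little integer arithmetic. The sketch below assumes, based on the numbers themselves, that the Average column is the total throughput divided by the task count and rounded down. Both helper names are hypothetical.

```cpp
#include <cstdint>

// Per-task average score: total throughput divided by task count, rounded down.
inline uint32_t PerTaskAverage(uint32_t throughput, uint32_t tasks) {
    return throughput / tasks;
}

// Relative throughput gain of `a` over `b` in tenths of a percent,
// truncated toward zero (e.g. 123 means about 12.3%).
inline int32_t GainTenthsOfPercent(uint32_t a, uint32_t b) {
    const int64_t diff = static_cast<int64_t>(a) - static_cast<int64_t>(b);
    return static_cast<int32_t>(diff * 1000 / static_cast<int64_t>(b));
}
```

For example, `PerTaskAverage(993008, 16)` reproduces the 62,063 shown for STK with 16 tasks at -Ofast, and `GainTenthsOfPercent(989465, 881082)` yields 123, i.e. STK's throughput is about 12.3% higher than FreeRTOS's in the 4-task -Ofast rows.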
The fastest way to evaluate STK is to build and run one of the bundled examples on your local machine — no hardware required.
Prerequisites: Git, CMake 3.15+, and either Visual Studio (Windows) or GCC via Eclipse CDT (Windows/Linux/macOS).
You can build and run examples without any hardware on Windows.
with Visual Studio:
with Eclipse CDT:
To import project into Eclipse workspace:
You can use your own tools; the tools listed below are suggested only for a quick evaluation of STK's functionality using the provided examples.
For STM32, RPI platforms:
For NXP platforms:
For RISC-V platforms:
If you are targeting only ARM, RISC-V tools are not required.
All examples are located in build/example/project/eclipse folder.
Examples are grouped by platform:
STM32 and Raspberry Pi Pico examples include SDK files located in the deps/target folder.
Located in build/example/project/nxp-mcuxpresso folder.
Compatible with:
The example below toggles RGB LEDs on a development board. Each LED is controlled by its own thread, switching at 1 s intervals:
You can include STK in your project using git submodule or by copying the source into a libs/ or third_party/ folder:
Add the STK directory and link against STK:
Run your normal build procedure. STK will now be compiled and linked with your project.
STK can be integrated by simply copying its source files from stk/ folder.
This method is suitable for:
From the root of STK repository, copy:
into your project's source tree, for example:
Add the following include path to your project configuration:
In CMake:
In GCC/Makefile:
For example, for ARM Cortex-M4 project:
You must compile STK core sources from:
Minimum required sources:
Example (GCC, ARM Cortex-M MCU):
Build your project normally — STK will now be compiled together with it.
Note: With this method, only the /stk folder containing the STK kernel files is cloned; examples, deps, and everything else are omitted.
Porting STK to a new platform is straightforward. Platform-dependent files are located in:
STK's object-oriented design allows easy extension of its functionality. For example, you can develop and attach your own scheduling algorithm by implementing the ITaskSwitchStrategy interface.
STK is released under the MIT Open Source License.
You may freely use it in projects of any type:
While SuperTinyKernel™ RTOS is provided under the permissive MIT license, we offer dedicated professional services for organizations integrating STK into production-grade, mission-critical, or regulated environments.
For inquiries, contact: contact@supertinykernel.org