5 Hidden STM32 Features That Even Experienced Developers Miss

STM32 MCUs hide surprisingly useful hardware features that can save you CPU cycles, lower power, make multi-core life easier, and fix intermittent bugs — if you know they exist. This article reveals 5 “hidden” STM32 features (with concrete examples and code) that even experienced MCU engineers often forget, and exactly how to use them safely in production.
Feature 1 — Fast-mode Plus I²C (FM+, up to 1 MHz)
Fast-mode Plus (FM+) is an I²C mode that supports bus speeds up to 1 MHz: 10× faster than Standard-mode (100 kHz) and 2.5× faster than Fast-mode (400 kHz).
Why does FM+ exist?
Fast-mode I²C at 400 kHz can struggle to keep up with:

High-speed sensors

High-refresh displays

Real-time control loops

Large configuration registers in ICs

Interrupt-driven or polling-heavy systems

FM+ allows faster transfers without switching to SPI.
Why is FM+ “special” in STM32?
Many STM32 families provide:

FM+ capable pins with a stronger drive stage (FM+ requires sinking up to 20 mA on SDA/SCL)

Hardware filters designed to support 1 MHz edges

Automatic glitch filtering updated for high-speed modes

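Example: PB6/PB7 on STM32G4 can be switched to FM+ drive through SYSCFG. Pin assignments differ between families (on STM32F4, FM+ is only available on the parts that include a separate FMPI2C peripheral), so confirm the FM+ pins in the datasheet for your part.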
Quick code hint (conceptual):
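// Sketch: enable FM+ drive on PB6/PB7 (bit names as on STM32G4/L4; family-specific, check the reference manual)
RCC->APB2ENR |= RCC_APB2ENR_SYSCFGEN; // SYSCFG clock must be running before CFGR1 is written
SYSCFG->CFGR1 |= SYSCFG_CFGR1_I2C_PB6_FMP | SYSCFG_CFGR1_I2C_PB7_FMP;
// Then program I2C_TIMINGR for ~1 MHz using the CubeMX timing calculator or AN4235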
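To make the timing step less abstract, here is a small host-side sketch of the arithmetic behind I2C_TIMINGR on the newer I²C peripheral (F0/F3/L4/G4 style). The function names and the example values (48 MHz kernel clock, PRESC = 5, SCLL = SCLH = 3, SCLDEL = 1) are illustrative assumptions, not values from a datasheet; the real SCL period is slightly longer because of synchronization and filter delays, so treat the result as a nominal figure and validate with the CubeMX calculator.

```c
#include <assert.h>
#include <stdint.h>

/* Nominal SCL period in nanoseconds. Ignores the tSYNC1/tSYNC2 delays
 * (filters, edge detection) that make the real period slightly longer. */
static uint32_t i2c_scl_period_ns(uint32_t i2cclk_hz, uint8_t presc,
                                  uint8_t scll, uint8_t sclh)
{
    assert(i2cclk_hz != 0u);
    /* Kernel clock ticks per SCL period: (PRESC+1) * ((SCLL+1) + (SCLH+1)) */
    uint64_t ticks = (uint64_t)(presc + 1u) * ((scll + 1u) + (sclh + 1u));
    return (uint32_t)((ticks * 1000000000ull) / i2cclk_hz);
}

/* Pack the fields into the I2C_TIMINGR layout:
 * PRESC[31:28] SCLDEL[23:20] SDADEL[19:16] SCLH[15:8] SCLL[7:0]. */
static uint32_t i2c_timingr_pack(uint8_t presc, uint8_t scldel, uint8_t sdadel,
                                 uint8_t sclh, uint8_t scll)
{
    return ((uint32_t)(presc  & 0x0Fu) << 28) |
           ((uint32_t)(scldel & 0x0Fu) << 20) |
           ((uint32_t)(sdadel & 0x0Fu) << 16) |
           ((uint32_t)sclh << 8) |
            (uint32_t)scll;
}
```

With the assumed values, i2c_scl_period_ns(48000000u, 5, 3, 3) gives a nominal 1000 ns period (1 MHz), and i2c_timingr_pack(5, 1, 0, 3, 3) gives 0x50100303.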
 
Feature 2 — Hardware Semaphores (HSEM) — great for multi-core STM32
Hardware Semaphores are special hardware registers that act like locks. Two cores (e.g., Cortex-M7 and Cortex-M4 in STM32H7) can use these locks to coordinate access to:

Shared peripherals

Shared RAM regions

Shared communication buffers

Shared interrupt events

Boot coordination

This prevents race conditions and makes multi-core programming reliable.
Why do we need Hardware Semaphores?
In dual-core STM32 MCUs:

Both cores can read/write memory

Both cores can access I²C, SPI, GPIO

Both cores can modify shared buffers

Without protection, both cores may:

Write to the same memory at the same time

Configure a peripheral at the same time

Interrupt each other unexpectedly

Cause corrupted data, freezes, or hard faults

A hardware semaphore ensures that only ONE core owns a resource at a time.
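Example Pattern (1-step lock; names as in ST's STM32H7 headers, verify for your family):
// take: reading HSEM_RLRx locks the semaphore if it is free and returns the current owner
if (HSEM->RLR[semid] == (HSEM_CR_COREID_CURRENT | HSEM_RLR_LOCK)) {
 /* this core now owns the semaphore: use the shared resource here */
 HSEM->R[semid] = HSEM_CR_COREID_CURRENT; // release: write own COREID with LOCK = 0
}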
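The lock semantics are easier to reason about with a tiny host-side model. The sketch below is not driver code: sem_model_t, sem_write, sem_take and sem_release are hypothetical names, and the core IDs are placeholders. It models the 2-step (write, then read back) procedure, assuming the register layout LOCK = bit 31, COREID = bits 11:8, PROCID = bits 7:0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEM_LOCK        (1u << 31)   /* LOCK flag, bit 31 */
#define SEM_COREID_POS  8u           /* COREID field starts at bit 8 */
#define SEM_PROCID_MASK 0xFFu        /* PROCID field, bits 7:0 */

/* Placeholder core IDs for illustration; real values come from the device header. */
#define CORE_A 3u
#define CORE_B 1u

typedef struct { uint32_t r; } sem_model_t;   /* stands in for one HSEM_Rx register */

/* Model of a write to HSEM_Rx. */
static void sem_write(sem_model_t *s, uint32_t value)
{
    if (value & SEM_LOCK) {
        if ((s->r & SEM_LOCK) == 0u) {
            s->r = value;            /* free: the writer becomes the owner */
        }                            /* already locked: the write is ignored */
    } else if (s->r == (value | SEM_LOCK)) {
        s->r = 0u;                   /* owner wrote LOCK = 0: semaphore is freed */
    }
}

/* 2-step take: write the lock request, then read back to see who won. */
static bool sem_take(sem_model_t *s, uint32_t core, uint32_t proc)
{
    assert(core <= 0xFu);
    uint32_t want = SEM_LOCK | (core << SEM_COREID_POS) | (proc & SEM_PROCID_MASK);
    sem_write(s, want);
    return s->r == want;
}

/* Release only succeeds when COREID and PROCID match the current owner. */
static void sem_release(sem_model_t *s, uint32_t core, uint32_t proc)
{
    sem_write(s, (core << SEM_COREID_POS) | (proc & SEM_PROCID_MASK));
}
```

In this model, once CORE_A has taken the semaphore, a take or release attempt from CORE_B changes nothing until CORE_A releases it, which is the property that makes the hardware lock safe without any atomic read-modify-write in software.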
 
Feature 3 — Burst DMA
Burst DMA means the DMA controller transfers multiple data items in one go (a burst) instead of transferring them one-by-one.
Why are bursts faster?
Because each DMA transfer requires:

an address fetch

a bus request

a bus arbitration

a memory access

A burst packs many of these into one bus transaction.
Typical burst sizes:

Single (1 beat)

INCR4 (4 beats)

INCR8 (8 beats)

INCR16 (16 beats)
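A rough cost model shows why the burst length matters. This is a deliberately simplified sketch: dma_cycles is a hypothetical helper, and the per-transaction overhead (3 cycles here for request, arbitration and address phase) is a made-up figure, not a measured or documented one. Real numbers depend on the bus matrix, wait states and competing masters.

```c
#include <assert.h>
#include <stdint.h>

/* Bus cycles to move `beats` data items using bursts of `burst_len` beats.
 * Each bus transaction pays `overhead` cycles once, then one cycle per beat. */
static uint32_t dma_cycles(uint32_t beats, uint32_t burst_len, uint32_t overhead)
{
    assert(burst_len == 1u || burst_len == 4u || burst_len == 8u || burst_len == 16u);
    uint32_t transactions = (beats + burst_len - 1u) / burst_len;  /* round up */
    return transactions * overhead + beats;
}
```

Under these assumptions, moving 256 beats costs 1024 cycles as single transfers, 448 cycles with INCR4 and 304 cycles with INCR16: the data phase is unchanged, but the fixed overhead is paid 16 times instead of 256 times.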

Benefits:

Higher memory bandwidth

Lower bus overhead

Perfect for high-speed peripherals

Reduces CPU contention on the AHB/AXI bus

When to use Burst DMA?

Large data arrays

Audio buffers

Image frames

ADC multi-channel sequences

SPI/I2S streaming

Memory-to-memory copies

Feature 4 — DMA FIFO Mode
Each DMA stream has a small First-In-First-Out (FIFO) buffer (up to 4 words deep, depending on the MCU).
The FIFO acts like a mini-cache, allowing the DMA to:

accumulate data before writing

pack bursts more efficiently

avoid misaligned transfers

avoid bus stalls

adapt to different memory widths

Modes:

Direct mode (FIFO disabled)

FIFO mode enabled

FIFO threshold: ¼, ½, ¾, full
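The threshold and the memory burst size cannot be chosen independently: the bytes held at the threshold must be a whole number of bursts. The sketch below encodes that rule as I understand it from the FIFO threshold table in STM32F4/F7-class reference manuals, assuming a 16-byte (4-word) FIFO. fifo_burst_ok and the enum are hypothetical names, and the enum counts quarters of the FIFO rather than mirroring the register encoding, so check the table for your part before relying on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DMA_FIFO_BYTES 16u   /* 4 words, assumed FIFO depth */

/* Threshold expressed as a number of FIFO quarters (not the register encoding). */
typedef enum { FTH_1_4 = 1, FTH_1_2 = 2, FTH_3_4 = 3, FTH_FULL = 4 } fifo_threshold_t;

/* Returns true when the threshold level holds a whole number of memory bursts. */
static bool fifo_burst_ok(fifo_threshold_t fth, uint32_t msize_bytes, uint32_t burst_beats)
{
    assert(msize_bytes == 1u || msize_bytes == 2u || msize_bytes == 4u);
    assert(burst_beats == 1u || burst_beats == 4u || burst_beats == 8u || burst_beats == 16u);
    uint32_t threshold_bytes = (DMA_FIFO_BYTES / 4u) * (uint32_t)fth;
    uint32_t burst_bytes = msize_bytes * burst_beats;
    return (threshold_bytes % burst_bytes) == 0u;
}
```

For example, a word-wide INCR4 burst (16 bytes) only fits a full threshold, while a byte-wide INCR8 burst (8 bytes) fits the half and full levels but not the three-quarter level.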

Why is the FIFO useful?
Because without it, the DMA must transfer each element immediately, which is inefficient for:

different source/destination widths (e.g., 8→32 bit)

slow peripheral buses

high burst rates

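FIFO example:
The ADC outputs 16-bit samples, while memory is written most efficiently as 32-bit words.
Without the FIFO (direct mode), the memory transfer width is forced to match the peripheral width, so every sample costs its own 16-bit write. With the FIFO, the DMA packs two samples into each 32-bit write.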
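What the FIFO does in hardware can be pictured as the following software packing step. This is only an illustration of the data layout (pack_u16_to_u32 is a hypothetical helper, and the even-count restriction is a simplification); it assumes the little-endian ordering in which the first sample lands in the low half of the word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack pairs of 16-bit samples into 32-bit words, first sample in the low half.
 * Returns the number of words written to dst. */
static size_t pack_u16_to_u32(const uint16_t *src, size_t n_samples, uint32_t *dst)
{
    assert((n_samples % 2u) == 0u);   /* keep the sketch simple: even counts only */
    for (size_t i = 0; i < n_samples; i += 2u) {
        dst[i / 2u] = (uint32_t)src[i] | ((uint32_t)src[i + 1u] << 16);
    }
    return n_samples / 2u;
}
```

Four samples 0x1111, 0x2222, 0x3333, 0x4444 become the two words 0x22221111 and 0x44443333, so the memory bus sees half as many write accesses.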
 
Feature 5 — Double-Buffer Mode (Ping-Pong Mode)
Double-buffer mode allows DMA to use two memory buffers:

Buffer A

Buffer B

While DMA is filling one buffer, the CPU can process the other.
This solves a major real-time problem: no data loss during processing.
Perfect for:

Audio or voice streaming

Continuous ADC sampling

UART RX with high throughput

Real-time DSP

High-speed USB

Camera or video frame capture

Example:

DMA fills Buffer A

CPU processes Buffer B

DMA finishes → buffer swap

CPU now processes A

DMA now fills B

This is called ping-pong or circular double-buffering.
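The swap sequence above can be sketched as a small host-side simulation. Here pingpong_push plays the role of the DMA storing one sample, and its return value stands in for the transfer-complete interrupt that tells the CPU which buffer is ready. All names and the buffer length are hypothetical; on real hardware the DMA performs the swap by itself and the CPU only reacts to the interrupt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_LEN 4u

typedef struct {
    uint16_t buf[2][BUF_LEN];  /* Buffer A = buf[0], Buffer B = buf[1] */
    uint32_t fill_index;       /* buffer the "DMA" is currently filling */
    uint32_t pos;              /* next write position in that buffer */
} pingpong_t;

/* Simulates the DMA storing one sample. Returns the buffer that just became
 * full (ready for the CPU to process), or NULL if no buffer completed. */
static const uint16_t *pingpong_push(pingpong_t *pp, uint16_t sample)
{
    assert(pp->fill_index < 2u && pp->pos < BUF_LEN);
    pp->buf[pp->fill_index][pp->pos++] = sample;
    if (pp->pos < BUF_LEN) {
        return NULL;
    }
    const uint16_t *ready = pp->buf[pp->fill_index];
    pp->fill_index ^= 1u;      /* swap: the DMA moves on to the other buffer */
    pp->pos = 0u;
    return ready;
}
```

The CPU must finish with the returned buffer before the other one fills up; if processing takes longer than one buffer period, data is overwritten, so the buffer length sets the processing deadline.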
 
How these 3 features work together
The most powerful DMA setup is:
Burst + FIFO + Double-buffer
Example use cases:

High-speed ADC sampling → DSP pipeline

Ethernet RX/TX descriptors

I2S audio codec

SPI display streaming

Camera interface (DCMI)

What each feature contributes:

Double-buffer mode ensures no pauses or data loss

FIFO mode ensures bus efficiency and alignment

Burst mode ensures maximum throughput
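As a concrete picture of the combination, the sketch below builds the control values for a 16-bit peripheral feeding 32-bit memory with double-buffering, FIFO mode and INCR4 memory bursts. The bit positions are my reading of the DMA_SxCR and DMA_SxFCR layout on STM32F4/F7-class stream DMA; the macro names and adc_stream_cfg are hypothetical, other families use different layouts, and a real driver would also set the channel/request, addresses and transfer count.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed DMA_SxCR / DMA_SxFCR bit positions (STM32F4/F7-style stream DMA). */
#define CR_CIRC          (1u << 8)    /* circular mode */
#define CR_MINC          (1u << 10)   /* increment memory address */
#define CR_PSIZE_16      (1u << 11)   /* peripheral data size: half-word */
#define CR_MSIZE_32      (2u << 13)   /* memory data size: word */
#define CR_DBM           (1u << 18)   /* double-buffer mode */
#define CR_MBURST_INCR4  (1u << 23)   /* memory burst of 4 beats */
#define FCR_FTH_FULL     (3u << 0)    /* FIFO threshold: full */
#define FCR_DMDIS        (1u << 2)    /* direct mode disabled = FIFO enabled */

typedef struct { uint32_t cr; uint32_t fcr; } dma_stream_cfg_t;

/* Peripheral-to-memory stream: double buffer + FIFO + INCR4 memory burst. */
static dma_stream_cfg_t adc_stream_cfg(void)
{
    dma_stream_cfg_t cfg;
    cfg.cr  = CR_DBM | CR_CIRC | CR_MINC | CR_PSIZE_16 | CR_MSIZE_32 | CR_MBURST_INCR4;
    cfg.fcr = FCR_DMDIS | FCR_FTH_FULL;   /* word INCR4 = 16 bytes, needs a full threshold */
    /* Sanity check: all three features are selected together. */
    assert((cfg.cr & CR_DBM) && (cfg.cr & CR_MBURST_INCR4) && (cfg.fcr & FCR_DMDIS));
    return cfg;
}
```

With these assumed positions the function yields cr = 0x00844D00 and fcr = 0x7. The full threshold is not arbitrary: a word-wide INCR4 burst is 16 bytes, which is exactly the depth of the FIFO.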
