Graduate II
July 23, 2024
Question

STM32 with small flash + HAL + DEBUG + no Optimization...


... leads to "funny" results:

STM32L011 with 16 kB of flash, HAL inits and LED toggle in main, DEBUG, no optimization:

93% of flash used

I had not expected it to be that bad! :grinning_face_with_sweat:

Okay, I will probably not use too much HAL and will use the Release build with optimization,
but I'd better select an L01 in the next bigger QFN package (4x4) to get 32 kB of flash.

STM32L011_HAL_DEBUG_noOpt.png


    16 replies

    LCEAuthor
    Graduate II
    August 12, 2024

    Atmel's AVRs were my main MCU over the last 20 years or so... surely never used something like HAL or a library. :D

    Having recently worked so much with the H7, I thought I'd try a small STM32. For that project I would have had to pick an AVR that is new to us anyway, so I might as well go STM32.

    Anyway, I chose the L03 with 32 kB flash. And I love the Nucleos...

    Firmware's almost done. I only use LL stuff for clock setup and ADC, otherwise direct register access, and no float support for printf.

    Biggest chunk in flash is now my standard debug UART interface with lots of const char strings for printf, but I'm far from the 32 kB limit, using size optimization though.

    Graduate II
    August 12, 2024

    I've tried to migrate us from old SiLabs 8051s where possible. I need something for basic initialization / customization. It typically needs to be physically small but workable, so BGA and WSP are generally not favoured by the crew in PCBA, and neither is the more costly PCB.

    The 16KB L011 worked well until feature creep set in: people wanted configuration and diagnostics for production, and it got a bit too tight. 32KB would have more headroom, as all the libraries took their bite from the first 16KB.

    Graduate II
    August 12, 2024

    @LCE wrote:

    Biggest chunk in flash is now my standard debug UART interface with lots of const char strings for printf, but I'm far from the 32 kB limit, using size optimization though.

     

    I was sure something like this existed, and I was not disappointed:

    https://github.com/atomicobject/heatshrink

    The purported footprint of the decoder is about 1K flash + 100 bytes RAM (minimum; you can trade RAM usage for decoding speed).

    (Update: also found Segger SMASH. I'm sure others exist. It's an obvious idea.)

     

    If you have enough low-entropy data, it might provide a net gain. One approach to making the most efficient use of small controllers is to trade what you have in excess to get some more of what is scarce. Can you give away some SRAM to gain some Flash?
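To make the "spend SRAM, save flash" idea concrete, here is a minimal sketch. It is not heatshrink and not Segger SMASH; it swaps in a much simpler technique, a token dictionary, where fragments shared by many debug messages are stored once and each message refers to them by a one-byte token. All names (`dbg_words`, `dbg_expand`, the example messages) are hypothetical, and the scheme assumes the messages are plain 7-bit ASCII so that bytes >= 0x80 are free to act as tokens. Whether it pays off depends on how repetitive the strings really are.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical word table: fragments shared by many debug messages,
 * stored only once in flash (const data). */
static const char *const dbg_words[] = {
    "ADC ",   /* token 0x80 */
    "UART ",  /* token 0x81 */
    "init ",  /* token 0x82 */
    "error ", /* token 0x83 */
    "ok",     /* token 0x84 */
};
#define DBG_WORD_COUNT (sizeof dbg_words / sizeof dbg_words[0])

/* Messages as they would sit in flash: bytes >= 0x80 are tokens,
 * everything else is literal ASCII, 0 terminates the message. */
static const uint8_t msg_adc_ok[]   = { 0x80, 0x82, 0x84, 0 };     /* "ADC init ok"   */
static const uint8_t msg_uart_err[] = { 0x81, 0x83, '4', '2', 0 }; /* "UART error 42" */

/* Expand a tokenized message into a RAM buffer (this is the SRAM we
 * give away). Output is truncated to fit and always NUL-terminated.
 * Returns the number of characters written, excluding the NUL. */
size_t dbg_expand(const uint8_t *msg, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0) {
        return 0;
    }
    for (; *msg != 0; ++msg) {
        if (*msg >= 0x80u) {
            size_t idx = (size_t)(*msg - 0x80u);
            if (idx >= DBG_WORD_COUNT) {
                continue; /* unknown token: skip it */
            }
            const char *w = dbg_words[idx];
            while (*w != '\0' && n + 1 < out_size) {
                out[n++] = *w++;
            }
        } else if (n + 1 < out_size) {
            out[n++] = (char)*msg;
        }
    }
    out[n] = '\0';
    return n;
}
```

In use, the debug UART routine would call `dbg_expand(msg_adc_ok, buf, sizeof buf)` on a small stack or static buffer and then send `buf` as before. A general-purpose compressor such as heatshrink should do better on large blocks of data; the sketch only shows the shape of the trade.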

     

    LCEAuthor
    Graduate II
    August 13, 2024

    Interesting stuff from all of you! :thumbs_up:

    Yeah, the size... We have relatively small lot numbers, and in most products we have enough space to avoid WLCSP, BGA, and most often - if the manufacturer allows - even QFN, our "go-to" SMD format for passives is still 0603, nothing smaller if possible.
    This recent project with the L0 is very space limited, but our production manager still begged me to stay away from BGA and anything smaller than 0603 - while the sales manager side is asking for more features, you know the game... :grinning_face_with_sweat:

    Super User
    September 25, 2024
    LCEAuthor
    Graduate II
    September 26, 2024

    Interesting that the compiler automatically assumes double floating point calculations, these "__aeabi_dxxx" libs take some more space.

    I found that I can suppress this by adding "-fsingle-precision-constant" to the gcc flags in the compiler settings:

    LCE_0-1727330112817.png

     

    Gave me 4 kB flash back.

    Is that the correct way? Feels somehow not so elegant...

    Super User
    September 26, 2024

    @LCE wrote:

    Interesting that the compiler automatically assumes double floating point calculations,


    Indeed:

    https://mcuoneclipse.com/2019/03/29/be-aware-floating-point-operations-on-arm-cortex-m4f/
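For anyone wondering where the "__aeabi_dxxx" routines come from: in C, a floating-point constant without a suffix has type double, so mixing it with a float promotes the whole expression to double. The sketch below shows the difference; the function names and the 3.3 V / 4095 scaling are made up for illustration, and the size checks assume the usual case where float and double have different sizes. Suffixing the constants with `f` is an alternative to "-fsingle-precision-constant" that keeps the intent visible in the source, and GCC's `-Wdouble-promotion` warning can point out the places where a float is silently promoted.

```c
#include <assert.h>

/* Unsuffixed 3.3 is a double constant: raw is promoted to double, the
 * multiply and divide run in double precision, and the result is
 * converted back to float on return. On a core without an FPU this is
 * what pulls in the software double routines (__aeabi_dmul,
 * __aeabi_ddiv, __aeabi_d2f, ...). */
float volts_double_math(float raw)
{
    return raw * 3.3 / 4095;
}

/* Same calculation with f-suffixed constants: everything stays in
 * single precision, so only the float helpers are needed. */
float volts_single_math(float raw)
{
    return raw * 3.3f / 4095.0f;
}

/* Compile-time check of the expression types (C11). */
_Static_assert(sizeof(1.0f * 3.3) == sizeof(double),
               "unsuffixed constant promotes the expression to double");
_Static_assert(sizeof(1.0f * 3.3f) == sizeof(float),
               "f suffix keeps the expression in single precision");
```

Both functions return practically the same value for a 12-bit ADC reading; the difference is in which library routines end up in flash. Note that "-fsingle-precision-constant" changes the meaning of every unsuffixed constant in the files it is applied to, which may or may not be what you want.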

     

    Visitor II
    July 20, 2025

    Try "Optimize for Debug" (-Og) and compare the results. You should also look at the map file to see if HAL is actually taking up all that space...