STM32 with small flash + HAL + DEBUG + no Optimization...

July 23, 2024

... leads to "funny" results:

STM32L011 with 16 kB of flash, HAL inits and LED toggle in main, DEBUG, no optimization:

93% of flash used

I had not expected it to be that bad! :grinning_face_with_sweat:

Okay, I will probably not use too much HAL and will use the Release version with optimization,
but I'd better select an L01 in the next bigger QFN package (4x4) to get 32 kB of flash.

MM..1

Graduate II

Your screenshot shows many peripherals, so maybe your LED toggle counts bitcoins too ...

Andrew Neil

Super User

Show the results with "Optimise for Debug" (-Og).

You should also take a look at the map file - is it actually HAL that's using all that space ... ?

:thinking_face:

KnarfB

Super User

The Memory Details tab gives you more insights.

hth

KnarfB

LCE (Author)

Graduate II

Yes, luckily I know all that - it was just a bad surprise and might deter some complete STM32 beginners.

So, this wasn't actually a call for help, just ... being "amazed" at how things are. Until now I have only worked with "bigger" STM32s (G4, F7, H7), where using HAL occasionally or at project start wasn't a problem.

The list file showed mostly the "bloated" HAL inits - I actually forgot about the "Memory Details", thanks @KnarfB .

With release / optimization to size, it drops to 56%.

Record holder is HAL_RCC_OscConfig with 1.15 kB; together with HAL_RCC_ClockConfig and HAL_RCCEx_PeriphCLKConfig it takes up more than 10% of the 16 kB.
These are the HAL setup functions I always use - I guess I have to overcome my laziness in that area.

BTW - RAM: the default CubeMx settings for heap and stack also took up ~90% :D
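
To put the numbers in this post side by side, here is a minimal sketch in C that converts the reported percentages into bytes. The 16 kB flash size and the 93%, 56% and 1.15 kB figures come from the thread; the 2 kB SRAM size and the 0x200 heap / 0x400 stack values are assumptions about the part and the generated linker script, so check the datasheet and the .ld file of your own project. The helper names are made up for illustration.

```c
#include <stdint.h>

/* 16 kB flash is stated in the thread. Everything marked "assumption"
 * is NOT from the thread and should be checked against your project. */
#define FLASH_BYTES      (16u * 1024u)
#define SRAM_BYTES       (2u * 1024u)   /* assumption: SRAM size of the part */
#define MIN_HEAP_BYTES   0x200u         /* assumption: default heap reservation */
#define MIN_STACK_BYTES  0x400u         /* assumption: default stack reservation */

/* Share of `total` taken by `part`, in tenths of a percent, rounded.
 * Integer math only, so the helper does not pull in the float library. */
uint32_t permille(uint32_t part, uint32_t total)
{
    return (part * 1000u + total / 2u) / total;
}

/* Number of bytes that `percent` of `total` corresponds to (rounded down). */
uint32_t bytes_from_percent(uint32_t percent, uint32_t total)
{
    return (total * percent) / 100u;
}
```

Under those assumptions, `permille(MIN_HEAP_BYTES + MIN_STACK_BYTES, SRAM_BYTES)` gives 750, i.e. heap and stack alone would reserve 75% of the RAM before a single variable is counted. On the flash side, `permille(1178u, FLASH_BYTES)` gives 72, so the 1.15 kB record holder is roughly 7.2% of the 16 kB.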

MM..1

Graduate II

In CubeMX you can very easily switch some parts to LL.

LCE (Author)

Graduate II

If this MCU will be used in a product, I will probably throw out the HAL stuff anyway (I don't like the LL either, so direct register settings for me).

I was just really shocked that a basically empty project - except for the inits - with all defaults from CubeMX leads to this memory usage: > 90% for flash and RAM.
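
For readers wondering what "direct register settings" looks like next to the HAL inits, here is a minimal sketch, not the author's code. It assumes the usual STM32 GPIO layout with two mode bits per pin in MODER (01 = general-purpose output) and one bit per pin in ODR. The function names and the choice of pin are hypothetical, and the registers are passed in as pointers so the same code can be tried on a PC.

```c
#include <stdint.h>

/* Configure `pin` (0..15) as general-purpose output.
 * Assumption: MODER uses 2 bits per pin and 01 selects output mode. */
void gpio_set_output(volatile uint32_t *moder, unsigned pin)
{
    uint32_t shift = pin * 2u;
    uint32_t value = *moder;

    value &= ~(3u << shift);   /* clear both mode bits of this pin */
    value |=  (1u << shift);   /* 01 = output */
    *moder = value;
}

/* Toggle the pins selected by `pin_mask` with a read-modify-write of ODR. */
void led_toggle(volatile uint32_t *odr, uint32_t pin_mask)
{
    *odr ^= pin_mask;
}
```

On the target you would pass `&GPIOA->MODER` and `&GPIOA->ODR` from the CMSIS device header (port and pin depend on your board) after enabling the GPIO clock in RCC. Each function compiles to a handful of instructions, which is the contrast with the init functions listed earlier in the thread.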

Andrew Neil

Super User

@LCE wrote:
> I was just really shocked that a basically empty project - except for the inits.

Probably, the inits are where 90% of the code lies!

So what do you get if you don't init any peripherals?

And then there's the C runtime support - how much did that amount to?

LCE (Author)

Graduate II

@Andrew Neil

> Probably, the inits are where 90% of the code lies!

Definitely!

I just switched to its next big brother with 32 kB of flash and switched everything to LL instead of HAL.

Release with no Opt: 9.57 kB flash

Release with Opt for size: 4.14 kB flash <- I hadn't expected that with LL

> And then there's the C runtime support - how much did that amount to?

Erm... where's that? Isn't that for memory allocation and stuff like that - which I don't use?

Andrew Neil

Super User

@LCE wrote:
> @Andrew Neil
> Erm... where's that? Isn't that for memory allocation and stuff like that - which I don't use?

There's a lot more to it than that!

LCE (Author)

Graduate II

Okay, I'll check that.

I'm far from being a software guy, still have more experience with 8-bitters and FPGAs than with STM32...

waclawek.jan

Super User

> still have more experience with 8-bitters

And do you use modern development environments together with clicking configurators and abstraction libraries with the 8-bitters?

While I don't have first-hand experience with these, I'm quite confident these days you can find some of these being capable of filling up the FLASH of a low-end target mcu quite safely, too.

It's 21st century, after all.

JW

Andrew Neil

Super User

@waclawek.jan wrote:
> While I don't have first-hand experience with these, I'm quite confident these days you can find some of these being capable of filling up the FLASH of a low-end target mcu quite safely, too.

You certainly can - just take a look at ~~Atmel's~~ Microchip's ASF stuff for the AVRs ...

And, again, there's the C runtime - just a printf() is a great way to fill a small micro's Flash.

Or some floating-point ...
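
As an illustration of the point about printf(): when all you need is to print an unsigned number, a small hand-written conversion avoids linking the formatted-output machinery. The sketch below is one common way to do it, with a hypothetical function name; it is not taken from any ST library. Note that on a core without a hardware divider the `/` and `%` still call a small runtime helper, so the C runtime never disappears completely.

```c
#include <stddef.h>
#include <stdint.h>

/* Write `value` as NUL-terminated decimal text into `buf`.
 * Returns the number of digits written, or 0 if `buf` is too small.
 * A 32-bit value needs at most 10 digits plus the terminator. */
size_t u32_to_dec(uint32_t value, char *buf, size_t buf_len)
{
    char tmp[10];
    size_t n = 0;

    /* Produce the digits least-significant first. */
    do {
        tmp[n++] = (char)('0' + (value % 10u));
        value /= 10u;
    } while (value != 0u);

    if (buf_len < n + 1u) {
        return 0;
    }

    /* Copy them out in reverse so the text reads most-significant first. */
    for (size_t i = 0; i < n; i++) {
        buf[i] = tmp[n - 1u - i];
    }
    buf[n] = '\0';
    return n;
}
```

The resulting string can then be pushed out byte by byte over whatever UART routine the project already has; comparing the map file before and after replacing a printf() call shows how much was actually saved in a given build.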

BarryWhit

Graduate II

Also note that CubeIDE by default compiles/links with --gc-sections as an "optimization". I bet without that flag being on by default, the skeleton binary wouldn't even fit in flash...

That said, engineers usually pick these tiny parts either for high-volume cost-sensitive applications, or because they want to use the smallest package possible. I think it's a reasonable compromise to do development and debugging on a beefier part number from the same family, and then shoehorn the release version into the cheapest part number that will do, for actual production. There's some risk involved in this approach, but it makes sense if you're working under certain kinds of constraints.

Tesla DeLorean

Graduate II

Frequently SMALL is what I want, the 3x3 QFN20 is perfect, ST's not had a good game here, TSSOP20 is just too massive, and honestly I don't have the space-time to prototype with larger parts, nor have to deal with higher costs/complexity of PCB and BGA with pin-in-pad issues (fill and planarization). People demand boards the size of a postage stamp, not a post card.

MAXIM had some very small CM4F with relatively large FLASH, 96MHz, 256KB/96KB, 4x4mm, overkill for my application for sure, but demonstrative of what the right process/geometry can deliver, beyond CM0(+). This is how you exorcise 8-bit MCUs, not by barely replacing them, or forcing a redesign into 3V3. When you can pivot into an order of magnitude more effectiveness the choices/decisions get a lot simpler for the guys with bigger hats.

BarryWhit

Graduate II

There's constant pressure from China driving cost down and capabilities up. JLCPCB recently made via-in-pad free for 6 layers and up, and have had a 6 layer deal going for months now that makes them cheaper than 4 layers. There's still the complexity of layout to consider, but if someone really needs to get small then switching from pins to balls ultimately makes sense.

I've been meaning to try hand-assembling some WLCSP chips (they are marvels of miniature). But I'd probably have to go through caffeine withdrawal first, before attempting it - so maybe not.

Whenever I see a WLCSP, I can't help but think (by way of contrast) of those old spy movies featuring "cutting-edge" East German phone bugs - the size of a Motorola brick phone.

Tesla DeLorean

Graduate II

The 16KB part is pretty tight, especially with the float library pulled in.

For the L011 there's likely a drop-in C0 or G0 that can maintain a small package with larger memory.

STM32C011F6U6TR
IC MCU 32BIT 32KB FLASH 20UFQFPN (3x3 mm)
