STM32U535 Backup registers access wait states
Hi all,
I am running into an issue related to writing backup registers on the STM32U535xx MCU. When I compile my code with -Oz, the write fails, whereas with -Os, it succeeds. The relevant C code is pretty straightforward:
LL_PWR_EnableBkUpAccess();
LL_RTC_BKP_SetRegister(RTC, LL_RTC_BKP_DR0, value);
LL_PWR_DisableBkUpAccess();
I ended up comparing the generated assembly at the two optimization levels and found the difference shown below:

Left is compiled with -Oz and is not working (the write is not performed). Right is compiled with -Os and works as expected. The instruction labeled "store" is the STR operation which writes to TAMP->BKP0R. The instruction labeled "lock" writes to PWR->DBPR (disabling access to the backup domain).
As you can see, with -Oz there are 2 instructions between the "store" and "lock" operations, whereas with -Os there are 3. Indeed, when I add a __NOP() before the LL_PWR_DisableBkUpAccess() call, it works as expected even with -Oz. Lowering SYSCLK from 160 MHz to 16 MHz does not change the behavior, so the required delay appears to scale with the clock rather than being a fixed time. This seems to indicate that a write to the backup registers needs about 3 additional wait cycles on this controller before access to the backup domain can be disabled.
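A less timing-dependent alternative to the __NOP() might be to read the register back before disabling access, so that the "lock" no longer depends on how many instructions the compiler happens to place after the "store". Below is a minimal, untested sketch of that idea. bkp_write_verified() and its max_polls parameter are hypothetical names, not part of the ST LL/HAL API, and whether a read-back is actually sufficient on the STM32U535xx is an assumption that would need to be confirmed on hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not an ST LL/HAL function): write `value` to `reg`,
 * then poll the register until it reads back as `value` or until
 * `max_polls` reads have been made. Returns true if the value was seen. */
static bool bkp_write_verified(volatile uint32_t *reg, uint32_t value,
                               uint32_t max_polls)
{
    *reg = value;

    /* `reg` is volatile, so every iteration performs a real load. */
    for (uint32_t i = 0; i < max_polls; ++i) {
        if (*reg == value) {
            return true;
        }
    }
    return false;
}

/* Intended use on target (assumes the CMSIS device header and LL drivers
 * are included; the poll count of 16 is an arbitrary placeholder):
 *
 *   LL_PWR_EnableBkUpAccess();
 *   bool ok = bkp_write_verified(&TAMP->BKP0R, value, 16U);
 *   LL_PWR_DisableBkUpAccess();
 */
```

The return value also gives a way to detect a failed write at runtime, which the plain __NOP() workaround does not.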
The same code run on an STM32U575xx works as expected, even with only 2 instructions between the "store" and "lock" instructions.
I was not able to find any information regarding backup register latency in the STM32U5 reference manual, datasheet or device errata. However, I did find a presentation for STM32C0 which indicates that backup register writes do require 3 extra wait cycles. Is this also the case on STM32U535xx? Are there other STM32 controllers where this is a known issue?
