Brian H
Senior
November 4, 2022
Solved

Why might setting BFB2 cause printf to stop working?

I've written a fairly simple test case for using the system bootloader to boot from flash bank 2 on an STM32F429ZIT6U (developing on a Nucleo F429ZI board). With stdio redirected to a UART, I'm able to direct the code to:

  1. Mass-Erase Bank 2
  2. Copy the entirety of Bank 1 to Bank 2
  3. Set the BFB2 bit

The function that sets BFB2 looks like this:

void boot_bank_select(uint8_t bank_bit) {
    FLASH_AdvOBProgramInitTypeDef optionbits = {
        .OptionType = OPTIONBYTE_BOOTCONFIG,
        .BootConfig = (bank_bit) ? OB_DUAL_BOOT_ENABLE : OB_DUAL_BOOT_DISABLE
    };
    HAL_FLASH_OB_Unlock();
    HAL_FLASHEx_AdvOBProgram(&optionbits);
    printf("Waiting for option bit write...\r\n");
    HAL_StatusTypeDef rc = HAL_FLASH_OB_Launch();
    if(rc == HAL_OK)
        printf("Bank select completed.\r\n");
    else
        printf("Bank select failed.\r\n");
}

By placing a few strategic breakpoints, I can see that once the option bit write takes place, printf starts causing an endless reboot cycle. Code in main() executes up to the first place printf() is called, then it reboots.

Looking at the disassembly, I can see a point where a "bx lr" jumps into the middle of the boilerplate startup code, so I don't think an actual fault or reset is occurring; it's more likely a case of stack corruption. However, the situation persists through hard resets; I can only regain control of the chip by using STProg to clear the BFB2 option bit.

I'm at a loss. Bank 2 is identical to Bank 1; I know this because the code itself copies it directly, and I can download and compare the two banks via STProg and see that they're identical. So why would having the bank switched cause the code to jump to strange places?

I can provide all of the code if necessary.


gbm
Principal
November 5, 2022

It may be something related to cache and prefetch buffer operation. Try disabling all of these mechanisms before bank switching. If this doesn't help, then move the bank switch routine to RAM and execute it from there. In such a case the routine should not call any routines from Flash, neither HAL nor printf; just write your own routine equivalent to HAL_FLASH_OB_Launch.

My STM32 stuff on github - compact USB device stack and more: https://github.com/gbm-ii/gbmUSBdevice
Pavel A.
Super User
November 5, 2022

While waiting for better replies, replace the printfs with direct UART output (HAL_UART_Transmit...).

and call __ISB() after HAL_FLASH_OB_Launch (this may be too late, though...)

Brian H
Author
Senior
November 7, 2022

Thank you both for your answers. They are sensible suggestions, but there's a wrinkle that perhaps I didn't express clearly: the problem persists after a hard reset / power cycle. I wouldn't expect cache / prefetch / etc. problems to remain after the chip has been fully reset. At that point, the cache and pipeline are flushed anyway, no? Starting from a hard reset, with an identical image in bank 2, I don't understand why the behavior should be any different from bank 1.

I'll still try moving the bank select into RAM (and removing all calls to external methods) just to see what happens.

Pavel A.
Super User
November 8, 2022

> once the option bit write takes place, printf starts causing an endless reboot cycle. Code in main() executes up to the first place printf() is called

Do you mean that the first printf in your snippet, line 8, already causes reboot?

Are there earlier prints in main() that work?

Brian H
Author
Senior
November 8, 2022

Here's the first few bits of main(), up to the first printf:

int main(void) {
    HAL_Init();
    SystemClock_Config();
    MX_GPIO_Init();
    MX_CRC_Init();
    MX_USART3_UART_Init();
    printf("Good morning!\r\n");
    printf("Waiting for debugger...\r\n");
    // ...
 

If BFB2 is set and I set a breakpoint at line 5, this happens:

  1. Hit breakpoint at line 5
  2. "Step over" and hit line 6
  3. "Step over" and hit line 7
  4. "Step over" and back to 1 (the breakpoint at line 5 is hit)

If I use ST-Prog to clear the BFB2 option bit, the entire code runs as expected.

Brian H
Author
Senior
November 8, 2022

Ok, the printf thing is definitely a red herring. Here's my latest modification to main, which examines the SYSCFG->MEMRMP register to see which bank is booted, and does nothing but run a timer and blink an LED:

int main(void) {
    HAL_Init();
    SystemClock_Config();
    MX_GPIO_Init();
    MX_CRC_Init();
    MX_USART3_UART_Init();
    uint32_t timer = 0;
    if(SYSCFG->MEMRMP & 0x100) {
        while (1) {
            if ((++timer % BLINK_DELAY) == 0) {
                uint32_t odr = LD1_GPIO_Port->ODR;
                LD1_GPIO_Port->BSRR = ((odr & LD1_Pin) << 16) | (~odr & LD1_Pin);
            }
        }
    } else {
        while ((huart3.Instance->SR & 0x20) == 0) {
            if ((++timer % BLINK_DELAY) == 0) {
                uint32_t odr = LD3_GPIO_Port->ODR;
                LD3_GPIO_Port->BSRR = ((odr & LD3_Pin) << 16) | (~odr & LD3_Pin);
            }
        }
        peek_char();
    }

The not-remapped version is a bit more complicated so that I can break out of the blink phase and move on to the rest of the application.
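For reference, the bank test in that snippet boils down to checking one bit of the register value. A minimal off-target sketch, assuming bit 8 (the 0x100 mask used above) is the one that reports the swapped-bank mapping; the macro and function names are made up for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Same mask as the literal 0x100 in main(): bit 8 of SYSCFG->MEMRMP.
   The name is hypothetical, chosen only for this sketch. */
#define MEMRMP_BANK_SWAP_MASK (1u << 8)

/* Takes the register value as a plain argument so it can be exercised
   without hardware; on target, pass SYSCFG->MEMRMP. */
static bool booted_with_banks_swapped(uint32_t memrmp)
{
    return (memrmp & MEMRMP_BANK_SWAP_MASK) != 0u;
}
```

On the board this would be called as booted_with_banks_swapped(SYSCFG->MEMRMP) in place of the inline mask test.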

Once again, when BFB2 is cleared, the application runs completely as expected. If BFB2 is set, I can set a breakpoint at line 11 that never gets hit and execution winds up back at the top of main. A breakpoint at line 10 does get hit.

I'm really stymied. It doesn't seem to be related to crossing compilation unit boundaries, because HAL_Init() is in a separate compilation unit.

Pavel A.
Super User
November 8, 2022

Set breakpoints in disassembly view, to be sure.

Even use a hardcoded breakpoint: __BKPT(n)

Then step by instruction.

Brian H
Author
Senior
November 8, 2022

Thanks for the input, Pavel. That is actually exactly what I've been doing lately. Here's a snippet:

167 			if ((++timer % BLINK_DELAY) == 0) {
0800075e: ldr r3, [r7, #36] ; 0x24
08000760: adds r3, #1
08000762: str r3, [r7, #36] ; 0x24
08000764: ldr r2, [r7, #36] ; 0x24
08000766: lsrs r3, r2, #3
08000768: ldr r1, [pc, #540] ; (0x8000988 <main+592>)
0800076a: umull r1, r3, r1, r3
0800076e: lsrs r3, r3, #8
08000770: movw r1, #25000 ; 0x61a8
08000774: mul.w r3, r1, r3
08000778: subs r3, r2, r3
0800077a: cmp r3, #0
0800077c: bne.n 0x800075e <main+38>
169 				 uint32_t odr = LD1_GPIO_Port->ODR;
0800077e: ldr r3, [pc, #524] ; (0x800098c <main+596>)
08000780: ldr r3, [r3, #20]
08000782: str r3, [r7, #4]

I can put a breakpoint at line 13 (the cmp) and line 16 (the instruction if the branch is not taken). I also put a breakpoint at the top of ResetHandler. If, while the debugger is stopped at line 13, I manually change the value in r3 to 0, single-step, then I wind up at line 16 like I should and, a few single-steps later, the LED changes state. But if I clear the breakpoint at line 13 and just hit "continue", I wind up back in the reset handler. Further, no bits are set in any of the fault registers.

Edit: Expanded the amount of disassembly; updated line number references in the text

Brian H
Author, best answer
Senior
November 14, 2022

OH MY $DEITY I SOLVED IT.

Based on this one short paragraph in AN4767 that I managed to gloss over before:

"Beware that the VTOR reset value is zero, in case of BFB2 option active, it will by default point to system memory."

So yes, of course, if BFB2 is set and VTOR isn't re-pointed appropriately, the very first interrupt that comes along after the bank switch will send execution back into system memory. That's why I first saw it in the context of printf, which led to UART interrupts.

If the first instruction in main (or, really, any instruction that runs before the first interrupt) points VTOR back at 0x0800 0000, the core stays happy in application code.
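A minimal sketch of that fix, written so it can be tried off-target. The helper name, the pointer parameter, and the 512-byte alignment check are assumptions for illustration (the alignment figure assumes a vector table of up to 128 entries); on the real part the write is simply an assignment to SCB->VTOR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: main flash base on STM32F4, and a 512-byte alignment
   requirement for a vector table of up to 128 word-sized entries. */
#define APP_FLASH_BASE       0x08000000u
#define VTOR_ALIGNMENT_BYTES 0x200u

/* Writes table_base to the register behind vtor if it is suitably
   aligned. On target, pass &SCB->VTOR; the pointer parameter exists
   only so the sketch can be exercised without hardware. */
static bool relocate_vector_table(volatile uint32_t *vtor, uint32_t table_base)
{
    if ((table_base & (VTOR_ALIGNMENT_BYTES - 1u)) != 0u) {
        return false;   /* misaligned base: leave VTOR untouched */
    }
    *vtor = table_base;
    /* On target, follow the write with __DSB(); __ISB(); */
    return true;
}
```

On the board, the call would be relocate_vector_table(&SCB->VTOR, APP_FLASH_BASE) as the first statement of main(), before HAL_Init() starts SysTick and before any UART interrupt is enabled.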

scott.5490429380353516E12
Associate
March 1, 2024

I had this same problem, but an older build made with Atollic did not. Your solution had me look at SystemInit(). In the Atollic version it contained the following, so VTOR was always set:
/* Configure the Vector Table location add offset address ------------------*/
#ifdef VECT_TAB_SRAM
SCB->VTOR = SRAM_BASE | VECT_TAB_OFFSET; /* Vector Table Relocation in Internal SRAM */
#else
SCB->VTOR = FLASH_BASE | VECT_TAB_OFFSET; /* Vector Table Relocation in Internal FLASH */
#endif
But the newer version has:
#if defined(USER_VECT_TAB_ADDRESS)
/* Configure the Vector Table location -------------------------------------*/
SCB->VTOR = VECT_TAB_BASE_ADDRESS | VECT_TAB_OFFSET;
#endif

By default USER_VECT_TAB_ADDRESS is not defined, so this assignment is compiled out and VTOR keeps its reset value.
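A host-side sketch of how that newer conditional behaves. The way VECT_TAB_BASE_ADDRESS is selected here is my assumption about the surrounding file, not something quoted above, and the base addresses are the usual STM32F4 values restated locally; vtor_after_system_init() is a made-up stand-in for the effect of SystemInit() on VTOR.

```c
#include <stdint.h>

#define FLASH_BASE 0x08000000u  /* usual STM32F4 value, restated here */
#define SRAM_BASE  0x20000000u

#define USER_VECT_TAB_ADDRESS   /* comment out to leave VTOR untouched */
/* #define VECT_TAB_SRAM */

#if defined(USER_VECT_TAB_ADDRESS)
#if defined(VECT_TAB_SRAM)
#define VECT_TAB_BASE_ADDRESS SRAM_BASE   /* assumed selection logic */
#else
#define VECT_TAB_BASE_ADDRESS FLASH_BASE
#endif
#define VECT_TAB_OFFSET 0x00000000u
#endif

/* Returns what VTOR would hold after SystemInit(), given its reset value. */
static uint32_t vtor_after_system_init(uint32_t reset_value)
{
#if defined(USER_VECT_TAB_ADDRESS)
    (void)reset_value;
    return VECT_TAB_BASE_ADDRESS | VECT_TAB_OFFSET;
#else
    return reset_value;   /* nothing touches VTOR: it stays at 0 */
#endif
}
```

In a real project the same effect would come from defining USER_VECT_TAB_ADDRESS in the build settings or in the system file, so that the assignment quoted above is compiled in.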

I thought I would provide this nugget of information: BFB2 cannot be changed once you have configured RDP level 2. So you must instead keep BFB2 set and use Bank 2 as your temporary code holder.
Step 1. Running in Bank 1, erase Bank 2 and then copy the new firmware into Bank 2, but DO NOT write the first 32- or 64-bit value.
Step 2. Once the last block is received and all your internal CRC checks show that all data has arrived, do that first write last and soft reset. Reason: with BFB2 set, the bootloader checks whether the first word of Bank 2 is a RAM address (a valid initial stack pointer), so code will still run from Bank 1 if a problem happens during writing.
Step 3. After reset, the new code runs from Bank 2. At start-up, the code checks whether it is running in Bank 2. If so, it erases Bank 1 and copies Bank 2 into Bank 1. This can be a straight copy because Bank 2 remains primary and Bank 1 is not checked.
Step 4. Now Bank 1 and Bank 2 have the same code, but the code is still running from Bank 2. You must now have a RAM function that simply erases Bank 2 and does a software reset.
Step 5. The microcontroller is back in its normal condition, where Bank 1 contains the code, Bank 2 is erased, and BFB2 is still set. Some reference documents say code in Bank 2 will run when BFB2 is set (STM32L4x6), but actually the bootloader just checks whether Bank 2 has a valid initial stack pointer by looking at the first word. If it is not valid, it runs the code in Bank 1.
Extra: If you must save parameters in flash that must be retained (i.e. defaults will not work), code and parameters need to fit into one bank, and the erase of Bank 2 only needs to cover the first block, where the initial stack pointer is located.
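The "write the first word last" idea from Steps 1 and 2 can be sketched off-target with a plain array standing in for Bank 2. Everything here is hypothetical: the SRAM range is an assumption for a part with 192 KB of main SRAM, the stack pointer check only approximates whatever the bootloader really does, and on hardware each store would be a flash programming call rather than an array write.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SRAM_START  0x20000000u
#define SRAM_END    0x20030000u   /* one past the last byte (assumed) */
#define ERASED_WORD 0xFFFFFFFFu

/* Rough stand-in for the bootloader's "valid initial stack pointer" test. */
static bool looks_like_stack_pointer(uint32_t word)
{
    return word > SRAM_START && word <= SRAM_END && (word & 0x3u) == 0u;
}

/* Copies image into bank, leaving word 0 until the rest is verified.
   If verification fails, word 0 stays erased, so the bank is never
   mistaken for a bootable image. */
static bool stage_image(uint32_t *bank, const uint32_t *image, size_t words)
{
    for (size_t i = 1; i < words; ++i) {
        bank[i] = image[i];
    }
    for (size_t i = 1; i < words; ++i) {
        if (bank[i] != image[i]) {
            return false;          /* word 0 is still ERASED_WORD */
        }
    }
    bank[0] = image[0];            /* commit: first word written last */
    return true;
}
```

Until stage_image() returns true, looks_like_stack_pointer(bank[0]) stays false, which is the property Step 2 relies on to fall back to Bank 1 after an interrupted update.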