How to perform independent CRC on internal flash in STM32H7 (avoiding flash ECC / bus fault)?
I have a bootloader that verifies the integrity of an application image before transferring execution to it. It does this by computing a CRC over the image in software. However, if the internal flash is corrupted somewhere inside the application in a way that produces a double ECC error, reading that location raises a bus fault exception.
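For reference, the software check looks roughly like the sketch below. It assumes the common reflected CRC-32 (polynomial 0xEDB88320), which may differ from the CRC variant in a given bootloader, and the names `crc32_update` and `image_crc_ok` are made up for illustration. The point is that every byte of the image is read by the CPU, so a single flash word with a double ECC error is enough to fault in the middle of the loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected, polynomial 0xEDB88320, init and final XOR 0xFFFFFFFF).
 * Assumption: this CRC variant is chosen for illustration only.
 * Pass 0 as crc for the first chunk, or the previous result to continue. */
uint32_t crc32_update(uint32_t crc, const uint8_t *data, size_t len)
{
    crc = ~crc;
    while (len--) {
        crc ^= *data++;  /* this byte load is where an ECC error would fault */
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial if the bit shifted out was 1. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Hypothetical helper: returns 1 if the image matches the expected CRC, else 0. */
int image_crc_ok(const uint8_t *image, size_t len, uint32_t expected)
{
    return crc32_update(0u, image, len) == expected;
}
```

On the target, `image` would point at the application's start address in internal flash and `expected` would come from wherever the image stores its checksum.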
I've tried several methods of handling the bus fault, but so far none have worked. Does anyone have a suggestion on how to handle ECC errors in internal flash?
Things I've Tried that Did not Work
Skip the Offending Instruction
In the bus fault handler, increment the stacked program counter past the faulting instruction and return. This unfortunately didn't work because the CRC algorithm was compiled into a fused instruction that loads from the offending location and increments the offset at the same time (LDRB.W r2, [r3], #1). As it's a fused instruction, skipping it means that the offset is never incremented, so the loop retries the same address and we just end up in an infinite loop. Although I could rewrite the algorithm to avoid this issue, it would likely become something specialized just for this chip family and this issue, which is something I wish to avoid.
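To make the attempt concrete, here is a minimal sketch of the skip logic, written so it can be exercised off-target. The struct mirrors the basic eight-word frame a Cortex-M core stacks on exception entry (no FPU state); `exception_frame_t`, `thumb_insn_size` and `skip_faulting_insn` are hypothetical names, and a real handler would first need a small assembly shim to pick MSP or PSP and pass the frame pointer in.

```c
#include <stdint.h>

/* Registers stacked by the core on exception entry (basic frame, no FPU state). */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} exception_frame_t;

/* A Thumb instruction is 32 bits wide when the top five bits of its first
 * halfword are 0b11101, 0b11110 or 0b11111; otherwise it is 16 bits wide. */
unsigned thumb_insn_size(uint16_t first_halfword)
{
    return ((first_halfword >> 11) >= 0x1Du) ? 4u : 2u;
}

/* Hypothetical helper: advance the stacked PC past the faulting instruction.
 * On the target, first_halfword would be read from the address in frame->pc.
 * Only the PC changes: any side effect of the skipped instruction, such as
 * the post-increment of r3 in "LDRB.W r2, [r3], #1", never happens. */
void skip_faulting_insn(exception_frame_t *frame, uint16_t first_halfword)
{
    frame->pc += thumb_insn_size(first_halfword);
}
```

The first halfword of LDRB.W r2, [r3], #1 is 0xF813, so the stacked PC moves forward by 4 while the stacked r3 is left untouched, which is exactly why the loop keeps hitting the same address.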
Enable Internal Flash Interrupts
I enabled the flash interrupts (for both single and double ECC errors) in the hope that this would somehow circumvent the bus fault. Unfortunately the bus fault still fires first, followed by the flash interrupt handler, and there is still no way to correct the issue or make forward progress.
Disable Interrupts/Faults
Pavel A in https://community.st.com/s/question/0D53W00000Jkw8ZSAR/stm32h7-flash-write-returns-ok-yet-hard-fault-during-readback suggested that disabling faults with a "cpsid if" might work. Unfortunately that just leads to a double fault and lockup.
Things I'll Try Next
- Using the flash CRC engine to do a CRC precheck. If it reports a CRC failure on a corrupted region instead of raising a bus fault, then I can tell whether the memory region is safe to access before reading it from software.
- Issue a flash erase command inside the bus fault handler.
Reproducing this Issue
If anyone wants to play, the easiest way I've found of causing this corruption is to program the same flash word twice without an erase in between.
I ran across this forum post as well, but as far as I can tell no one has a good answer there either: https://community.st.com/s/question/0D50X0000AX8Hm3SQF/stm32h7-internal-flash-error.
