Graduate
April 5, 2024
Solved

STM32H74 ECC Errors


Hi,

andy_long_0-1712312892764.png

I have some questions about the ECC functionality on the STM32H74x.

  1. How can we know whether the data can be corrected in the DEDF and DEBWDF cases?
  2. In my case I always see the DEDF and SEDF fields set. These bits are already set when execution reaches main(), so I cleared them forcibly by writing 0x03 to M3SR and M4SR (with execution stopped at the red line). But the bits were set again as soon as I stepped to the next instruction (see the snapshot below). Why is this so?

andy_long_0-1712315036609.png

andy_long_1-1712315066857.png

 


    4 replies

    Technical Moderator
    April 5, 2024

    Hello,

    ECC Single error is detected and automatically corrected by HW.

    ECC Double error is detected but not corrected.
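
    To illustrate "single error corrected, double error only detected", here is a minimal host-side sketch of a SEC-DED code on a 4-bit value (extended Hamming(8,4)). It is not the code the STM32 hardware uses, which works on much wider words; the names secded_encode, secded_decode and the ecc_result values are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this illustration only. */
enum ecc_result { ECC_OK = 0, ECC_CORRECTED = 1, ECC_UNCORRECTABLE = 2 };

static unsigned bit(uint8_t w, unsigned i) { return (w >> i) & 1u; }

/* Encode a 4-bit value into an 8-bit extended Hamming codeword.
 * Bits 3, 5, 6, 7 hold data; bits 1, 2, 4 are Hamming parity;
 * bit 0 is an overall parity over bits 1..7. */
uint8_t secded_encode(uint8_t nibble)
{
    uint8_t w = 0;
    unsigned i, parity = 0;

    w |= (uint8_t)(((nibble >> 0) & 1u) << 3);
    w |= (uint8_t)(((nibble >> 1) & 1u) << 5);
    w |= (uint8_t)(((nibble >> 2) & 1u) << 6);
    w |= (uint8_t)(((nibble >> 3) & 1u) << 7);
    w |= (uint8_t)((bit(w, 3) ^ bit(w, 5) ^ bit(w, 7)) << 1);
    w |= (uint8_t)((bit(w, 3) ^ bit(w, 6) ^ bit(w, 7)) << 2);
    w |= (uint8_t)((bit(w, 5) ^ bit(w, 6) ^ bit(w, 7)) << 4);
    for (i = 1; i < 8; ++i) {
        parity ^= bit(w, i);
    }
    w |= (uint8_t)parity;
    return w;
}

/* Decode a codeword. A single flipped bit is corrected in the returned
 * data; two flipped bits are detected but cannot be corrected. */
enum ecc_result secded_decode(uint8_t w, uint8_t *nibble_out)
{
    unsigned i, syndrome = 0, parity = 0;
    enum ecc_result result = ECC_OK;

    assert(nibble_out != NULL);
    for (i = 0; i < 8; ++i) {
        if (bit(w, i)) {
            syndrome ^= i;   /* XOR of the positions of all set bits */
            parity ^= 1u;    /* overall parity of the codeword */
        }
    }
    if (parity != 0u) {
        /* Odd number of flips: assume one, located at 'syndrome'. */
        w ^= (uint8_t)(1u << syndrome);
        result = ECC_CORRECTED;
    } else if (syndrome != 0u) {
        /* Even number of flips with a non-zero syndrome: double error. */
        return ECC_UNCORRECTABLE;
    }
    *nibble_out = (uint8_t)(bit(w, 3) | (bit(w, 5) << 1) |
                            (bit(w, 6) << 2) | (bit(w, 7) << 3));
    return result;
}
```

    Note that the decoder only fixes the value it hands back; the stored codeword is unchanged until something writes it again.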

    andy_longAuthor
    Graduate
    April 5, 2024

    Thank you for your prompt reply, @mƎALLEm. I had to update my question; could you please have another look?

    Why does the flowchart check "if the data can be corrected" for a double error if it cannot be corrected?

    Technical Moderator
    April 5, 2024

    You didn't mention the document you're referring to. After some searching, it turns out that you're referring to AN5342, "How to use error correction code (ECC) management for internal memories protection on STM32 MCUs".

    @Bubbles could help you on this.

    ST Employee
    April 5, 2024

    PS: regarding the second question, maybe your code is reading some uninitialized memory. Each time a read is done, the ECC is checked and the error pops up again. Clearing the error flags won't help; you need to rewrite the faulty memory location.
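
    One way to make sure no uninitialized location is ever read is to write the whole RAM region once at startup, before anything uses it. The sketch below assumes a 32-bit ECC word and full-word aligned writes; ecc_ram_init is a hypothetical helper name, and the region start and length would have to come from your linker script.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: give every word in the region a valid ECC by
 * writing it once with a full-width store. Nothing is read, so no ECC
 * check is triggered on the (possibly random) power-up contents.
 * Assumption: the ECC word is 32 bits wide for this memory. */
void ecc_ram_init(volatile uint32_t *start, size_t n_words)
{
    size_t i;

    assert(start != NULL);
    for (i = 0; i < n_words; ++i) {
        start[i] = 0u;
    }
}
```

    The status flags would then be cleared once after this loop, rather than before the faulty locations have been rewritten.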

    andy_longAuthor
    Graduate
    April 5, 2024

    @Bubbles Thank you for your replies. Point 1 is clear.

    The code snippet I showed runs before the scheduler is started, which means no other code is running concurrently. Some memories are not initialized, but my understanding is that you get a single/double error when you try to read uninitialized memory.

    In my case, as mentioned before, I am clearing the status registers (but not rewriting the memory location) at the point the red line shows, and a single step in the debugger then shows the single-error and double-error bits set. Is this because the previous faulty memory location has not been rewritten (even though I am not reading that location)?

    BubblesAnswer
    ST Employee
    April 8, 2024

    Hello @andy_long ,

    the source code is not really telling, I'd need to see the disassembly and the CPU registers step by step to have full picture.

    But there is not much depth to the ECC topic; it's really simple. Each time a memory location is read, the ECC is checked. The location may be more than one word; it can even be 128 bits, depending on the memory type. If a single-bit error is corrected, it is corrected only in the data read, not in the original location. To prevent a double-bit error from developing, it's advised to write the corrected data back over the faulty location, removing the single-bit error. This way, if another error occurs, it will again be only a single-bit error.
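
    The write-back described above could be sketched roughly as follows. This assumes a 32-bit ECC word and that nothing else (interrupt, DMA) modifies the word between the read and the write; ecc_scrub is a hypothetical name, not an ST API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read each word and store it straight back.
 * On ECC-protected RAM the read is expected to return corrected data,
 * and the write stores that data together with a fresh ECC, so a
 * single-bit error does not stay in the array.
 * Assumption: 32-bit ECC word, no concurrent writer to the region. */
void ecc_scrub(volatile uint32_t *region, size_t n_words)
{
    size_t i;

    assert(region != NULL);
    for (i = 0; i < n_words; ++i) {
        uint32_t value = region[i];
        region[i] = value;
    }
}
```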

    You said you are not reading the same location, but two adjacent variables in SRAM may in some cases share the same ECC.
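
    As a rough check of whether two variables can share an ECC word, you can compare their addresses divided by the ECC word size. The sketch assumes naturally aligned ECC words; the word size you pass in (for example 4 or 8 bytes) is an assumption to verify against the reference manual for the memory in question, and same_ecc_word is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if both addresses fall inside the same
 * naturally aligned ECC word of ecc_word_bytes bytes, otherwise 0. */
int same_ecc_word(uintptr_t addr_a, uintptr_t addr_b, uintptr_t ecc_word_bytes)
{
    assert(ecc_word_bytes != 0u);
    return (addr_a / ecc_word_bytes) == (addr_b / ecc_word_bytes);
}
```

    For example, with an 8-byte ECC word, two adjacent 32-bit variables at offsets 0 and 4 of the same aligned block share one ECC, so reading one of them also checks the other.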

    BR,

    J

    andy_longAuthor
    Graduate
    April 10, 2024

    @Bubbles 

    Thanks... I will have a look