Visitor II

Solved

How to debug a hard fault exception for STM32?

October 26, 2018
5 replies
10458 views

I am currently using an STM32F407 with FreeRTOS in the Keil MDK IDE. Recently, when I added a new function to my project (an OSPF routing algorithm, which is very complex and involves many files), I began to encounter a hard fault exception. I used the Call Stack window to check the context of the exception and found that it occurs in the uart_send function. Since I redirect printf to uart_send and use it to print debug information, uart_send is the most frequently called function. When I comment out the newly added OSPF function, everything works: there is no hard fault exception and uart_send runs fine. So the hard fault exception comes from the newly added OSPF function. However, I have no idea which part of the OSPF function goes wrong, and the Call Stack window gave nothing useful.

I have no idea what to do next. Could anyone tell me how to proceed? Is there a procedure for debugging a hard fault exception?

Thanks in advance!

Best answer by yanhc

Finally, after I changed the stack size of the newly added OSPF task from 500 to 2000, everything works. However, I still cannot figure out how a task stack overflow could cause the strange behavior described above.

Or perhaps it is just as this link (https://www.micrium.com/detecting-stack-overflows-part-1-of-2/) says:

Stack overflows are common and can lead to some curious behaviors. In fact, whenever someone mentions that his or her application behaves “strangely,” insufficient stack size is the first thing that comes to mind.

Although I am using FreeRTOS, my knowledge of operating systems is very limited. It's time for me to fill that gap.

Anyway, many thanks to @Community member @Community member @Bob S @Community member .
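
On why an overflow can show up somewhere unrelated: with dynamic allocation, FreeRTOS typically places each task's stack on the heap next to other tasks' stacks and control blocks. A task that runs past the end of its stack can therefore overwrite a neighbour's saved registers, and the fault then appears in whichever code runs next with the damaged context. That is one plausible reading of the symptoms here, not something confirmed in the thread. Note also that xTaskCreate takes the stack depth in words, so, assuming the sizes were passed to it directly, 500 means 2000 bytes on a Cortex-M4. FreeRTOS can report the remaining headroom with uxTaskGetStackHighWaterMark() and can trap overflows with configCHECK_FOR_STACK_OVERFLOW and vApplicationStackOverflowHook(). The sketch below is a host-side simulation of the stack-painting idea behind the high-water mark. The function names are made up for illustration and are not FreeRTOS APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FreeRTOS paints task stacks with 0xA5 bytes; this is the word-sized pattern. */
#define STACK_FILL_WORD 0xA5A5A5A5u

/* Paint the whole stack buffer with the fill pattern (done once at creation). */
void stack_paint(uint32_t *stack, size_t depth_words)
{
    assert(stack != NULL);
    for (size_t i = 0; i < depth_words; i++) {
        stack[i] = STACK_FILL_WORD;
    }
}

/* Hypothetical helper: pretend the task used `used_words` from the top.
 * Cortex-M stacks grow downward, so usage starts at the high end. */
void stack_simulate_use(uint32_t *stack, size_t depth_words, size_t used_words)
{
    assert(stack != NULL && used_words <= depth_words);
    for (size_t i = depth_words - used_words; i < depth_words; i++) {
        stack[i] = 0u;
    }
}

/* Count untouched words at the low end: the minimum free stack seen so far. */
size_t stack_high_water_mark(const uint32_t *stack, size_t depth_words)
{
    assert(stack != NULL);
    size_t free_words = 0;
    while (free_words < depth_words && stack[free_words] == STACK_FILL_WORD) {
        free_words++;
    }
    return free_words;
}
```

A high-water mark close to zero means the task is about to overflow. Checking it periodically for every task is a cheaper way to size stacks than raising them until the fault goes away.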

AFras

Visitor II

Are you passing a large number of parameters to your debug printf?

If so, I would check the buffer size the printf is expanded into.

Andy
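
One way to act on the buffer-size suggestion is to format into a buffer of known size and never write past it. The sketch below wraps the standard vsnprintf; debug_format is a hypothetical helper name, not something from the original project.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Format into a caller-supplied buffer without ever overrunning it.
 * Returns the number of characters actually stored (excluding the
 * terminating NUL), or -1 if vsnprintf reports an encoding error. */
int debug_format(char *buf, size_t size, const char *fmt, ...)
{
    assert(buf != NULL && size > 0 && fmt != NULL);

    va_list args;
    va_start(args, fmt);
    int needed = vsnprintf(buf, size, fmt, args);
    va_end(args);

    if (needed < 0) {
        buf[0] = '\0';
        return -1;
    }
    if ((size_t)needed >= size) {
        return (int)(size - 1);   /* output was truncated to fit */
    }
    return needed;
}
```

Keeping the buffer static or sharing one buffer between callers (with suitable locking in an RTOS) also keeps it off the task stack, which matters when task stacks are small.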

yanhcAuthor

Visitor II

Thanks for your reply!

I also tried commenting out all the debug printf output, but I still encounter the hard fault exception. If I comment out the newly added OSPF function, everything works. So I think the hard fault exception is unrelated to the debug printf and comes from the new OSPF function. I just have no idea which part of the code goes wrong, and I have no experience in debugging hard fault exceptions.

Tesla DeLorean

Graduate II

The methods to chase down Hard Faults have been covered in recurrent topics.

yanhcAuthor

Visitor II

Could you please provide a link to such a topic? I have searched the STM32 forum and found only limited information.

I found that MDK has a Fault Reports window, shown in the attached screenshot together with the Call Stack. It shows that a Hard Fault was raised and that it is FORCED, which means I should check the other fault status registers. A Usage Fault was also raised, and it is an UNALIGNED access usage fault.

On the right of the screenshot, it can be seen that the call stack is very deep. Since uart1SendChar has been tested many times before, I guess that somewhere in the newly added OSPF function the code may be overwriting memory that belongs to something else. What should I do next to find which part of the code is wrong?

The uart1SendChar code, which is very simple, is shown below. Which memory would have to be corrupted for this code to cause the UNALIGNED access usage fault?

void uart1SendChar(u8 ch)
{
    while ((USART1->SR & 0x40) == 0)
        ;   /* wait until TC (transmission complete, bit 6) is set */
    USART1->DR = (u8)ch;
}
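
The FORCED and UNALIGNED flags reported by the Fault Reports window correspond to bits in the Cortex-M fault status registers, which on target can be read through the CMSIS names SCB->HFSR and SCB->CFSR. The sketch below decodes raw register values. The bit positions are those of the ARMv7-M architecture, while the function names are invented for illustration.

```c
#include <stdint.h>

/* Bit positions defined by the ARMv7-M architecture (Cortex-M3/M4). */
#define HFSR_VECTTBL     (1u << 1)    /* fault while reading the vector table */
#define HFSR_FORCED      (1u << 30)   /* escalated from a configurable fault  */
#define CFSR_UNDEFINSTR  (1u << 16)
#define CFSR_INVSTATE    (1u << 17)
#define CFSR_INVPC       (1u << 18)
#define CFSR_NOCP        (1u << 19)
#define CFSR_UNALIGNED   (1u << 24)
#define CFSR_DIVBYZERO   (1u << 25)

/* Returns 1 when the hard fault was escalated from another fault,
 * in which case CFSR holds the real cause. */
int hard_fault_is_forced(uint32_t hfsr)
{
    return (hfsr & HFSR_FORCED) != 0;
}

/* Name the usage fault recorded in CFSR, or "none" if no such bit is set. */
const char *usage_fault_name(uint32_t cfsr)
{
    if (cfsr & CFSR_UNALIGNED)  return "UNALIGNED";
    if (cfsr & CFSR_DIVBYZERO)  return "DIVBYZERO";
    if (cfsr & CFSR_NOCP)       return "NOCP";
    if (cfsr & CFSR_INVPC)      return "INVPC";
    if (cfsr & CFSR_INVSTATE)   return "INVSTATE";
    if (cfsr & CFSR_UNDEFINSTR) return "UNDEFINSTR";
    return "none";
}
```

In a fault handler, the two values would typically be captured into variables first, so they can be inspected in the debugger or printed after a reset.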

Bob S

Super User

Are you using the MPU? Search for "hardfault mpu" (without quotes, with "hardfault" as one word). Though the search results are flaky (for me) right now - lots of "check your connection" pop-up errors. So here is the link:

https://community.st.com/s/question/0D50X00009q5OZnSAM/hardfault-on-unaligned-access-after-enabling-mpu

yanhcAuthor

Visitor II

Hi, Bob. Thanks for your reply.

I have checked my settings. The MPU (Memory Protection Unit) is not enabled.

After checking my hard fault further, I found that it is a FORCED one, escalated from a Usage Fault of type UNALIGNED. The following image is the screen capture after I enabled the Usage Fault.

However, do you know how to find which line of code or which assembler instruction causes this UNALIGNED fault?

I have redirected the printf function to the UART send function as follows:

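#include <stdio.h>

/* Blocking send of one byte over USART1. Defined before fputc so that it
 * is declared at the point where it is called. */
void uart1SendChar(u8 ch)
{
    while ((USART1->SR & 0x40) == 0)
        ;   /* wait until TC (transmission complete, bit 6) is set */
    USART1->DR = (u8)ch;
}

/* Retarget printf: the C library calls fputc for every output character. */
#define PUTCHAR_PROTOTYPE int fputc(int ch, FILE *f)

PUTCHAR_PROTOTYPE
{
    uart1SendChar(ch);
    return ch;
}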

Thanks very much.
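
On the question of locating the faulting instruction: on exception entry a Cortex-M core pushes eight registers onto the active stack, and for a usage fault such as UNALIGNED the stacked PC is the address of the instruction that faulted. Bit 2 of the EXC_RETURN value held in LR inside the handler tells whether the frame is on the main stack (MSP) or the process stack (PSP); FreeRTOS tasks run on PSP. The sketch below only decodes a frame from a pointer. Obtaining that pointer needs a few lines of toolchain-specific assembly in the handler, which is not shown. The type and function names are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Registers pushed by the core on exception entry, in memory order. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} fault_frame_t;

/* Bit 2 of EXC_RETURN: 1 means the frame was pushed on the process stack. */
int exc_return_uses_psp(uint32_t exc_return)
{
    return (exc_return & (1u << 2)) != 0;
}

/* Copy the eight stacked words starting at `sp` into a named structure. */
fault_frame_t fault_frame_decode(const uint32_t *sp)
{
    assert(sp != NULL);
    fault_frame_t f;
    f.r0   = sp[0];
    f.r1   = sp[1];
    f.r2   = sp[2];
    f.r3   = sp[3];
    f.r12  = sp[4];
    f.lr   = sp[5];   /* return address of the interrupted function */
    f.pc   = sp[6];   /* address of the faulting instruction        */
    f.xpsr = sp[7];
    return f;
}
```

Looking up the stacked PC in the disassembly or the linker map file gives the faulting instruction, and the stacked r0 to r3 show which operands it was using.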

AvaTar

Visitor II

> I found out that my hard fault is a FORCED one and is escalated by Usage fault which is UNALIGNED.

This doesn't make much difference.

As suggested, tracking down such faults is often covered here.

Using the debugger is probably the easiest approach. Perhaps it is an out-of-bounds access, or an odd type cast.

The address of the unaligned access should tell you more (see your map file).

yanhcAuthor

Visitor II

Hi, AvaTar!

Thanks for your reply!

I am using the Keil MDK IDE (with FreeRTOS, LwIP and the OSPF routing protocol). A screen capture of the MDK call stack is shown in the following image.

The call stack shows that the instruction at 0x0801B758 in uart1SendChar causes the UNALIGNED Usage Fault.

The instruction at address 0x0801B758, shown below, reads the value of USART1->SR:

0x0801B758 8809 LDRH r1,[r1,#0x00]

Since USART1 is a peripheral structure and SR (Status Register) is its first member, I cannot figure out how this instruction could cause the UNALIGNED Usage Fault.
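
A halfword load such as LDRH raises UNALIGNED only when its address is odd, and the USART1 base address on the STM32F407 (0x40011000) is even. So if the fault really is raised by this instruction, r1 cannot have held the USART1 address at that moment, which points to a corrupted register or saved context rather than to the UART code. This is an inference from the architecture, not something established in the thread. The sketch below shows the alignment test one could apply to the r1 value found in the stacked exception frame; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* USART1 base address on the STM32F407 (APB2 peripheral space). */
#define USART1_BASE_ADDR 0x40011000u

/* Returns 1 if `addr` is a multiple of `size`; `size` must be a power of two
 * (2 for LDRH/STRH, 4 for LDR/STR). */
int is_aligned(uint32_t addr, uint32_t size)
{
    assert(size != 0u && (size & (size - 1u)) == 0u);
    return (addr & (size - 1u)) == 0u;
}

/* Returns 1 if the stacked r1 still holds the address the compiled code
 * expects when it executes LDRH r1,[r1,#0x00] to read USART1->SR. */
int r1_points_at_usart1(uint32_t stacked_r1)
{
    return stacked_r1 == USART1_BASE_ADDR;
}
```

If the stacked r1 fails the second check, the next step is to find out who overwrote it, for example by checking the stack headroom of every task.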

I also found something strange. When the UNALIGNED Usage Fault occurs, my UART output is as follows:

lwip_select: no timeout, returning 0

Interface Event et0: [InterfaceUp]Filter out Linklocal: FE80::200:D4FF:FE01:1/128Interface state change et0: Down -> Waitinglwip_select(4, 200062e8, 200062e4, 200062e0, tvsec=0 tvusec=0)

lwip_select: no timeout, returning 0

No retr

If I set a breakpoint in the uart1SendChar function before "No retr" is printed, then the whole sentence is printed, one letter after another, as follows:

lwip_select: no timeout, returning 0

Interface Event et0: [InterfaceUp]Filter out Linklocal: FE80::200:D4FF:FE01:1/128Interface state change et0: Down -> Waitinglwip_select(4, 200062e8, 200062e4, 200062e0, tvsec=0 tvusec=0)

lwip_select: no timeout, returning 0

No retransmission scheduled, next interface

However, if I run again without any breakpoint, the UNALIGNED Usage Fault is triggered again, in another place.

How could the UNALIGNED Usage Fault disappear when there is a breakpoint? I am very confused.

Carl_G

Graduate II

I wish this had a proper solution and was not just marked as solved. I am seeing exactly the same thing: a hard fault triggered by something to do with sprintf and dumping diagnostic info over UART. It is also cleared by increasing the RTOS task stack size. I hope I find a better solution...