Associate II
September 27, 2024
Question

Issue with LWIP heap


Hello everyone,

I'm having issues with the LWIP heap: it appears to be completely used, so every mem_malloc call fails.
I dug into why the LWIP log reports that it couldn't allocate memory ("etharp_raw: could not allocate pbuf for ARP request"). As far as I understand, mem_malloc uses the struct mem *lfree variable to handle allocation for LWIP. Each struct mem records which chunk is next, which is previous, and whether the chunk is in use.

 

 

struct mem {
 /** index (-> ram[next]) of the next struct */
 mem_size_t next;
 /** index (-> ram[prev]) of the previous struct */
 mem_size_t prev;
 /** 1: this area is used; 0: this area is unused */
 u8_t used;
#if MEM_OVERFLOW_CHECK
 /** this keeps track of the user allocation size for guard checks */
 mem_size_t user_size;
#endif
};

 

 

My issue is that the used field in the mem struct is never zero, which leads mem_malloc to always return NULL, even after a fresh upload or power-up.
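To make the bookkeeping described above concrete, here is a toy first-fit heap that uses the same three fields as struct mem. It is a simplified model written for illustration, not lwIP's source: the names (toy_mem, toy_mem_init, toy_mem_malloc), the 256-byte heap size, and the 8-byte header size are all assumptions. It shows that once the used field of the first chunk holds garbage, the walk finds nothing free and the allocator returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HEAP_SIZE 256u  /* hypothetical heap size, not the poster's MEM_SIZE */
#define TOY_HDR       8u    /* assumed header size, rounded up to 4-byte alignment */

struct toy_mem {            /* same three fields as lwIP's struct mem */
    uint16_t next;          /* index of the next header in the heap */
    uint16_t prev;          /* index of the previous header */
    uint8_t  used;          /* 0: free, anything else: treated as in use */
};

/* Heap plus one trailing header that marks the end; the union forces alignment. */
static union {
    uint32_t align;
    uint8_t  bytes[TOY_HEAP_SIZE + TOY_HDR];
} toy_heap;

static struct toy_mem *toy_hdr(uint16_t idx)
{
    assert(idx % 4u == 0u && idx <= TOY_HEAP_SIZE);
    return (struct toy_mem *)(void *)&toy_heap.bytes[idx];
}

/* Write the initial bookkeeping: one free chunk covering the heap, then an end marker. */
void toy_mem_init(void)
{
    struct toy_mem *first = toy_hdr(0);
    struct toy_mem *end = toy_hdr(TOY_HEAP_SIZE);

    first->next = TOY_HEAP_SIZE;
    first->prev = 0;
    first->used = 0;
    end->next = TOY_HEAP_SIZE;
    end->prev = TOY_HEAP_SIZE;
    end->used = 1;
}

/* First fit: walk the chain and take the first free chunk that is large enough. */
void *toy_mem_malloc(uint16_t size)
{
    uint16_t idx = 0;

    if (size == 0u || size > TOY_HEAP_SIZE) {
        return NULL;
    }
    size = (uint16_t)((size + 3u) & ~3u);        /* keeps every header 4-byte aligned */

    while (idx < TOY_HEAP_SIZE) {
        struct toy_mem *m = toy_hdr(idx);
        uint16_t room;

        if (m->next <= idx || m->next > TOY_HEAP_SIZE || m->next % 4u != 0u) {
            return NULL;                         /* chain is corrupted: give up */
        }
        room = (uint16_t)(m->next - idx - TOY_HDR);
        if (m->used == 0 && room >= size) {
            if (room >= size + TOY_HDR + 4u) {   /* split off the remainder as a free chunk */
                uint16_t rest = (uint16_t)(idx + TOY_HDR + size);
                struct toy_mem *r = toy_hdr(rest);

                r->next = m->next;
                r->prev = idx;
                r->used = 0;
                toy_hdr(m->next)->prev = rest;
                m->next = rest;
            }
            m->used = 1;
            return &toy_heap.bytes[idx + TOY_HDR];
        }
        idx = m->next;
    }
    return NULL;                                 /* no free chunk found */
}
```

In this model, an allocation succeeds right after toy_mem_init(), and the same call returns NULL if the first header's used byte is overwritten with a non-zero value afterwards, which matches the symptom described here.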

I managed to track down where the memory area dedicated to the LWIP heap gets written. It happens in HAL_ETH_MspInit ("void HAL_ETH_MspInit(ETH_HandleTypeDef* ethHandle)"), at the line "__HAL_RCC_ETH1MAC_CLK_ENABLE();", as you can see in the pictures:

> FAV_Before-__HAL_RCC_ETH1MAC_CLK_ENABLE.PNG

> FAV_After-__HAL_RCC_ETH1MAC_CLK_ENABLE.PNG

Before the call, the RAM is all zeros; after the call, the RAM is full of data. You can also see in the Expressions panel on the right that lfree is empty before the call, while after the call it is no longer empty and points to some random chunk.

 

The really strange thing is that the same code can work for a while, but once the RAM shows up as used, I'm stuck.

 

I'm using the STM32H743ZI on a custom PCB with the LAN8742.
I also tested the same code on the Nucleo-H743ZI2, and I get a different result with the same function:

> Nucleo-Before-__HALL_RCC_ETH1MAC.PNG

> Nucleo-After-__HALL_RCC_ETH1MAC.PNG

As you can see, on the Nucleo the RAM is non-zero both before and after the call, but the "used" field is zero, so I can call mem_malloc without any issue.

 

I'm attaching the ETH IOC config, the LWIP IOC config, and the change I made in FLASH.ld:

> ETH_IOC_Config.PNG

> LWIP_FLASH_ld.PNG

> LWIP_IOC_Config.PNG

 

For the LWIP and ETH configuration in the IOC file, I followed this guide/series:

https://controllerstech.com/stm32-ethernet-1-connection/ 

 

If you have any questions, feel free to ask.
I hope someone can help, as I've been stuck on this issue since last week without any lead.

 

 

10 replies

Pavel A.
Super User
September 27, 2024

Hi,

It's hard to figure out what is wrong there. For best results, zero all the mentioned areas in your init code (descriptor memory, LwIP pool, buffers).
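A minimal sketch of that suggestion, runnable on a host: the three arrays are stand-ins for the regions that the linker script places in RAM, and every name and size here (eth_region, rx_desc_area, tx_desc_area, lwip_heap_area, eth_regions_zero) is hypothetical. On the target, the table would instead hold the addresses and sizes from your own FLASH.ld, and the call would go early in init, before the Ethernet and LwIP initialization touches those areas.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct eth_region {
    const char *name;
    void       *base;
    size_t      size;
};

/* Host-side stand-ins; sizes are made up for the example. */
static uint8_t rx_desc_area[96];
static uint8_t tx_desc_area[96];
static uint8_t lwip_heap_area[1024];

static const struct eth_region eth_regions[] = {
    { "rx descriptors", rx_desc_area,   sizeof rx_desc_area },
    { "tx descriptors", tx_desc_area,   sizeof tx_desc_area },
    { "lwip heap",      lwip_heap_area, sizeof lwip_heap_area },
};

#define ETH_REGION_COUNT (sizeof eth_regions / sizeof eth_regions[0])

/* Clear every region in the table. */
void eth_regions_zero(void)
{
    for (size_t i = 0; i < ETH_REGION_COUNT; i++) {
        memset(eth_regions[i].base, 0, eth_regions[i].size);
    }
}

/* Return 1 if every byte of every region reads back as zero, else 0. */
int eth_regions_are_zero(void)
{
    for (size_t i = 0; i < ETH_REGION_COUNT; i++) {
        const uint8_t *p = eth_regions[i].base;

        for (size_t k = 0; k < eth_regions[i].size; k++) {
            if (p[k] != 0u) {
                return 0;
            }
        }
    }
    return 1;
}
```

Calling eth_regions_are_zero() again at a later point in init is a cheap way to narrow down which step changes the contents.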

 

AGP_29 (Author)
Associate II
September 27, 2024

Sure, here is the picture before the memset of the three areas declared in FLASH.ld:

 

> FAV_BEFORE-__HAL_RCC_ETH1MAC_And_Before_memset.PNG

After the memset but before the "__HAL_RCC_ETH1MAC_CLK_ENABLE()" : 

> FAV_BEFORE-__HAL_RCC_ETH1MAC_And_After_memset.PNG

Here is after the memset and after the "__HAL_RCC_ETH1MAC_CLK_ENABLE()" : 

> FAV_AFTER-__HAL_RCC_ETH1MAC_And_After_memset.PNG

 

There might be a better way to zero those areas from the ld file, but I don't know how.

Pavel A.
Super User
September 28, 2024

> There might be a better way to zero those areas from the ld file, but I don't know how.

memset is quite enough. So, hmm, something strange is going on here.

 

AGP_29 (Author)
Associate II
September 30, 2024

Any update or lead?

Like I said, I've been stuck on this issue since last week without any lead.

Andrew Neil
Super User
September 30, 2024

Is it a generic LwIP issue, or something specific to ST's port?

Perhaps ask on the LwIP forum/mailing list?

https://savannah.nongnu.org/projects/lwip 

(when cross-posting, always give links between threads)

A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work.
Pavel A.
Super User
September 30, 2024

This does not look like an LwIP issue. Rather, it looks like some conflict in address allocation or a runaway DMA.
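One way to rule out the address-conflict theory is to write down every region from the linker script and the driver configuration, then check the list for overlaps. The sketch below does that on a host; the struct, the function names, and the example addresses in the usage note are hypothetical and must be replaced with the values from your own FLASH.ld and IOC settings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mem_region {
    const char *name;
    uint32_t    base;   /* start address */
    uint32_t    size;   /* length in bytes */
};

/* Return 1 if the two half-open ranges [base, base + size) intersect. */
int regions_overlap(const struct mem_region *a, const struct mem_region *b)
{
    uint64_t a_end = (uint64_t)a->base + a->size;  /* 64-bit to avoid wrap-around */
    uint64_t b_end = (uint64_t)b->base + b->size;

    if (a->size == 0u || b->size == 0u) {
        return 0;                                  /* an empty region overlaps nothing */
    }
    return (a->base < b_end) && (b->base < a_end);
}

/* Find the first overlapping pair; return 1 and store its indices, or 0 if none. */
int find_overlap(const struct mem_region *r, size_t n, size_t *first, size_t *second)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (regions_overlap(&r[i], &r[j])) {
                *first = i;
                *second = j;
                return 1;
            }
        }
    }
    return 0;
}
```

For example, with made-up entries { "RxDesc", 0x30000000, 0x200 }, { "TxDesc", 0x30000200, 0x200 } and { "Heap", 0x30000100, 0x4000 }, find_overlap reports the first and third entries, because the heap starts inside the Rx descriptor area.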

AGP_29 (Author)
Associate II
October 1, 2024

Any update or lead?

LCE
Principal II
October 1, 2024

Have you enabled the lwIP debug messages? They are really helpful to check via UART (I'm using UART TX DMA at ~1 Mbps).

There's so much stuff going on... and lwIP's function "trees" are in some parts terrible to trace.

AGP_29 (Author)
Associate II
October 1, 2024

Yes, I did. That's how I figured out why my TCP server wasn't working anymore and how mem_malloc() works.

The log said that it couldn't allocate memory for a pbuf ("etharp_raw: could not allocate pbuf for ARP request").

LCE
Principal II
October 2, 2024

Have you enabled the lwip statistics?

Look for LWIP_STATS and MEM_STATS in some opt.h file.

That might give some more info:

#if LWIP_STATS

void UartLwipStatsMem(void)
{
	uint16_t i = 0;
	struct stats_mem *psMem = &lwip_stats.mem;
	const char *pszMemName = "HEAP";

	uart_printf("\n\rMEM %s\n\r", pszMemName);
	uart_printf("used %"MEM_SIZE_F" bytes of %"MEM_SIZE_F" kB (%"MEM_SIZE_F" max used)\n\r",
						psMem->used, (psMem->avail / 1024), psMem->max);
	uart_printf("# ERR: %"STAT_COUNTER_F"\n\r", psMem->err);

	uart_printf("\n\rlwIP internal memory pools (%u):\n\r", MEMP_MAX);
	for( i = 0; i < MEMP_MAX; i++ )
	{
		psMem = lwip_stats.memp[i];
		uart_printf("\n\r%s \t", psMem->name);
		uart_printf("used %"MEM_SIZE_F" of %"MEM_SIZE_F" \tmax %"MEM_SIZE_F,
							psMem->used, psMem->avail, psMem->max);
		if( psMem->err != 0 ) uart_printf("\n\r\t# ERR: %"STAT_COUNTER_F, psMem->err);
	}
	uart_printf("\n\r");
}


#if( 1 ) /* local switch for the per-protocol statistics dump; still inside #if LWIP_STATS */
void UartLwipStatsIf(struct stats_proto *pProto, const char *pszName)
{
	uart_printf("\n\r%s\n\r", pszName);
	uart_printf("xmit: %"STAT_COUNTER_F"\n\r", pProto->xmit);
	uart_printf("recv: %"STAT_COUNTER_F"\n\r", pProto->recv);
	uart_printf("fw: %"STAT_COUNTER_F"\n\r", pProto->fw);
	uart_printf("drop: %"STAT_COUNTER_F"\n\r", pProto->drop);
	uart_printf("chkerr: %"STAT_COUNTER_F"\n\r", pProto->chkerr);
	uart_printf("lenerr: %"STAT_COUNTER_F"\n\r", pProto->lenerr);
	uart_printf("memerr: %"STAT_COUNTER_F"\n\r", pProto->memerr);
	uart_printf("rterr: %"STAT_COUNTER_F"\n\r", pProto->rterr);
	uart_printf("proterr: %"STAT_COUNTER_F"\n\r", pProto->proterr);
	uart_printf("opterr: %"STAT_COUNTER_F"\n\r", pProto->opterr);
	uart_printf("err: %"STAT_COUNTER_F"\n\r", pProto->err);
	uart_printf("cachehit: %"STAT_COUNTER_F"\n\r", pProto->cachehit);
}
#endif
#endif
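
For the functions above to have anything to print, the matching statistics options need to be switched on. A sketch of the relevant switches, assuming lwIP 2.x option names from opt.h and that they are overridden in your project's lwipopts.h (with CubeMX, some of them may instead be set from the IOC configuration):

```c
/* lwipopts.h (sketch): enable the counters read by the dump functions */
#define LWIP_STATS          1   /* master switch for the stats module */
#define LWIP_STATS_DISPLAY  1   /* keeps the name strings in struct stats_mem */
#define MEM_STATS           1   /* heap counters: lwip_stats.mem */
#define MEMP_STATS          1   /* pool counters: lwip_stats.memp[] */
#define LINK_STATS          1   /* link-layer counters: lwip_stats.link */
#define ETHARP_STATS        1   /* ARP counters: lwip_stats.etharp */
```

With these set, a call such as UartLwipStatsIf(&lwip_stats.etharp, "ETHARP") would print the ARP counters, including memerr, which is the counter most relevant to a failed pbuf allocation.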

 

AGP_29 (Author)
Associate II
October 25, 2024

Sorry, I was busy with something else. Fortunately, I haven't encountered this bug since. I think it's related to the code upload having a side effect on the RAM, but I could be wrong.

I will probably come back if this bug occurs again.

Explorer
February 5, 2026

I have the same issue on the NUCLEO-H743ZI2 board.

Before calling the __HAL_RCC_ETH1MAC_CLK_ENABLE(); macro in the HAL_ETH_MspInit() function, the whole LWIP heap area is zero. After calling this macro, the LWIP heap area is corrupted.

Before calling __HAL_RCC_ETH1MAC_CLK_ENABLE();

esdevhk_0-1770326767087.png

After calling __HAL_RCC_ETH1MAC_CLK_ENABLE();

esdevhk_1-1770326805367.png