Visitor II
March 11, 2022
Question

ETH_DMACSR_RBU error occurs and stalls the Ethernet receive on STM32H7. Is there a way around this issue with the DMA?

Hi,

I've been using the STM32H7 example which uses the new reworked Ethernet driver (https://github.com/STMicroelectronics/STM32CubeH7).

I changed the application layer of the example (LwIP_HTTP_Server_Netconn_RTOS). During larger transfers the DMA error (ETH_DMACSR_RBU) occurs; at the same time Wireshark shows communication problems (retransmissions, TCP window full) and the communication stops for a while.

Code below is located in the ethernetif.c file.

/**
 * @brief Ethernet DMA transfer error callback
 * @param heth: ETH handle
 * @retval None
 */
void HAL_ETH_ErrorCallback(ETH_HandleTypeDef *heth)
{
  /* Read the DMA error flags once and reuse them below. */
  uint32_t dma_error = HAL_ETH_GetDMAError(heth);

  SEGGER_RTT_printf(0, "DMA error occurred! 0x%x\n", (unsigned)dma_error);
  if ((dma_error & ETH_DMACSR_RBU) == ETH_DMACSR_RBU)
  {
    /* Receive buffer unavailable: wake the receive thread. */
    SEGGER_RTT_printf(0, "DMA error occurred - releasing semaphore\n");
    osSemaphoreRelease(RxPktSemaphore);
  }
}

Here is the Wireshark picture which shows the issue:

[Image: 0693W00000KcE0VQAV.png]

How can this be fixed?

It causes low throughput, which is a big problem for our use case.

If this is an issue with the driver, which version should I use instead?

The important project files are in the attachment.

Can you please suggest what might be causing the issue?

Thanks

    Visitor II
    June 14, 2022

    I got stuck in the same situation. Under a stress test I get ETH_DMACSR_RBU (Rx DMA error) after a few minutes of operation. The HAL driver code is the newest from the STM32 github.com page, and ethernetif.c is also taken from a recent H7 example. Just recreating the DMA descriptor table doesn't help.

    My error callback:

    void HAL_ETH_ErrorCallback(ETH_HandleTypeDef *heth)
    {
      if ((HAL_ETH_GetDMAError(heth) & ETH_DMACSR_RBU) == ETH_DMACSR_RBU)
      {
        printf("ETH DMA Rx Error\n");
        osSemaphoreRelease(RxPktSemaphore);
        // Clear RBUS ETHERNET DMA flag
        heth->Instance->DMACSR = ETH_DMACSR_RBU;
        // Resume DMA reception
        heth->Instance->DMACRDTPR = 0;
      }
      if ((HAL_ETH_GetDMAError(heth) & ETH_DMACSR_TBU) == ETH_DMACSR_TBU)
      {
        printf("ETH DMA Tx Error\n");
        osSemaphoreRelease(TxPktSemaphore);
        // Clear TBU flag to resume processing
        heth->Instance->DMACSR = ETH_DMACSR_TBU;
        // Instruct the DMA to poll the transmit descriptor list
        heth->Instance->DMACTDTPR = 0;
      }
    }

    According to all the sources I found, clearing the RBU/TBU flag in DMACSR and writing DMACRDTPR/DMACTDTPR should resume the DMA controller, but it does not.

    Visitor II
    June 23, 2022

    Hi ktrofimo,

    I managed to remove the stalling by modifying the lwipopts.h file. More precisely, I modified TCP_WND, which is the TCP receive window. If it is set too high, the download is quicker but it can stall.

    Maybe there is an imbalance between send and receive.

    I also increased TCP_SND_BUF.

    Here are my settings:

    #define TCP_MSS (1500 - 40)	 /* TCP_MSS = (Ethernet MTU - IP header size - TCP header size) */
     
    /* TCP sender buffer space (bytes). */
    #define TCP_SND_BUF (12*TCP_MSS)
     
    /* TCP_SND_QUEUELEN: TCP sender buffer space (pbufs). This must be at least
     as much as (2 * TCP_SND_BUF/TCP_MSS) for things to work. */
    #define TCP_SND_QUEUELEN (2* TCP_SND_BUF/TCP_MSS)
     
    /* TCP receive window. */
    #define TCP_WND (6*TCP_MSS)
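
    As a quick sanity check, the macros can be expanded at compile time. This is only a sketch: the macro values are copied from the settings above, while the `static_assert` lines are an illustration added here and are not part of the original lwipopts.h.

    ```c
    #include <assert.h>

    /* Values copied from the settings above. */
    #define TCP_MSS          (1500 - 40)
    #define TCP_SND_BUF      (12 * TCP_MSS)
    #define TCP_SND_QUEUELEN (2 * TCP_SND_BUF / TCP_MSS)
    #define TCP_WND          (6 * TCP_MSS)

    /* Resolved values, checked at compile time. */
    static_assert(TCP_MSS == 1460, "1500-byte MTU minus 40 header bytes");
    static_assert(TCP_SND_BUF == 17520, "12 segments of send buffer");
    static_assert(TCP_WND == 8760, "6 segments of receive window");
    static_assert(TCP_SND_QUEUELEN == 24, "2 pbufs per buffered segment");

    /* Constraint quoted in the TCP_SND_QUEUELEN comment above. */
    static_assert(TCP_SND_QUEUELEN >= 2 * TCP_SND_BUF / TCP_MSS,
                  "TCP_SND_QUEUELEN too small for TCP_SND_BUF");
    ```

    With these values the receive window is 6 segments (8760 bytes) and the send buffer is 12 segments (17520 bytes), so the send side has twice the headroom of the receive side.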

    Visitor II
    June 23, 2022

    Regarding the DMA error, I also found that this code:

    ```c
    heth->Instance->DMACSR = ETH_DMACSR_TBU;
    heth->Instance->DMACTDTPR = 0;
    ```

    shouldn't be called from within the interrupt: it causes DMA failures as soon as you stop at any breakpoint (see the discussion at https://github.com/STMicroelectronics/STM32CubeH7/issues/222#issuecomment-1159086674).

    Also, Ethernet transmit can hang on osSemaphoreAcquire( TxPktSemaphore ) inside low_level_output().

    It looks like this semaphore should be released on a DMA error, the same way RxPktSemaphore already is:

    ```diff
     void HAL_ETH_ErrorCallback(ETH_HandleTypeDef *heth)
     {
       if((HAL_ETH_GetDMAError(heth) & ETH_DMACSR_RBU) == ETH_DMACSR_RBU)
       {
         // ETH DMA Rx Error
         osSemaphoreRelease(RxPktSemaphore);
       }
    +  if((HAL_ETH_GetDMAError(heth) & ETH_DMACSR_TBU) == ETH_DMACSR_TBU)
    +  {
    +    // ETH DMA Tx Error
    +    osSemaphoreRelease(TxPktSemaphore);
    +  }
     }
    ```

    See https://github.com/STMicroelectronics/STM32CubeH7/issues/224
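
    One way to keep the register writes out of the interrupt is to have the callback only latch the error bits and leave the recovery to a thread. The following is a minimal sketch of that idea, not code from the driver: `pending_dma_errors`, `on_dma_error_isr` and `take_pending_dma_errors` are hypothetical names, the TBU bit position is an assumption, and the semaphore release is left as a comment so the fragment compiles without the RTOS.

    ```c
    #include <assert.h>
    #include <stdatomic.h>
    #include <stdint.h>

    #define DMACSR_TBU (1u << 2) /* assumed position of TBU in ETH_DMACSR */
    #define DMACSR_RBU (1u << 7) /* bit 7 of ETH_DMACSR */

    /* Hypothetical latch: set in interrupt context, consumed by a thread. */
    static atomic_uint pending_dma_errors;

    /* Interrupt context: record the error bits and touch no DMA registers. */
    static void on_dma_error_isr(uint32_t dma_error)
    {
        atomic_fetch_or(&pending_dma_errors,
                        dma_error & (DMACSR_RBU | DMACSR_TBU));
        /* In the real callback the Rx/Tx semaphore would be released here. */
    }

    /* Thread context: fetch and clear all latched bits in one atomic step. */
    static uint32_t take_pending_dma_errors(void)
    {
        return atomic_exchange(&pending_dma_errors, 0u);
    }
    ```

    The thread that wakes on the semaphore would call `take_pending_dma_errors()` and perform the flag clearing and tail pointer write itself, outside interrupt context.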

    I will check your lwIP settings and see how it goes. Thanks!

    Visitor II
    July 1, 2022

    Can you try increasing the buffer size in ethernetif.c, on line 47:

    #define ETH_RX_BUFFER_SIZE      1000

    Change it to 1536.

    Visitor II
    July 1, 2022

    CubeIDE sets it to 1524 by default. I will set it to 1536 and try again.

    Super User
    July 1, 2022

    1524 does not have the correct alignment. 1536 is a multiple of 512 bytes and a multiple of the cache line size.
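
    The alignment point can be checked with a few lines of C. This is a minimal sketch, assuming the 32-byte data cache line of the Cortex-M7 core; `DCACHE_LINE_SIZE`, `is_cache_line_multiple` and `round_up_to_cache_line` are illustrative names, not part of the HAL.

    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Assumption: 32-byte data cache line (Cortex-M7). */
    #define DCACHE_LINE_SIZE 32u

    static_assert((DCACHE_LINE_SIZE & (DCACHE_LINE_SIZE - 1u)) == 0u,
                  "the mask trick below needs a power of two");

    /* Returns 1 when size is a whole number of cache lines, 0 otherwise. */
    static int is_cache_line_multiple(uint32_t size)
    {
        return (size % DCACHE_LINE_SIZE) == 0u;
    }

    /* Rounds size up to the next multiple of the cache line size. */
    static uint32_t round_up_to_cache_line(uint32_t size)
    {
        return (size + DCACHE_LINE_SIZE - 1u) & ~(DCACHE_LINE_SIZE - 1u);
    }
    ```

    Here `is_cache_line_multiple(1524)` is 0 (1524 = 47 * 32 + 20), while `round_up_to_cache_line(1524)` gives 1536, which is also 3 * 512.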

    Explorer
    August 2, 2022

    >How can this be fixed?

    RM0433 rev 7 page 2809:

    "In the Suspend state, the DMA tries to acquire the descriptor again (and thereby return to step 3). A poll demand command is triggered by writing any value to the Channel Tx descriptor tail pointer register (ETH_DMACTXDTPR) when it receives a Transmit Poll demand and the Underflow Interrupt Status bit is cleared. If the application stopped the DMA by clearing Bit 0 of Transmit control register of corresponding DMA channel, the DMA enters the Stop state."

    RM0433 rev 7 page 2872:

    "If the descriptors are not owned by the DMA (or no descriptor is available), the DMA goes into Suspend state. The transmission or reception can be resumed by freeing the descriptors and writing the ETH_DMACTXDTPR (see Channel Tx descriptor tail pointer register (ETH_DMACTXDTPR) and ETH_DMACRXDTPR (see Channel Rx descriptor tail pointer register (ETH_DMACRXDTPR))."

    At https://community.st.com/s/question/0D50X0000C6eNNSSQ2/bug-fixes-stm32h7-ethernet I recommended using RBU. But if receive buffers are often exhausted (and presumably no memory can be spared for more), the additional RBU handling will only waste cycles. Instead you might:

    • clear RBUE in ETH_DMACIER to remove operation of RBU,
    • always write ETH_DMACTXDTPR after enabling DMA descriptors, and
    • following buffer exhaustion, notify your Ethernet-receive-thread to repopulate descriptors and update ETH_DMACTXDTPR to exit suspend state when the next buffer becomes available.
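
    The three suggestions above can be sketched against a mock register block. This is a toy model, not the HAL and not verified hardware behaviour: `MockEthDma`, `MockRxDesc` and all function names are hypothetical, the OWN and RBUE bit positions are assumptions, and the model follows the RM0433 text quoted above in assuming that a suspended DMA resumes only after a tail pointer write. It uses the Rx tail pointer (ETH_DMACRXDTPR), since the flow concerns reception.

    ```c
    #include <assert.h>
    #include <stdint.h>

    #define RX_DESC_COUNT 4u
    #define DESC_OWN      (1u << 31) /* assumed position of the OWN bit */
    #define DMACIER_RBUE  (1u << 7)  /* assumed position of RBUE in ETH_DMACIER */
    #define DMACSR_RBU    (1u << 7)  /* bit 7 of ETH_DMACSR */

    /* Mock of the DMA channel; 'suspended' is model state, not a register. */
    typedef struct {
        uint32_t DMACIER;
        uint32_t DMACSR;
        uint32_t DMACRXDTPR;
        int      suspended;
    } MockEthDma;

    typedef struct {
        uint32_t desc3; /* descriptor word that carries OWN */
    } MockRxDesc;

    /* Model: any write to the Rx tail pointer acts as a poll demand. */
    static void mock_write_rx_tail(MockEthDma *dma, uint32_t value)
    {
        dma->DMACRXDTPR = value;
        dma->suspended = 0;
    }

    /* Model: a frame takes one DMA-owned descriptor, or suspends the DMA. */
    static void mock_receive_frame(MockEthDma *dma, MockRxDesc *ring,
                                   uint32_t *hw_index)
    {
        if (dma->suspended) {
            return; /* frame is lost while suspended */
        }
        if ((ring[*hw_index].desc3 & DESC_OWN) != 0u) {
            ring[*hw_index].desc3 &= ~DESC_OWN; /* hand buffer to application */
            *hw_index = (*hw_index + 1u) % RX_DESC_COUNT;
        } else {
            dma->DMACSR |= DMACSR_RBU;
            dma->suspended = 1;
        }
    }

    /* First suggestion: stop the RBU interrupt from firing. */
    static void disable_rbu_interrupt(MockEthDma *dma)
    {
        dma->DMACIER &= ~DMACIER_RBUE;
    }

    /* Second and third suggestions: the receive thread returns free
     * descriptors to the DMA, then writes the tail pointer. */
    static uint32_t rx_thread_refill(MockEthDma *dma, MockRxDesc *ring)
    {
        uint32_t refilled = 0u;
        for (uint32_t i = 0u; i < RX_DESC_COUNT; i++) {
            if ((ring[i].desc3 & DESC_OWN) == 0u) {
                ring[i].desc3 |= DESC_OWN;
                refilled++;
            }
        }
        if (refilled != 0u) {
            dma->DMACSR &= ~DMACSR_RBU; /* on hardware: write 1 to clear */
            mock_write_rx_tail(dma, 0u);
        }
        return refilled;
    }
    ```

    In this model, receiving one frame more than `RX_DESC_COUNT` without a refill sets RBU and suspends reception; the next `rx_thread_refill()` call clears the flag and lets frames through again.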

    Graduate II
    August 28, 2022

    You meant ETH_DMACRXDTPR in the last section of your comment, which talks about RBU. And, by the way...

    RM0433 Rev 7, "Channel status register (ETH_DMACSR)", page 2924:

    Bit 7 RBU: Receive Buffer Unavailable

    ... To resume processing Rx descriptors, the application should change the ownership of the descriptor and issue a Receive Poll Demand command. If this command is not issued, the Rx process resumes when the next recognized incoming packet is received. ...

    So it turns out that writing ETH_DMACRXDTPR is not mandatory. This fact should also be properly documented in the following sections of RM0433 Rev 7:

    1. "58.4.1 DMA controller", "DMA reception", step 11 at page 2813.
    2. "Figure 779. Receive DMA flow", exit of "Supsend Rx DMA" state, page 2814. Also fix the spelling of "Supsend".
    3. "58.9.4 Performing normal receive and transmit operation", step 3, page 2872.

    And, of course, the same is true for the respective sections of other H7 reference manuals - RM0399, RM0455, RM0468.

    @Imen DAHMEN, another fix and improvement for the H7 reference manuals.

    Explorer
    August 31, 2022

    I want to like your reply, but my testing seems to indicate that ETH_DMACRXDTPR must be written following RBU.

    My method:

    • Dimension 3 rx buffers; because my buffer size is less than MTU and so RBU occurs often.
    • Write 0 to ETH_DMACRXDTPR, in ETH_DMARxDescListInit, which is called by HAL_ETH_Init, and don't write it anywhere else.

    My observations:

    • DHCP acquires an address ok.
    • On pinging my device, it replies once or twice then throws "Device unreachable" and its Ethernet rx never recovers.

    Whereas if I add a write of 0 to ETH_DMACRXDTPR in HAL_ETH_BuildRxDescriptors, once one or more rx buffers have been added to the next rx descriptors and their OWN bits set, Ethernet rx is continuously reliable (albeit with lower performance) with the 3 rx buffers.

    My software:

    Have you tested "the Rx process resumes when the next recognized incoming packet is received. ..." after RBU without writing ETH_DMACRXDTPR yourself, or do you have any other ideas to share? Thanks