Skip to main content
Graduate II
February 6, 2020
Question

[BUG] STM32 HAL driver lock mechanism is not interrupt safe

  • February 6, 2020
  • 12 replies
  • 13210 views

STM32 HAL driver library is full of flawed and sub-optimal constructs. The most common one, which impacts almost all drivers, is the lock mechanism. It's a bad and limiting design and getting rid of it requires a major rewrite, but the worst fact is that it's not even interrupt safe and therefore doesn't provide locking for which it was introduced. The current __HAL_LOCK() (reformatted) code looks like this:

#define __HAL_LOCK(__HANDLE__) \
do{ \
	if((__HANDLE__)->Lock == HAL_LOCKED) \
	{ \
		return HAL_BUSY; \
	} \
	else \
	{ \
		(__HANDLE__)->Lock = HAL_LOCKED; \
	} \
}while (0U)

Between testing and setting the ->Lock an interrupt can happen and also test and set the ->Lock. Therefore both - main thread and interrupt - will continue execution as if the object was unlocked and the interrupt will unlock it before the main thread has completed it's "locked" part, which makes it even more prone to next interrupt calls. The same will happen when higher priority interrupt interrupts lower priority interrupt. Additionally the ->Lock variable is not marked as volatile and therefore is prone to reorder by compiler optimization.

The proposed fix is simple and requires adding of only a few lines of code in stm32XXxx_hal_def.h files for all STM32 series:

#define __HAL_LOCK(__HANDLE__) \
do { \
	uint32_t rPriMask = __get_PRIMASK(); \
	__disable_irq(); \
	if ((__HANDLE__)->Lock == HAL_UNLOCKED) { \
		(__HANDLE__)->Lock = HAL_LOCKED; \
		__set_PRIMASK(rPriMask); \
	} else { \
		__set_PRIMASK(rPriMask); \
		return HAL_BUSY; \
	} \
} while (0)
 
typedef volatile enum {
	HAL_UNLOCKED = 0x00U,
	HAL_LOCKED = 0x01U
} HAL_LockTypeDef;

Thus this will make that bad construct at least interrupt safe and actually provide locking as was intended.

Note that __HAL_UNLOCK() code doesn't need modifications as it already is atomic.

    This topic has been closed for replies.

    12 replies

    Visitor II
    November 11, 2022

    Dear All,

    Based on the different gathered feedbacks from this Forum and other feedbacks sources and the full analysis of the different calls to the HAL_Lock() and the issues mentioned regarding this topic shows that the __HAL_LOCK() and __HAL_UnLOCK() are not used always as "standard" lock mechanism for critical sections properly but rather as a special state machine to reject launching same HAL processes in several statements in the current HAL , thus the following updates have been introduced on the HAL to fix the issue related to this topic.

    1. Reject launching same HAL processes (ongoing):

    Fix

    • Rely rather on the native state machines rather than the HAL_LOCK/HAL_UNLOCK

    Example:

    HAL_StatusTypeDef HAL_UART_Transmit_IT(UART_HandleTypeDef *huart, const uint8_t *pData, uint16_t Size)

    {

       if (huart->gState == HAL_UART_STATE_READY)

       {  ...}

    }

    2- Protect changing the common state machine for several processes (Already implemented on uart and full deployment ongoing):

    Fix: have a state machine per independent process

    Example:

    typedef struct __UART_HandleTypeDef

    {

     (…)

     HAL_LockTypeDef          Lock;                   /*!< Locking object                    */

     __IO HAL_UART_StateTypeDef   gState;             /*!< UART state information related to global Handle management

                                                             and also related to Tx operations. This parameter

                                                             can be a value of @ref HAL_UART_StateTypeDef */

     __IO HAL_UART_StateTypeDef   RxState;            /*!< UART state information related to Rx operations. This

                                                             parameter can be a value of @ref HAL_UART_StateTypeDef */

    } UART_HandleTypeDef;

    3 - Protect checking and modifying the state machine by locking the check and set statement within lock mechanism based on __LDREXH / __STREXH (only series based on CM0 core come with an implementation around the enable/disable irq: (will be deployed on next HAL releases):

    #define HAL_CHECK_AND_SET_STATE(__HANDLE__, __PPP_STATE_FIELD__, __PPP_CONDITIONAL_STATE__, __PPP_NEW_STATE__)  \

     do {                                                      \

      do{                                                     \

       /* Return HAL_BUSY if the status is not ready */                              \

       if (__LDREXW((__IO uint32_t *)&(__HANDLE__)->__PPP_STATE_FIELD__) != (uint32_t)(__PPP_CONDITIONAL_STATE__)) \

       {                                                     \

        return HAL_BUSY;                                             \

       }                                                     \

       /* if state is ready then attempt to change the state to the new one */                  \

      } while(__STREXW((uint32_t)(__PPP_NEW_STATE__), (__IO uint32_t *)&((__HANDLE__)->__PPP_STATE_FIELD__)) != 0); \

      /* Do not start any other memory access until memory barrier is complete */                 \

      __DMB();                                                   \

     }while(0)

    4 - Protect common processes register update (Already implemented):

    Fix:

    Add new macros in the CMSIS device files for atomic bit and registers modifications (based on __LDREXH / __STREXH (only series based on CM0 core come with an implementation around the enable/disable irq)

    § ATOMIC_SET_BIT(REG, BIT)                           

    § ATOMIC_MODIFY_REG(REG, CLEARMSK, SETMASK)                        

    § ATOMIC_SETH_BIT(REG, BIT)                          

    § ATOMIC_CLEARH_BIT(REG, BIT)                        

    § ATOMIC_MODIFYH_REG(REG, CLEARMSK, SETMASK)                  

    Example:

    HAL_StatusTypeDef HAL_UART_Receive_DMA(UART_HandleTypeDef *huart, uint8_t *pData, uint16_t Size)

    {

     (…)

           /* Enable the UART Receiver Timeout Interrupt */

           ATOMIC_SET_BIT(huart->Instance->CR1, USART_CR1_RTOIE); => ATOMIC access for the USART_CR1_RTOIE bit

       (…)

    }

    Thanks and Regards

    Maher

    Graduate
    December 18, 2023

    Hello @MMAST.1,

    Can you share the yearly update on this topic ?

    I'm still struggling with very rare bug on a module <-> mcu communication based on.
    I'm using raw C, no OS, no multi-thread. 

    TX (in Cube MX: Preemption Priority 2 UART and DMA):

    HAL_UART_Transmit_DMA() wait for a flag set by

    HAL_UART_TxCpltCallback()

     

    RX: (in Cube MX: Preemption Priority 2 UART and DMA, DMA Circular, Overrun: Disable, DMA on RX Error: Enable)

    HAL_UARTEx_ReceiveToIdle_DMA()

     

    I also implemented a ErrorCallback like this:
    void HAL_UART_ErrorCallback(UART_HandleTypeDef *huart)
    {
    HAL_UART_AbortReceive(&huart2);
    HAL_UART_DeInit(&huart2);
    // MX_USART2_UART_Init();
    // Ringbuf_Init(); // restart HAL_UARTEx_ReceiveToIdle_DMA(&UART, aRXBufferUser, RxBuf_SIZE);
    }
    As you can see I first tried with to restart in the ErrorCallback but was still facing some very rare lock (my timeout (implemented so that if response is not received within x seconds, the module is turn off and will retry next loop) is not firing, my code just hang somewhere, so I decided to just try to stop is a clean way, benefit from the timeout (applicative driven using a custom counter decreased in SysTick_Handler()) and restart next loop.
    At this point I don't know if this code is still prone to issue of lock and I'm not even able to tell if this is related to HAL_LOCK mechanism described here but at least I can tell that the code in my latest version (stm32l0xx_hal_def.f) is the same as the one said to be buggy in 2020 :p

     

    #define __HAL_LOCK(__HANDLE__)
    	do{
    		if((__HANDLE__)->Lock == HAL_LOCKED)
    		{
    		return HAL_BUSY;
    		}
    		else
    		{
    		(__HANDLE__)->Lock = HAL_LOCKED;
    		}
     }while (0)
    
    #define __HAL_UNLOCK(__HANDLE__)
    	do{
    		(__HANDLE__)->Lock = HAL_UNLOCKED;
    	}while (0)

     

    More detailed about my issue here

    Thank you for any help