Skip to main content
Visitor II
April 25, 2016
Question

Questions surrounding __HAL_LOCK

  • April 25, 2016
  • 15 replies
  • 12416 views
Posted on April 26, 2016 at 00:12

I’m an engineer at Fluke, and we’re using an STM32F4xx seriesmicrocontroller (together with its HAL drivers) as the basis for a newproduct. The HAL contains a “locking�? mechanism, where eachsubsystem—I²C, USB, UART, and so on—has a “locking object�?. My team hasbeen working on the assumption that this mechanism’s goal is to providesome mutual exclusion for memory-mapped hardware registers and relatedstate so that, for example, an interrupt doesn’t modify the same hardware state being manipulated in the main “thread�? of exectution.

What’s confused us, however, is how this is done. The driver code found in stm32f4xx_hal_def.h defines the locking mechanism as follows:[1]

typedef
enum
{
HAL_UNLOCKED = 0x00U,
HAL_LOCKED = 0x01U
} HAL_LockTypeDef;
#if (USE_RTOS == 1)
/* Reserved for future use */
#error ''USE_RTOS should be 0 in the current HAL release''
#else
#define __HAL_LOCK(__HANDLE__) \
do
{ \
if
((__HANDLE__)->Lock == HAL_LOCKED) \
{ \
return
HAL_BUSY; \
} \
else
\
{ \
(__HANDLE__)->Lock = HAL_LOCKED; \
} \
}
while
(0)
#define __HAL_UNLOCK(__HANDLE__) \
do
{ \
(__HANDLE__)->Lock = HAL_UNLOCKED; \
}
while
(0)
#endif /* USE_RTOS */

In summary, “locking�? consists of setting a flag, or returning an error(indicating the system is busy) when the flag is already set.“Unlocking�? consists of unconditionally clearing the flag. What throws us is the immediate return of HAL_BUSY in the case ofcontention. If a HAL lock is providing mutual exclusion between the mainthread and an interrupt handler, what should be done if the interrupt isthe second caller of __HAL_LOCK and gets HAL_BUSY? An interrupt cannotpause its execution and start again when the lock is no longercontended, like a userspace thread could in a multitasking OS. (The bestone can do is have the interrupt schedule the work for some later time,but this is not always feasible.) Additionally, many read-modify-write sequences (such as ORing bits) are performed on hardware registers without any additional protection besides these ''locks''. What is to keep an interrupt (or, if one is running an RTOS, another task) from executing in the middle of one of these sequences? Based on these observations, we tried redefining __HAL_LOCK and__HAL_UNLOCK to be a global “interrupt lock�?, i.e., the former disablesinterrupts and the latter re-enables them. For cases where several callsto these functions nest (whether intentionally or unintentionally), wekeep track of a nesting count. Unfortunately this doesn’t work well withthe I²C driver, since it contains several code paths where, on atimeout, __HAL_UNLOCK is called twice for a single call to __HAL_LOCK,unbalancing our nesting count to disastrous effect. How is the __HAL_LOCK/__HAL_UNLOCK system meant to work? It would seemwe’re not using it according to its design goals. [1] All code is taken from STM32CubeF4, version 1.0 #hal #stm32f4
    This topic has been closed for replies.

    15 replies

    Visitor II
    April 26, 2016
    Posted on April 26, 2016 at 09:50 Hi Matt, It has been mentioned several times before here, the HAL_LOCK mechanism is badly broken. As you say, it doesn't work for interrupt handlers. What I have done for UART and CAN is copy code out of the HAL library into my own functions so that it can be called from an interrupt, for example, my HAL_CAN_RxCpltCallback(CAN_HandleTypeDef* hcan) function contains the following:

    // ideally, we should be able to call HAL_CAN_Receive_IT() here to set up for another
    // receive but the API is flawed because that function will fail if HAL_CAN_Transmit*()
    // had already locked the handle when the receive interrupt occurred - so we do what
    // HAL_CAN_Receive_IT() would do
    if(hcan->State == HAL_CAN_STATE_BUSY_TX)
    hcan->State = HAL_CAN_STATE_BUSY_TX_RX;
    else {
    hcan->State = HAL_CAN_STATE_BUSY_RX;
    /* Set CAN error code to none */
    hcan->ErrorCode = HAL_CAN_ERROR_NONE;
    /* Enable Error warning Interrupt */
    __HAL_CAN_ENABLE_IT(hcan, CAN_IT_EWG);
    /* Enable Error passive Interrupt */
    __HAL_CAN_ENABLE_IT(hcan, CAN_IT_EPV);
    /* Enable Bus-off Interrupt */
    __HAL_CAN_ENABLE_IT(hcan, CAN_IT_BOF);
    /* Enable Last error code Interrupt */
    __HAL_CAN_ENABLE_IT(hcan, CAN_IT_LEC);
    /* Enable Error Interrupt */
    __HAL_CAN_ENABLE_IT(hcan, CAN_IT_ERR);
    }
    // Enable FIFO 0 message pending Interrupt
    __HAL_CAN_ENABLE_IT(hcan, CAN_IT_FMP0);

    Note that I am not using interrupts for CAN tx, just receive. I hope this helps. Cheers, Mark
    Visitor II
    April 26, 2016
    Posted on April 26, 2016 at 13:00

    Hi Matt Kline,

    Thanks for your feedback and contribution. It is a known limitation and a new LOCK and UNLOCK process under preparation. Sorry for the inconvenience that may bring

    -Hannibal-
    Visitor II
    May 14, 2018
    Posted on May 14, 2018 at 17:05

    Any news here? I do get the feeling that the forum is a black hole where user-feedback, and even solutions as below, are dumped in to rot . Even if it _seems_ like it is actually addressed by moderators: As there is no public bug-tracker no one can trace the issue. This is very frustrating, and I got feedback from other developers that they stopped reporting issues due to this.

    Graduate
    May 14, 2018
    Posted on May 14, 2018 at 17:16

    That's write, no bug tracker, this weak lock implémentation has been continued in the several following cube implémentation.

    Why not hosting the code in a repository, I'm still wondering.

    Visitor II
    May 1, 2016
    Posted on May 01, 2016 at 17:32

    Hello Friends,

    this

    __HAL_LOCK(__HANDLE__)

    implementation is ... let's call it NAIVE. we can only trust the LOCK when calling from one main thread. If we call LOCK from main thread and from interrupt we have the well known data conflict. So we need to make sure never mixing LOCK from main and from ISR. Especially when we use HAL functions in Callback functions from HAL_***_IT or HAL_***_DMA we need kind of thread safety. One simple solution would be to use the ARM exclusive LDREX STREX here. Something similar as below

    #define __HAL_LOCK(__HANDLE__) \
    do
    { \
    if
    ((__LDREX(&((__HANDLE__)->Lock) == HAL_LOCKED) \
    { \
    return
    HAL_BUSY; \
    } \
    else
    \
    { \
    if(__STREX(HAL_LOCKED, &((__HANDLE__)->Lock)) != 0) return HAL_BUSY; \
    } \
    }
    while
    (0)

    Please ST comment and implement some more failsave version of HAL_LOCK. Regards, Adib. --
    Graduate
    March 10, 2017
    Posted on March 10, 2017 at 11:51

    I've just check in the STM32F4Cube V1.14 and the code has not changed.

    Yes the actual code is not thread-safe.

    I think your solution as a good one, we could also use a SWI if we can implement it on these MCU (I come from ARM7TDMI world .

    But your suggestion is the best one.

    Is there a way to create issue, or suggest patch on ST code ? I did not find out where to do it, I also found an I2C API error and would like to suggest a fix.

    Technical Moderator
    May 18, 2017
    Posted on May 18, 2017 at 12:33

    Hello

    LIBERADO.Selso

    ,

    You can post your issue/patch in this STM32 forum.

    ST moderators will take in charge your request.

    Thanks

    Imen

    Graduate II
    May 1, 2016
    Posted on May 01, 2016 at 18:05

    let's call it NAIVE

    There's a lot of stuff in there that's ill conceived, and that's been apparent to anyone with any multi-threaded and multi-tasking experience, for several years. 

    Visitor II
    September 24, 2016
    Posted on September 24, 2016 at 09:25

    It sounds like we are saying that that we cannot call any ST library code that invokes HAL_LOCK from within an interrupt routine.

    In my USB case, I need to call HAL_PCD_EP_Receive() after unpacking data from the USB buffer so that more data can be received. In order not to massively impact the data transfer rate, I should do this at the end of the USB Receive Interrupt routine.

    But if a USB Receive Interrupt happened at a time that HAL_LOCK was ''locked'' (e.g. my main code was queueing some data for transmit), then the HAL_PCD_EP_Receive will not happen (i.e. will be HAL_LOCKed out) and the USB bus is stalled.

    What is the recommended workaround for this?

    When can we expect a fix?

    Visitor II
    February 7, 2017
    Posted on February 07, 2017 at 01:37

    I have exactly this problem with USB receive.  The HAL_Lock logic is hopelessly broken and it's littered throughout the ST libraries causing goodness knows what other race conditions.  I am also interested in a solution...

    Visitor II
    September 25, 2016
    Posted on September 25, 2016 at 19:09

    >we cannot call any ST library code that invokes HAL_LOCK from within an interrupt routine.

    Might we go so far as to say the the USB Receive Interrupt is a pointless and futile waste of time?

    If all we can do is set a flag in the interrupt and poll for it in our User thread, then why would we bother? Why wouldn't we just poll the USB peripheral instead.

    There must be something that I'm missing here, because the scheme that seems to be forced upon developers by the HAL libraries appears, at best, poorly considered, and more realistically...well you can see where I'm going with this.

    Richard

    Visitor II
    August 10, 2017
    Posted on August 10, 2017 at 19:23

    I think you all have the correct idea - Not to call certain functions inside an interrupt routine, and I agree on that.

    Firstly, the HAL drivers can not know what is your interrupt priorities settings are, You might try to call a function that depends on a interrupt that has a lower priority than the function you are calling from. Your code might then be stuck in an endless loop.

    Secondly, it is not good coding practice to handle messages in interrupt routines. Your messages might take too long to be decoded and/or serviced and then you have the next interrupt pending of lower priority in worst case, whilst still servicing the previous message.

    So, the best practice would be to set a flag (declared as volatile) in the interrupt and handle the execution of the interrupt in the main loop. I know that is not always desirable, but it is definitely the safest. I also want to say that in some instances, if the process could be done very swiftly, you can handle the interrupt servicing inside the interrupt. For instance setting up the next buffer to be handled by the DMA on the I2S audio interface.

    The questions to answer are thus:

    1. How are the interrupt priorities structured

    2. Can the interrupt servicing be moved to the main loop or not.

    3. How many cycles are required in the servicing on the interrupt if it is to be handled within an interrupt.

    Hope it helps..

    Graduate
    November 28, 2017
    Posted on November 28, 2017 at 17:17

    Hello !

    I've been experiencing issue on I2C because I'm having concurrent access from interrupt routine and main loop.

    I use STMCUBE 1.

    I've made a unit test to check the HAL_LOCK function :

    1. using default implementation
    2. replacing it by IT activation/désactivation (__disable_irq(), __enable_irq() ).
    3. using solution ldrex/strex from

    On that test I implement a IT routine (on TIMER2 expiration) and execute a function in main loop.

    Each have separate counter when lock succeded (supposed) and access a global counter as the critical section (and comparaison).

    Result :

    • On the default implementation I get different values on counter : 1/1000 relative error
    • The IT activation/desactivation result in no error as expected.
    • the ldrex/strex implémentation does not give expected result : 1/1000 relative error.

    About the intrinsic __ldrex/__strex Keil

    http://www.keil.com/support/man/docs/armcc/armcc_chr1359124997htm

    : .

    Because memory accesses can clear the exclusive monitor, code using the

    __ldrex

    and

    __strex

    intrinsics can have unexpected behavior. Where

    LDREX

    and

    STREX

    instructions are needed, ARM recommends using embedded assembly.

    So this explanation is sufficient to me to explain why the last solution did not work.

    I also read on some

    https://community.st.com/thread/infocenter.arm.com/help/topic/com.arm.doc.dht0008a/DHT0008A_arm_synchronization_primitives.pdf

    that we must perform a DMB operation to ensure a consistent memory view.

    I worked on a assembly code that uses ldrex/strex operation comptatible with GCC. This is my first assembly code I coded it basing on the default implementation listing and ARM documentation :

     uint32_t hallock(volatile uint8_t * pLock){uint32_t llock;uint32_t lBusy;uint32_t lFailed;__ASM volatile( 'mov%[rBusy], #1\t''ldrexb %[rlock], %[rplock]\t''cbnz%[rlock], busy\t''strexb %[rFailed], %[rBusy], %[rplock]\t''cbnz%[rFailed], busy\t''dmb\t' /* ensures that all subsequent accesses are observed after the gaining of the lock is observed */'mov%[rBusy], %[rlock]\t' /* lock contenait unlock=0 */'bxlr\t''busy:\t''mov%[rBusy], #2\t' /* renvoyer HAL_BUSY */'bxlr\t': [rplock] '+Q' (*pLock), [rFailed] '=&r' (lFailed), [rlock] '=&r' (llock),[rBusy] '=&r' (lBusy):);// Never reach this point : avoir a warning 'no return'.return lBusy;}void halunlock(volatile uint8_t * pLock){uint8_t llock;__ASM volatile('mov %[rlock], #0\t''dmb\t' /* ensures that all subsequent accesses are observed after the gaining of the lock is observed */'str %[rlock], %[rplock]\t': [rplock] '=Q' (*pLock), [rlock] '=&r' (llock):);}#undef __HAL_LOCK#define __HAL_LOCK(__HANDLE__) do { if( hallock( &(__HANDLE__)->Lock) != HAL_OK) return HAL_BUSY ; } while(0);#undef __HAL_UNLOCK#define __HAL_UNLOCK(__HANDLE__) do { halunlock( &(__HANDLE__)->Lock ) ; } while(0) ;#endif�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?

    Visitor II
    November 28, 2017
    Posted on November 28, 2017 at 17:55

    Hello Selso,

    I have not really understood, how you use the exclusive memory access (

    ldrex/strex)

    for ressource allocation.

    Basically those are meant for use of concurrent memory access and generating lockfree code.

    Did you thought about raising the CPU priority in order to prevent certain interrupts from being activated?

    This has several advantages over disabling the interrupt.

    see this sample:

    ATOMIC_BLOCK(NVIC_GetPriority(TIM5_IRQn)) {

    time_us = rx_timestamp.utc_us[0] + rx_timestamp.sub_sec; if((rx_timestamp.locked != false) && (rx_timestamp.utc_us[1] != 0)) { rx_timestamp.utc_us[0] = rx_timestamp.utc_us[1]; rx_timestamp.utc_us[1] = 0; } rx_timestamp.locked = false; }

    You need to assign the CPU priorities more carefully do.

    hope this helps,

    Adib.

    --

    and this is how it is implemented

    /**

    @file atomic_stmh @brief implemented atomic function for STM32 controller similar as avrlib for atmel controller

    @see discission on

    https://community.st.com/external-link.jspa?url=http%3A%2F%2Fwww.mikrocontroller.net%2Ftopic%2F312246%23new

    */

    #ifndef INC_ATOMIC_STM32_H

    #define INC_ATOMIC_STM32_H

    /**

    block all ISR with same or higher priority number, @param prio system priority as used in NVIC_GetPriority() call @note the internal CMSIS command __set_BASEPRI_MAX can not use prio directly but must shift by the not implemented prio levels */ #define ATOMIC_BLOCK(prio) for( uint32_t __cond = 1, __prio = __get_BASEPRI(); \ __cond != 0 ? (__set_BASEPRI_MAX(prio << (8U - __NVIC_PRIO_BITS)), 0) : 0, __cond != 0; \ __set_BASEPRI(__prio), __cond = 0 )

    /*

    static uint32_t BEGIN_ATOMIC_BLOCK(uint32_t prio) { uint32_t __prim = __get_BASEPRI(); uint32_t prioritygroup = NVIC_GetPriorityGrouping(); uint32_t encodedprim = NVIC_EncodePriority(prioritygroup, prio, 0); uint32_t newbasepri = (uint8_t)((encodedprim << (8U - __NVIC_PRIO_BITS)) & (uint32_t)0xFFUL); __set_BASEPRI_MAX(newbasepri); return __prim; }

    static void END_ATOMIC_BLOCK(uint32_t prio) { __set_BASEPRI(prio); }

    */

    #endif // INC_ATOMIC_STM32_H

    Graduate
    November 29, 2017
    Posted on November 29, 2017 at 18:32

    This is the aim of the assembly code :

    1. load and check lock;
    2. try to set it
    3. if ok : call memory barrier operation
    4. otherwise return 'busy'

    Blocking IT is not a solution because it's not generic implémentation.

    Using C code with intrinsic is not reliable because the compiler does not garantee that the exclusive monitor is not cleard, So one shall code assembly.

    The example I gave do not work when the code is inline because it ships with the 'bx lr' instructions, I did a 'inline code' that performs well' :

    __attribute__((always_inline)) __STATIC_INLINE uint32_t hallock(volatile uint8_t * pLock)
    //uint32_t hallock(volatile uint8_t * pLock)
    {
    uint32_t llock = 1;
    uint32_t lBusy = 1;
    uint32_t lFailed = 1;
    __ASM volatile (
    'ldrexb %[rlock], %[rplock] 
    \t'
    'cmp%[rlock], #0 
    \t'
    'it eq 
    \t'
    'strexeq %[rFailed], %[rBusy], %[rplock] 
    \t'
    'cmp%[rFailed], #0
    \t'
    'it eq 
    \t'
    'dmbeq 
    \t' /* dmb : ensures that all subsequent accesses are observed after the gaining of the lock is observed */
    : [rplock] '+Q' (*pLock), [rFailed] '+r' (lFailed), [rlock] '+r' (llock),[rBusy] '+r' (lBusy)
    : /* no input */
    : 'memory'
    );
    return (lFailed == 0) ? HAL_OK : HAL_BUSY;
    }
    __attribute__((always_inline)) __STATIC_INLINE void halunlock(volatile uint8_t * pLock)
    //void halunlock(volatile uint8_t * pLock)
    {
    uint8_t llock = 0;
    __ASM volatile(
    'dmb
    \t' /* ensures that all subsequent accesses are observed after the gaining of the lock is observed */
    'str %[rlock], %[rplock]
    \t'
    : [rplock] '=Q' (*pLock)
    : [rlock] 'r' (llock)
    : 'memory'
    );
    }
    #define __HAL_LOCK(__HANDLE__) do { if( hallock( &(__HANDLE__)->Lock) != HAL_OK) return HAL_BUSY ; } while(0);
    #define __HAL_UNLOCK(__HANDLE__) do { halunlock( &(__HANDLE__)->Lock ) ; } while(0) ;
    �?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?

    Graduate
    November 30, 2017
    Posted on November 30, 2017 at 12:17

    Il would just add that str and strex shalle respectively modified to strb and strexb for storing a byte and not a word.

    Graduate
    January 15, 2018
    Posted on January 15, 2018 at 15:25

    Hello !

    According to you MCU model you are compiling the code for a cortex-M0, that do not s

    https://developer.arm.com/docs/dui0497/latest/the-cortex-m0-instruction-set/instruction-set-summary

    https://developer.arm.com/docs/dui0553/latest/2-the-cortex-m4-processor/22-memory-model/227-synchronization-primitives

     

    I'm sorry that implementation cannot fit your need, I'm also learning so I missed that point.

    Another lock implementation could be made with

    https://developer.arm.com/products/architecture/a-profile/docs/dht0008/latest/a-swp-and-swpb/a1-legacy-synchronization-instructions/a11-swp-and-swpb

    , but it is also not supported by these new family of Cortex.

    The code shall also be changed because of the unsupported 'it' instruction.

    For your case I don't see anything but masking IT.

    Visitor II
    January 15, 2018
    Posted on January 15, 2018 at 15:52

    Hello thank you very much for your kind reply!

    I did mask the interrupts and everything works fine now.

    I also understood that

    fixing __HAL_LOCK would not solve main program / irq synchronization for HAL drivers. Only IRQ disable during lock and re-enable in unlock would solve.

    In fact, my problem was with CAN rx interrupts never re-enabled after a while, and I discovered it was due toCAN irq occurring during the call to HAL_CAN_Transmit_IT.

    In this case, masking IT is the only solution:HAL_CAN_IRQHandler andCAN_Receive_IT functions don't care __HAL_LOCK - of course - and mess with state in a totally unguarded way.

    I have solved it masking IT before calling HAL_CAN_Transmit_IT and reenabling them:

    static void canEnterCritical(void){
    NVIC_DisableIRQ(CEC_CAN_IRQn);
    __asm volatile( 'dsb' ); // Synchronization of the pipeline to guarantee
    __asm volatile( 'isb' ); // that the protection against interrupts is effective
    }
    static void canExitCrititcal(void){
    NVIC_EnableIRQ(CEC_CAN_IRQn);
    }
    �?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?�?

    So, generally speaking, fixing __HAL_LOCK with a more robust code would not solve the weaknesses of HAL drivers.

    Things could be different with an RTOS (e.g. FreeRTOS), but then you have RTOS' locks, so __HAL_LOCK/UNLOCK could be rewritten using RTOS locks instead of assembly code.

    --Vito.