Visitor II
December 9, 2021
Question

STM32F4: Purpose of the usage of ATOMIC_SET_BIT ATOMIC_CLEAR_BIT macros in the low level drivers related to UART peripheral


Hello, I'm using the LL drivers in my projects, and I found these differences when comparing the newest LL driver version (1.7.13) with my current one:

In the files stm32f4xx_ll_usart.h/c, macros with the ATOMIC_ prefix are now used in several functions:

  • LL_USART_EnableDirectionRx is now using ATOMIC_SET_BIT macro instead of SET_BIT
  • LL_USART_DisableDirectionRx is now using ATOMIC_CLEAR_BIT macro instead of CLEAR_BIT
  • LL_USART_EnableDirectionTx is now using ATOMIC_SET_BIT macro instead of SET_BIT
  • LL_USART_DisableDirectionTx is now using ATOMIC_CLEAR_BIT macro instead of CLEAR_BIT
  • LL_USART_SetTransferDirection is now using ATOMIC_MODIFY_REG macro instead of MODIFY_REG
  • LL_USART_EnableIT_IDLE is now using ATOMIC_SET_BIT macro instead of SET_BIT
  • LL_USART_EnableIT_RXNE is now using ATOMIC_SET_BIT macro instead of SET_BIT
  • LL_USART_EnableIT_TC is now using ATOMIC_SET_BIT macro instead of SET_BIT
  • LL_USART_EnableIT_TXE is now using ATOMIC_SET_BIT macro instead of SET_BIT
  • LL_USART_EnableIT_PE is now using ATOMIC_SET_BIT macro instead of SET_BIT
  • LL_USART_EnableIT_ERROR is now using ATOMIC_SET_BIT macro instead of SET_BIT
  • LL_USART_EnableIT_CTS is now using ATOMIC_SET_BIT macro instead of SET_BIT
  • LL_USART_DisableIT_IDLE is now using ATOMIC_CLEAR_BIT macro instead of CLEAR_BIT
  • LL_USART_DisableIT_RXNE is now using ATOMIC_CLEAR_BIT macro instead of CLEAR_BIT
  • LL_USART_DisableIT_TC is now using ATOMIC_CLEAR_BIT macro instead of CLEAR_BIT
  • LL_USART_DisableIT_TXE is now using ATOMIC_CLEAR_BIT macro instead of CLEAR_BIT
  • LL_USART_DisableIT_PE is now using ATOMIC_CLEAR_BIT macro instead of CLEAR_BIT
  • LL_USART_DisableIT_ERROR is now using ATOMIC_CLEAR_BIT macro instead of CLEAR_BIT
  • LL_USART_DisableIT_CTS is now using ATOMIC_CLEAR_BIT macro instead of CLEAR_BIT
  • LL_USART_EnableDMAReq_RX is now using ATOMIC_SET_BIT macro instead of SET_BIT
  • LL_USART_DisableDMAReq_RX is now using ATOMIC_CLEAR_BIT macro instead of CLEAR_BIT
  • LL_USART_EnableDMAReq_TX is now using ATOMIC_SET_BIT macro instead of SET_BIT
  • LL_USART_DisableDMAReq_TX is now using ATOMIC_CLEAR_BIT macro instead of CLEAR_BIT

So the question is: why are the macros with the ATOMIC_ prefix now used? Why only for the UART peripheral? What may these changes affect?


    13 replies

    Super User
    December 9, 2021

    > What these changes may affect?

    Setting or clearing a bit requires two memory accesses on ARM (not talking about bit-banding here): read a word, modify it, then write the modified word back. Concurrent modifications of the same word, e.g. from main and from an interrupt handler, can race: whichever write lands last silently overwrites the other change. This is fixed by using the ARM exclusive load/store instructions ldrex and strex, see https://developer.arm.com/documentation/dht0008/a/ch01s02s01.

    For the remaining questions, maybe someone from ST can answer.

    hth

    KnarfB
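
The lost-update race described in the reply above can be sketched on a host machine. This is only an illustration under stated assumptions: `fake_cr1` is a plain variable standing in for a control register such as a USART CR1, the `DEMO_` bit positions are chosen for the example, and the "interrupt" is an ordinary function call (`fake_isr`, a hypothetical name) placed between the read and the write so that the bad interleaving is deterministic.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit masks; the names and positions are assumptions for this sketch. */
#define DEMO_RXNEIE (1u << 5)   /* bit that "main" wants to set */
#define DEMO_TXEIE  (1u << 7)   /* bit that the "interrupt" wants to set */

/* Plain variable standing in for a peripheral control register. */
static volatile uint32_t fake_cr1;

/* Stand-in for an interrupt handler that sets its own bit in the same word. */
static void fake_isr(void)
{
    fake_cr1 |= DEMO_TXEIE;
}

/* Non-atomic read-modify-write with the "interrupt" landing in the middle. */
static uint32_t interrupted_set_bit(void)
{
    fake_cr1 = 0u;
    uint32_t tmp = fake_cr1;    /* main: read the word */
    fake_isr();                 /* interrupt fires here and sets DEMO_TXEIE */
    tmp |= DEMO_RXNEIE;         /* main: modify its stale copy */
    fake_cr1 = tmp;             /* main: write back, overwriting the ISR's bit */
    return fake_cr1;
}

/* Same two updates, but the interrupt arrives after the write has completed. */
static uint32_t uninterrupted_set_bit(void)
{
    fake_cr1 = 0u;
    fake_cr1 |= DEMO_RXNEIE;    /* main: complete read-modify-write */
    fake_isr();                 /* interrupt fires afterwards */
    return fake_cr1;
}

/* Self-check: the interrupted sequence loses the ISR's bit, the other keeps both. */
static void demo_lost_update(void)
{
    assert(interrupted_set_bit() == DEMO_RXNEIE);
    assert(uninterrupted_set_bit() == (DEMO_RXNEIE | DEMO_TXEIE));
}
```

Calling `demo_lost_update()` passes both checks: in the interrupted case the bit set by the handler is gone, which is the kind of corruption an exclusive load/store retry loop is meant to prevent.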

    Visitor II
    December 9, 2021

    It doesn't answer the question of why it was added only for the UART peripheral...

    Super User
    December 9, 2021

    He answered at least one of your questions.

    As to why these appear only in the UART functions - I can only guess. Maybe there are known issues with interrupt and non-interrupt code changing the UART's CR at the same time, and there are no *known* cases of that in the other modules. Or maybe they (ST) haven't propagated this change elsewhere. Interestingly, this doesn't appear in the F7 1.26.2 release. So I would guess this is just starting to propagate through the HAL ecosystem. But again, that is just a guess.

    What the changes may affect - looks like the intent is to make applications work as intended with no clobbering/corruption of the UART CRx bits. At least that is what it looks like to me.

    Super User
    December 9, 2021

    Strange. Maybe this was introduced for use in multicore systems where M4 is one of cores, such as STM32H7?

    The source itself does not offer any hints.

    Visitor II
    December 9, 2021

    Maybe, but these changes were made for STM32F4xx MCUs, which are single-core. Unfortunately I didn't find any explanation of why such changes were implemented now.

    Super User
    December 9, 2021

    Do STREX/LDREX even work on direct peripheral addresses? Documentation from ARM indicates "memory" access.

    Certainly the changes would allow code to be thread-safe. It's unclear (to me at least) if these would work correctly on multi-core systems. I would tend to doubt it, as there would need to be a synchronization method between the cores to even allow it.

    Edit: STREX/LDREX does work on GPIO->ODR as expected. Tested on a STM32F405.

    Visitor II
    December 10, 2021

    Hi TDK, on a multi-core system it should work, because it was designed for that. But I'm not sure whether using these instructions makes sense on the STM32F4, a single-core system...

    Super User
    December 10, 2021

    Please provide some information that backs up that statement.

    It absolutely makes sense to use on a single core, multi threaded application (taking interrupts as threads). I would argue that's the only place it makes sense.

    Explorer II
    December 10, 2021

    I asked myself the same question as soon as I saw this change.

    On a single core it is used for atomic access: thread vs. interrupt.

    On multi-core it is also used for atomic access and to build higher-level critical sections.

    It is strange to do multi-core protection at the bit level. I can't understand how 2 cores can share and access the same peripheral at the bit level at the same time, even with STREXW/LDREXW.

    I would like to have a response from ST on the underlying reason for this change in architecture.

    Super User
    December 10, 2021

    > I can't understand how 2 cores can share and access to the same peripheral at the

    > bit level at the same time, even with STREXW/LDREXW.

    This is not at the "bit" level. This is at the byte/word/dword ADDRESS level. In a single core environment, the underlying mechanism simply makes sure that no interrupt/exception (or CLREX instruction) occurred between the LDREX and STREX. If an interrupt occurs or a CLREX instruction is executed (both of which clear the "exclusive access" flag that is set by LDREX), the STREX will fail, signaling that the contents of that location MAY have changed. If no interrupt or CLREX instruction occurred between the LDREX and STREX, then the STREX will succeed and write the data to the given address because there is (almost**) no chance that some other code changed the memory/register value between the LDREX and STREX.

    ** I say "almost" because in PM0214 (ST Cortex M4 programming manual) there is no explicit mention of DMA access to the same memory location used by LDREX/STREX. I struggle to imagine a well-designed use case where DMA would be accessing the same memory/register, and at the same time, as LDREX/STREX.
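
The single-core behaviour described in the reply above can be modelled on a host as a rough sketch. Everything here is an assumption made for illustration: `model_ldrex`, `model_strex` and `model_interrupt` are hypothetical stand-ins, not the CMSIS intrinsics, and the exclusive monitor is reduced to a single boolean that an injected "interrupt" clears.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool     monitor_armed;  /* models the "exclusive access" flag set by LDREX */
static uint32_t fake_reg;       /* plain variable standing in for a register */

/* Model of LDREX: arm the monitor and return the current value. */
static uint32_t model_ldrex(const uint32_t *addr)
{
    monitor_armed = true;
    return *addr;
}

/* Model of STREX: store only if the monitor is still armed.
   Returns 0 on success and 1 on failure, mirroring the STREX status result. */
static uint32_t model_strex(uint32_t value, uint32_t *addr)
{
    if (!monitor_armed) {
        return 1u;
    }
    *addr = value;
    monitor_armed = false;
    return 0u;
}

/* Model of an interrupt: clears the monitor and sets its own bits in the register. */
static void model_interrupt(uint32_t isr_bits)
{
    monitor_armed = false;
    fake_reg |= isr_bits;
}

/* Retry loop in the style of an atomic set-bit operation. 'interrupts_to_inject'
   simulated interrupts fire between the load and the store, one per attempt.
   Returns the number of attempts needed. */
static unsigned model_atomic_set_bit(uint32_t bits, uint32_t isr_bits,
                                     unsigned interrupts_to_inject)
{
    unsigned attempts = 0;
    uint32_t val;
    do {
        attempts++;
        val = model_ldrex(&fake_reg) | bits;
        if (interrupts_to_inject > 0u) {
            interrupts_to_inject--;
            model_interrupt(isr_bits);  /* clears the monitor, so the store fails */
        }
    } while (model_strex(val, &fake_reg) != 0u);
    return attempts;
}

/* Self-check: one injected interrupt costs one retry, and no update is lost. */
static void demo_monitor(void)
{
    fake_reg = 0u;
    assert(model_atomic_set_bit(0x1u, 0x2u, 1u) == 2u);
    assert(fake_reg == 0x3u);
}
```

In this model the first store is rejected because the injected interrupt cleared the flag, the loop reloads the now-modified value, and the second attempt writes a word containing both sets of bits.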

    Explorer II
    December 10, 2021

    From what I have seen, in the HAL LDREX/STREX is used to change 1 or a few bits in a register. Hence my remark on the "bit level".

    In a driver shared between several cores, it is more common to find higher level exclusions (at the level of functions): configuration, sending or reception of a message ...

    Atomic accesses are used to build higher level critical sections.

    Super User
    December 10, 2021

    This gives some insight into how STREX/LDREX are implemented. This is Cortex-A, but it's likely the same for Cortex-M:

    https://developer.arm.com/documentation/den0013/d/Multi-core-processors/Exclusive-accesses

    In particular, this quote makes me believe this mechanism may or may not work between cores, depending on the hardware implementation:

    Where exclusive accesses are used to synchronize with external masters outside the core, or to regions marked as Sharable even between cores in the same cluster, it is necessary to implement a global monitor within the hardware system. This acts as a wrapper to one or more memory slave devices and is independent of the individual cores. This is specific to a particular SoC and might not exist in any particular system.

    Super User
    December 10, 2021

    @TDK​ ,

    > Edit: STREX/LDREX does work on GPIO->ODR as expected. Tested on a STM32F405.

    Out of curiosity: How?

    Thanks,

    Jan

    Super User
    December 11, 2021

    Set up a timer interrupt such that it spends very roughly 50% of the time in that interrupt and 50% in the main loop.

    Within the timer interrupt, toggle bit 0 and verify its status.

     ASSERT(!(GPIOA->ODR & GPIO_PIN_0));
     ATOMIC_SET_BIT(GPIOA->ODR, GPIO_PIN_0);
     ASSERT(GPIOA->ODR & GPIO_PIN_0);
     ATOMIC_CLEAR_BIT(GPIOA->ODR, GPIO_PIN_0);

    Within the main loop, toggle bit 1 and verify its status.

     ASSERT(!(GPIOA->ODR & GPIO_PIN_1));
     ATOMIC_SET_BIT(GPIOA->ODR, GPIO_PIN_1);
     ASSERT(GPIOA->ODR & GPIO_PIN_1);
     ATOMIC_CLEAR_BIT(GPIOA->ODR, GPIO_PIN_1);

    where ASSERT() just blocks forever if the condition isn't true.

    Run the code with a debugger, observe that all ASSERTS are met and the code never blocks.

    I also did verify that occasionally the STREX in the main loop would fail, so the check appears to be working as intended.

    Super User
    December 11, 2021

    Interesting. And when you used plain SET_BIT/CLEAR_BIT, did the ASSERT bail out?

    Visitor II
    December 11, 2021

    A read-modify-write on a bit field of a register possibly shared by different threads should be atomic, with the interrupt state saved, interrupts disabled, and the state restored afterwards, if the same resource is shared across different parts of the code. When a missing atomic access causes application trouble, you will be debugging something not repeatable, random, occurring overnight... so atomic RMWs are better safe than sorry. If someone knows what he is doing, he can consciously optimize and remove the "fat". No?

    No multicore experience though....

    Graduate
    December 12, 2021

    If this is now a two cycle instruction, then an interrupt can happen right in the middle of an instruction cycle. When using an RTOS, this can lead to very unwanted behavior. Making it atomic turns off interrupts so that the read/modify/write cycle that seems to be here cannot have an undesired context switch. Possibly has to do with AZURE RTOS and perhaps dual core processors?

    Super User
    December 12, 2021

    It very clearly does not turn off interrupts. Nor is there any indication that this is intended for dual core applications.

    Graduate
    December 12, 2021

    Then this is not the meaning of "ATOMIC" that I am familiar with. I'm used to something like

     ATOMIC (option)
     {
         Code with interrupts turned off
     }

    Super User
    December 12, 2021

    No, it is not. Turning interrupts off has a global impact on the core (timing). One task/thread/irq takes ownership of the core and the others starve. In contrast, ldrex + strex only affect the current task/thread/irq: the strex might fail. Often spin lock loops are built around that. So, if a race condition occurred, the task/thread/irq repeats the request until done. This sounds fair, at least if there are many tasks/threads/irqs and the chance of a collision is low.

    hth

    KnarfB
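
The "repeat the request until done" pattern from the reply above can be sketched in portable C11, assuming `<stdatomic.h>` is available. This is an analog rather than the ST macro: a compare-exchange loop plays the role of the ldrex/strex pair, and `set_bits_with_retry` and `shared_flags` are hypothetical names for this example. Nothing in it masks interrupts; a failed attempt only costs the caller another pass through the loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Word shared between contexts; a plain atomic variable, not a real register. */
static _Atomic uint32_t shared_flags;

/* Set 'bits' in '*word' by retrying until the update is applied to a value
   that nobody changed in between. Returns the number of attempts made. */
static unsigned set_bits_with_retry(_Atomic uint32_t *word, uint32_t bits)
{
    unsigned attempts = 0;
    uint32_t expected = atomic_load(word);
    do {
        attempts++;
        /* On failure, 'expected' is refreshed with the current contents of
           '*word', so the next attempt starts from up-to-date data. */
    } while (!atomic_compare_exchange_weak(word, &expected, expected | bits));
    return attempts;
}

/* Self-check: bits already present in the word are preserved. */
static void demo_retry(void)
{
    atomic_store(&shared_flags, 0x2u);
    unsigned attempts = set_bits_with_retry(&shared_flags, 0x1u);
    assert(attempts >= 1u);  /* the weak form may fail spuriously and retry */
    assert(atomic_load(&shared_flags) == 0x3u);
}
```

The trade-off matches the one described in the thread: a critical section that disables interrupts delays every other context for its whole duration, whereas a retry loop only delays the context that lost the race.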

    Visitor II
    December 12, 2021

    I agree with you, Harvey. If the reference code must be compatible with all projects, of course it will have to give up some performance for the sake of a shorter debug experience for non-expert embedded coders. In the end, if the code you develop goes into a resale product, the coder will own the whole code, which has probably been personalized and optimised because the spec is known. Maybe it is bare metal, so the atomic may be removed. Maybe the code runs in non-privileged mode and that's OK for the coder.

    The key is to start with safe code for new coders. Now the atomic could be done by the coder, and to me that is passing know-how and challenge upward, increasing the jitter coming from the growing WCET...