Graduate II
January 23, 2024
Question

USART Receive Interrupt

  • January 23, 2024
  • 10 replies
  • 11050 views

Hopefully this is a sensible, non-noob question. It relates to interrupt handling, in my case a simple USART receive interrupt. 

I can see how the CubeMX tool manages the .c file: it has created an interrupt handler for me to insert my code into, and that's easy. I can detect the interrupt and spit out a short string in response, so it's working - all good. 

The question I have is: what can I safely do inside that ISR? Do I need to preserve any registers? My aim is to throw anything I receive into a ring buffer of some form, with a view to building a received packet of data.  At some point I will need to pick up each completed packet from a list. To do that I will possibly need to allocate some memory, and it would be useful not to have to put the whole implementation of that ISR in the auto-generated file, so it would be handy to call a function that contains the implementation of the ISR in its own compilation unit. 

One other question is that of controlling interrupts.  At some point, I would have a packet of data in a buffer somewhere that I would access from the main loop of the code. In order for me to safely access that packet from the receive buffer, I would need to first, stop/pause/prevent the next interrupt from interrupting me. On x86 architecture this is quite tricky as you have to rely on the atomic behaviour of specific instructions, and prevent the interrupt in a way that, should an interrupt occur while getting the data safely from the buffer, the interrupt controller will hold/queue that interrupt until the interrupts are re-enabled. 

What are the right semantics for doing this on the STM32 platform?

As ever, any help at all much appreciated. 
Gerry

    This topic has been closed for replies.

    10 replies

    Graduate II
    January 23, 2024

    And on the above, is it a correct statement to say the following about this code:-

     

    • The code in the green box is non-reentrant. That is to say, if an interrupt fires from USART3 while it is doing work but before it calls the read again on line 213, is it guaranteed that another interrupt from the same source will not re-enter here?

    • Is it safe, required, advisable, or inadvisable to disable the USART3 interrupt in this block while processing the interrupt, and then re-enable it before calling HAL_UART_IRQHandler()?

    Hope this makes sense...
    Gerry 

    gerrysweeney_0-1706036415454.png

     

    Super User
    January 23, 2024

    Didn't see this post before my 1st response

    DON'T DO THAT!!!!  Never call any blocking function from an interrupt handler.  It may/will delay the interrupt handler long enough to miss subsequent interrupts (i.e. miss incoming data).

    To answer your question - once the UART3 interrupt handler starts it will not get called again until the handler exits.  It may immediately start that same handler again if there is still a pending interrupt.  A different interrupt can interrupt this handler if it has a higher priority.

    Do not "disable" the interrupt.  It is effectively disabled by the NVIC interrupt priority structure, which only allows higher priority interrupts (lower priority values) to interrupt an active interrupt handler.

    Graduate II
    January 23, 2024

    I will not :) I was just doing it for testing. I was well aware that what I was doing in that test code was bad, but it let me see that the interrupt was working :) that's all.

    Thank you for the clarification on the semantics of the re-entry, that's what I was expecting/hoping. In essence, I think what you are saying is that it's on me to make sure the ISR completes in time to be done *before* the next character arrives and fires the interrupt again, otherwise I can miss chars... 

    No hardware FIFOs like there were in the 1980s 16550 UARTs then :(

    Thanks,
    Gerry

    Super User
    January 23, 2024

    Do as little as possible in the interrupt handler (including HAL callback functions if you use HAL).  You COULD have the ISR put the new byte into a buffer and check for whatever "end of message" indicator you might have, then set a global variable (declared "volatile") to signal that data is ready.  But the ISR cannot overwrite that buffer with additional data until it has been processed - so you either need ping-pong buffers (2 or more buffers that the ISR cycles between), or the one buffer needs to be big enough to hold multiple "messages".
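
    A minimal sketch of the ping-pong idea described above, not a definitive implementation. The names (rx_isr_byte, rx_take_message), the 32-byte message limit and the '\n' end-of-message marker are all assumptions made for illustration; the real ISR would call rx_isr_byte() with the byte it has just read from the UART data register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSG_MAX    32u    /* assumed maximum message length */
#define END_OF_MSG '\n'   /* assumed end-of-message marker */

static volatile uint8_t bufs[2][MSG_MAX]; /* the two ping-pong buffers */
static uint8_t fill_idx;                  /* buffer being filled (ISR only) */
static size_t  fill_len;                  /* bytes in that buffer (ISR only) */
static volatile bool    msg_ready;        /* set by ISR, cleared by main loop */
static volatile uint8_t ready_idx;        /* buffer holding the finished message */
static volatile size_t  ready_len;        /* length of the finished message */

/* Hypothetical hook: call from the UART receive ISR for every byte. */
void rx_isr_byte(uint8_t byte)
{
    if (fill_len < MSG_MAX) {
        bufs[fill_idx][fill_len++] = byte;
    }
    if (byte == END_OF_MSG || fill_len == MSG_MAX) {
        if (!msg_ready) {
            /* Hand the full buffer to the main loop, then fill the other. */
            ready_idx = fill_idx;
            ready_len = fill_len;
            msg_ready = true;
            fill_idx ^= 1u;
        }
        /* Otherwise the main loop still owns the other buffer:
           this message is dropped and the current buffer is reused. */
        fill_len = 0;
    }
}

/* Call from the main loop. Copies a finished message into out and
   returns its length, or 0 if no message is waiting. */
size_t rx_take_message(uint8_t *out, size_t out_size)
{
    assert(out != NULL);
    if (!msg_ready) {
        return 0;
    }
    size_t n = (ready_len < out_size) ? ready_len : out_size;
    for (size_t i = 0; i < n; i++) {
        out[i] = bufs[ready_idx][i];
    }
    msg_ready = false;   /* give the buffer back to the ISR last */
    return n;
}
```

    The ISR only touches the buffer it is filling and the main loop only touches the one flagged as ready, so neither needs to disable interrupts. The price is that a message is dropped if the main loop has not collected the previous one in time; more than two buffers would relax that.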

    With the STM architecture, an arguably better solution is to use DMA into a (relatively) large circular buffer.  Your non-interrupt code then periodically checks that buffer and processes any data that it contains.  See https://github.com/MaJerle/stm32-usart-uart-dma-rx-tx

    Either way - be aware that the HAL UART RX code (both interrupt and DMA) will abort the receive if it gets any UART error, like framing errors, parity (if you have it enabled), or overrun (which should NOT happen when using DMA).

    Graduate II
    January 23, 2024

    Hi Bob,
    Yes, I generally would to be honest. What I am trying to understand, though, is what I can safely do.  I cannot seem to find any reference to that.  

    As best I understand it, I will be receiving chars off the UART.  The start of a frame begins with a period greater than one character of all 1s, which I understand the UART can/will generate an interrupt on. If that is the case, then I have my serial stream framing; if not, I need to use a timer to receive characters until there is a period of not receiving characters any more. I am hoping the USART/interrupt controller will do the work for me here, the timer solution sounds horrible. 

    Gerry

    Graduate II
    January 23, 2024

    @TDK thank you for the clarifications, that's very clear: non-reentrant, that's helpful to know. The second part of my question was this.  If I use the ISR to stack my received chars into a buffer, at some point, say by setting a variable, I need to indicate I have received a frame of data, and the main loop will need to come and pick up that data, which will involve copying the message out of the buffer and moving a pointer to allow that part of the buffer to once again be free for the ISR to use.  

    During the time I am messing with the buffer from the main loop, how do I ensure that the ISR does not fire again and change the buffer that I am currently copying?

    And one other question: how do I know which registers it's safe to mess with during my ISR? For example, if I used memcpy, that's going to use some registers for the src/dst/len params, which I presume would be bad for an ISR, right?

    Super User
    January 23, 2024

    > During the time I am messing with the buffer form the main loop, how do I ensure that the ISR does not fire again and change the buffer that I am currently copying?

    Generally, you should write your accesses such that the ISR firing in the middle of something is okay, but you could also disable the interrupt for a time to prevent this from happening. HAL_NVIC_DisableIRQ/HAL_NVIC_EnableIRQ could be used here to suspend a particular interrupt.

    A ring buffer can be written in such a way that the ISR pushes data while the main loop pops data, with neither interfering with each other.
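
    A minimal sketch of such a ring buffer, assuming exactly one producer (the ISR) and one consumer (the main loop). The names rb_push/rb_pop and the 64-byte capacity are made up for illustration. It relies on aligned 32-bit loads and stores being single, uninterruptible accesses on Cortex-M, so each index is always read as a complete value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RB_SIZE 64u   /* assumed capacity; must be a power of two */
_Static_assert((RB_SIZE & (RB_SIZE - 1u)) == 0u, "RB_SIZE must be a power of two");

static volatile uint8_t  rb_data[RB_SIZE];
static volatile uint32_t rb_head;   /* written only by the ISR (producer) */
static volatile uint32_t rb_tail;   /* written only by the main loop (consumer) */

/* ISR side: store one byte. Returns false (byte dropped) if the buffer is full. */
bool rb_push(uint8_t byte)
{
    uint32_t head = rb_head;
    if (head - rb_tail == RB_SIZE) {   /* free-running counters: difference = fill level */
        return false;
    }
    rb_data[head & (RB_SIZE - 1u)] = byte;
    rb_head = head + 1u;               /* publish only after the data is stored */
    return true;
}

/* Main-loop side: fetch one byte. Returns false if the buffer is empty. */
bool rb_pop(uint8_t *out)
{
    assert(out != NULL);
    uint32_t tail = rb_tail;
    if (rb_head == tail) {
        return false;
    }
    *out = rb_data[tail & (RB_SIZE - 1u)];
    rb_tail = tail + 1u;               /* release the slot only after it is read */
    return true;
}
```

    Because each index has a single writer, an interrupt arriving in the middle of rb_pop() can only make more data available; it cannot corrupt the consumer's view. The indices count freely and are masked on access, so the full/empty check is a single subtraction with no separate wrap-around case.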

    > And one other question, how do I know which registers its safe to mess with during my ISR, for example, if I used memcpy thats going to use some registers for src/dst/len param, that I presume for an ISR would be bad right?

    Core registers? Typically those aren't manipulated at the C code level. But you don't need to preserve anything. Core registers (r0, r1, r2, etc...) are pushed before the ISR starts. It takes something like 12 cycles for this to happen. They are restored when the ISR exits.

    Graduate II
    January 23, 2024

    I have to deal with 8 or more UARTs on a regular basis.  I set up a FIFO queue for each UART, and all I do in the ISR is enqueue the received message on the queue.  Then the non-interrupt part of my code looks for a queue length > 0.  I use the STM32H7A3, which has a character match (CM) function, so the ISR doesn't get called until the UART detects the end of line character I've defined, almost always a LF.  I haven't tested the CM functionality with non-DMA, but since the CM interrupt is in the UART, I'm guessing it will work.  When I was using an STM32 processor without CM, I set up the UART to interrupt on every character.  When I got a character, I just stuck it in an array.  When the received character matched my end of line character, I enqueued the character array on the FIFO.

    I know a lot of people will yell about interrupting on every character being inefficient, but these processors are so fast it isn't a problem for my moderately real time requirements.

    Here's a good reference with example code for setting up a FIFO. https://www.techiedelight.com/queue-implementation-cpp/

    BTW, most STM32 processors do have FIFOs attached to the UARTs that you can set up in CubeMX.

    I understand your point about not wanting to modify the auto-generated code too much.  But if you put your code between the designated comments in the autogenerated code, CubeMX will preserve your code when it regenerates.  Also you should check to see if the IRQ handlers that CubeMX generates are declared as weak so you can override them.  In my case they aren't, so I just put a call to my IRQ handler in the autogenerated IRQ handler code between the "user code here" comments and I don't have to touch the autogenerated code.  I do have to declare my IRQ handler as external to make this work. 

    Full disclosure, I'm not a very experienced embedded programmer so don't take anything I say as gospel.

     

    Graduate II
    January 23, 2024

    Hi Magene,
    Thanks for your input.  I am very comfortable using std::queue but I am lost as to how I can do that safely.  The data I am trying to receive is some bytes (a few of them) followed by an IDLE frame.  So my plan, broadly speaking, was to collect the characters one by one in the ISR into a buffer, and at the point I receive an IDLE state interrupt, transfer those bytes (the packet) to the queue.  Easy enough, and I expect if that's all I was doing, then it would work fine. 

    However, in the application part of the system I will need to periodically check the queue and, on finding one or more packets of data in that queue, dequeue them and process them.  

    The problem I am struggling with is that I cannot see how I can access that queue and safely dequeue, ensuring that things do not go wrong because I receive another interrupt that puts something into that queue while I am trying to take something out of it.   Any idea?

    Gerry

    Graduate II
    January 23, 2024

    Your question "how I can access that queue and safely dequeue ..." is very pertinent. I think you're talking about dealing with something like this.  

    1. The ISR enqueues a message on the queue and exits.

    2. The non-ISR part of your code detects something is in the queue and starts dequeuing.  

    3. Before the dequeue process is finished, the ISR fires again and enqueues a new message.

    A thread safe queue can handle this situation, it requires proper sequencing of when you change the value of the pointers to the head and tail of the queue.  I thought the link I gave you talked about this but I didn't see much.  If you google around for something like "thread safe" FIFO queue, you'll be able to educate yourself about how to implement a queue that can deal with the above.

    And BTW, the std::queue is immensely resource intensive so I use a much lighter weight version I wrote based on the link I gave you.

    Graduate II
    January 24, 2024

    And googling for "lock free Queues" is helpful also. Here's one link that might be useful 

    https://moodycamel.com/blog/2013/a-fast-lock-free-queue-for-c++

     

     

    Graduate II
    January 24, 2024

    Thanks for the link. I am quite familiar with threading and lock-free queues etc, but these implementations require the use of threads and synchronisation primitives like mutexes etc... Under the hood, these synchronisation objects are provided by the operating system, which in turn uses very specific characteristics of the CPU and specific atomic operations in order to function reliably.  None of this applies when programming bare metal in the absence of a scheduler. 

    An interrupt routine is not a thread. It's a lot like one conceptually, but in practice you have to rely 100% on very specific characteristics of the CPU and/or interrupt hardware in order to ensure there is controlled access to critical data structures. 

    So in the case of the STM32 what I was asking is, what is the *correct* way of ensuring that if I mess with a buffer pointer during an interrupt, I am not breaking other application code that was in the middle of messing with the same pointer value in memory at the time the interrupt fired.  The solution here cannot be a generic lock free thing, it really has to be something very very specific to the way in which the STM processor/interrupt controller hardware works. 

    I spent many hours yesterday evening trawling the internet looking for examples of how to do this, and I am so surprised that there is so little information out there, STM32 parts are very popular, yet there appears to be a total lack of information out there...even on GitHub where you can generally find most things. 

    Case in point, I want to implement a simple ISR to read chars off a UART and place them into a buffer. I want to do that on interrupts; it's a slow connection (9600 baud), so it should be a breeze for a 180 MHz 32-bit part, and interrupts per char should be absolutely fine.  In my main code I want to **safely** access that same buffer while the interrupts are potentially still firing.  This is such a common use case, it's truly remarkable how none of the ST documentation or examples appear to show how to do this.  

    I was always told that STM32/ARM stuff has a very high barrier to entry, and I am starting to understand why that is now.  The HAL is one way of doing things, yet examples around the net seem to show 100 different ways of doing the same thing.   

    Other parts I have worked with (ESP32, Microchip PIC18/32 for example, even Atmel (now Microchip too) parts) are so much better documented, with so many more decent examples and explanations, that the barrier to entry is far lower. 

    A few years ago I worked on a project that used an ESP32. This is a Chinese part, and at the time it was new and barely documented, yet even that was 1000x simpler to start working with: the examples were good, and the software supplied was part of the PlatformIO ecosystem.  Within hours I was writing C++ code and quite a complex application, about 16,000 lines of code. All of the elements did what they said on the tin, and most importantly there were good working examples to draw from. 

    Having tried for a few days (on and off), I was thinking this morning that I may just give up on STM32 parts at this point. It's too difficult to do the basics, and you seem to have to know an awful lot of stuff that does not appear to be documented.  Unless I am looking in the wrong places, but I have really tried...

    Super User
    January 24, 2024

    Perhaps show some code with what you think will break. In general, your ISR and your main thread shouldn't be changing the same pointer variable at all.

    In the circular buffer example, have a pointer to the head of the buffer and the tail. The ISR updates the head pointer, the main thread updates the tail.

    None of this is STM32-specific. This is all applicable to general coding in C with threads.

    There are the LDREX and STREX instructions, but these shouldn't be needed in general. You can use LDREX/STREX to implement a mutex object, but again, that shouldn't be needed for a circular buffer implementation.
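
    For completeness, a sketch of what such a mutex-like object might look like in portable C11, rather than hand-written LDREX/STREX. This is an assumption about tooling, not something from the thread: on Cortex-M3/M4 I would expect GCC or Clang to lower the atomic exchange to an LDREX/STREX loop, but check the disassembly for your toolchain. The names buf_try_lock/buf_unlock are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* true while someone holds the lock */
static atomic_bool buf_locked = false;

/* Try to take the lock without waiting.
   Returns true if the caller now owns it, false if it was already held. */
bool buf_try_lock(void)
{
    /* Atomically write 'true' and get the previous value back. */
    return !atomic_exchange_explicit(&buf_locked, true, memory_order_acquire);
}

/* Release the lock. Must only be called by the current owner. */
void buf_unlock(void)
{
    assert(atomic_load_explicit(&buf_locked, memory_order_relaxed));
    atomic_store_explicit(&buf_locked, false, memory_order_release);
}
```

    Note that this is a try-lock on purpose. An ISR must never spin waiting for a lock held by the main loop, because the main loop cannot run again until the ISR returns; if the ISR fails to get the lock it has to defer or drop the work instead.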

    Graduate II
    January 24, 2024

    Hi TDK,

    That's fair, I have not started to code the ring buffer implementation. I agree it's generally generic, and you are right that the ISR will change the head pointer and the main app will change the tail pointer.  But at some point the ISR should ensure there is no overflow, and more typically the main app will need to check whether there is data to read. The way it will do that is to check if (tail < head) but also handle the wrap around, and those checks most definitely will not be atomic, so you could end up reading a value which is wrong.  Of course you can design the buffer to be big enough that things are much less likely to go wrong, but generally speaking it's better to have guarantees that at those critical moments there is some predictable and understood atomicity.

    The pointer to LDREX/STREX is useful and has led me here. I think this is pretty much what I was looking for, and it gives me enough to work on, I expect: https://devblogs.microsoft.com/oldnewthing/20210614-00/?p=105307

    I will share the code if I run into a problem. As it is now, I am struggling to get the interrupts to fire predictably. I handle the IDLE state interrupt and turn on an LED each time it triggers, and I trigger it by simply pressing a key in the terminal connected to the USART3 port. I would expect the IDLE interrupt to trigger after every key press, but what it does (very reliably) is trigger only after every second keypress.  Seems a bit weird. I must be doing something wrong, but I really cannot see what.  The lack of any examples is the killer here: I have no point of reference, and am not quite at the stage of breaking out the scope/logic analyser to try and reverse engineer what the UART is doing... 

    I read somewhere that USART3 and USART4 share the same interrupt, so I am wondering if it might have something to do with that. To be honest, it's a bit like poking around in the dark - quite frustrating. I can't even find any official documentation that provides anything like the level of technical detail one might need to understand how the USART actually works.  The MCU data sheet tells you what it is and what features it has, but that's it. I must be looking in the wrong places - and that's after hours of googling...  


    Thank you
    Gerry

    Graduate II
    January 24, 2024

    @TDK 

    Ahh thank you, well I was looking for the detailed reference manual and I could not find it... not even under the technical documentation section for the part.  The chip I am using is the STM32F429ZI on the Nucleo 144 eval board.  

    Can I ask how you located that document, so I can try and find the same one for the chip I am using please?

    I think you are right about the HAL, the abstraction seems a bit weird for sure. I found a post on Stack Overflow that explains that when you call the HAL_UARTx_xxxx_IT function to receive, you have to tell it how many bytes, and under the hood it maintains a counter; once that count reaches zero it disables the interrupt again. That sounds totally dumb to me. 

    I don't suppose you know where I might get an example of how to enable interrupts and implement the ISR without the HAL?

    Thanks again for your help TDK...

    Gerry

    Super User
    January 24, 2024

    > Can I ask how you located that document, so I can try and find the same one for the chip I am using please?

    Personally, I just google "stm32f429 reference manual" and it typically comes up.

    It should also be listed on the part page. Navigate to Documentation and then to Reference Manuals

    https://www.st.com/en/microcontrollers-microprocessors/stm32f429zi.html#

    TDK_0-1706121395465.png

     

     

    > I don't suppose you know where I might get an example of how do enable interrupts and implement the ISR without the HAL?

    This one is linked quite a bit, but uses DMA.

    https://github.com/MaJerle/stm32-usart-uart-dma-rx-tx

    I don't know of a non-DMA based one, but I'm sure they are out there. If I find one later I'll link it here.

    Graduate II
    January 29, 2024

    Just wanted to post an update: I finally got this working. In the end I used HAL DMA for reading and writing, and I implemented a (really really) simple slot-based circular buffer. Sends and receives are on a message by message basis. I got everything I needed to finally understand how the HAL likes to work for DMA based sends and receives from this video.  As with all these things, it's pretty simple when you know how. 

    There is still the problem that accessing the circular buffer would not pass the basic sniff test in a multi-threaded environment. As far as I can perceive, there are still edge cases that could go wrong when accessing the variables that control the circular buffer head/tail if interrupted mid-stream.  In practice though, so long as your reader process is able to read and process content out of the buffer faster than your incoming data stream can put data in, you can basically ignore that possibility.  I do not yet have a good sense of the performance of the microcontroller because I am not yet able to measure it (my Agilent digital scope let out the magic smoke a year or so ago, still on the list to fix), but my incoming data stream is only at 9600 baud. I am simply receiving a message, doing a CRC check, wrapping it in a header and second CRC and sending it out over the USB serial at 115200 baud, so my ring buffer is barely exercised; the MCU has plenty of horsepower. 

    Anyway, please see link to video. If you want to read serial data with IDLE detection this is how you do it. 

    Thanks to everyone who answered my questions on this thread. 

    https://www.youtube.com/watch?v=RpDOHqoVNTs


    Super User
    January 30, 2024

    As long as there is only 1 "writer" and 1 "reader", and if the IRQ only modifies the "head" pointer (writing data), and the non-IRQ code only modifies the "tail" pointer (reading data) there should be no race conditions or edge cases and it should be RTOS compatible.  The non-IRQ code needs to only update the "tail" pointer when it is done - no intermediate values.  Do all manipulations in a temporary copy then write when done.  Tilen Majerle has an example on his github site https://github.com/MaJerle/lwrb
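
    A rough sketch of the "work on a temporary copy, then write once" rule described above. This is not lwrb's code; the names ring_put/ring_read and the 16-slot size are assumptions for illustration. One slot is deliberately left unused so that head == tail always means "empty".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 16u   /* assumed size; holds at most RING_SIZE - 1 bytes */

static volatile uint8_t ring_buf[RING_SIZE];
static volatile size_t  ring_head;   /* next slot to write; only the IRQ writes it */
static volatile size_t  ring_tail;   /* next slot to read; only non-IRQ code writes it */

/* IRQ side: returns 1 if the byte was stored, 0 if the buffer was full. */
int ring_put(uint8_t b)
{
    size_t head = ring_head;
    size_t next = (head + 1u) % RING_SIZE;
    if (next == ring_tail) {
        return 0;                 /* full: drop the byte */
    }
    ring_buf[head] = b;
    ring_head = next;             /* one write, after the data is in place */
    return 1;
}

/* Non-IRQ side: copy up to max bytes into dst, return how many were copied. */
size_t ring_read(uint8_t *dst, size_t max)
{
    assert(dst != NULL);
    size_t head = ring_head;      /* one snapshot; later IRQs can only add data */
    size_t tail = ring_tail;      /* temporary working copy */
    size_t n = 0;
    while (tail != head && n < max) {
        dst[n++] = ring_buf[tail];
        tail = (tail + 1u) % RING_SIZE;   /* wrap handled on the local copy */
    }
    ring_tail = tail;             /* single final write, no intermediate values */
    return n;
}
```

    If the interrupt fires anywhere inside ring_read(), the reader simply works from a slightly stale head and picks up the new bytes on its next call. The IRQ never sees a half-updated tail because tail is written exactly once, as a single aligned word.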

     

    Graduate II
    January 31, 2024

    Hi Bob,

    I understand the microcontroller is basically single threaded, and an interrupt is simply a suspension of what is currently processing and a crude context switch (i.e. call an ISR function). I get that, but the point I am making above is that your C code generally translates to a larger number of machine code instructions.  Forgive the fact this example is in x86 code (the same thing applies); it is a simple function to get the next slot in a simple 16-slot ring buffer. The vast majority of these instructions are NOT atomic, and certainly the registers can easily be re-used.  So if you take that example code, it could very easily be that your main loop is executing this code, at the line where you are setting the eax register to the value contained in the memory variable MainBuf_tail. An interrupt arrives, so your program stops immediately after that instruction, and the CPU calls your ISR.  In your ISR code you do some other work that involves using the eax register, so its value is changed.  Upon the ISR's completion, your code resumes at the next instruction in the code below, where ecx is being set. The line after that compares ecx with eax, but eax now has a different value in it than it did before the ISR was executed.  At that point, at best this code will give the wrong result, at worst it could crash. 

    So it does depend on the CPU, but as a general rule, updating a pointer (which is in a RAM memory location) will almost always involve temporary variables, even if those are just registers as shown in this example. 

    All CPUs that are thread capable, or multi-core, have some very specific and well documented instructions that are guaranteed to be atomic, basically uninterruptible; these are typically register-to-register value exchange or register-memory-to-register operations. CPU architectures that are not inherently designed with multi-core in mind will have some well defined way of safely executing ISRs; generally this means the ISR should save any registers it's going to mess with, typically on the stack, and then restore the register values before returning from the ISR. 

    I was asking about the ARM architecture because I do not know what is recommended, and in C, with quite a large number of registers at the disposal of libraries and so on, it's not clear what registers should be saved/restored on entry/exit of the ISR. 

    Of course, people say, do as little as possible in the ISR, that makes timings better but also reduces the risk of the type of problem I mention above, but thats relying on hope rather than absolutes, which I would never do when programming for any professional development project.  

    When I figure out how to drive the debugger properly, for my application I will do it by reviewing the disassembly, work out what registers are changed and save/restore these.  

    For some background, back in the DOS/Early Windows era I developed a lot of pre-emptive background processes on x86, and that was a good time to learn that if I wanted my programs to not crash and be 100% reliable, even when threading was not a thing in DOS, interrupts on timers, serial and network events very much were, so you really had to pay attention to the state the CPU was in at all time. 

    It would seem reasonable to assume that the ARM core does not have some magical "save and restore all registers automatically when entering an ISR and restore them all back on exit"... I mean it might, and would be nice, but I have not found any documentation either way, which is why I was asking the question. 

    Hope that makes some sense. 

     

    gerrysweeney_0-1706660116039.png

     

    Graduate II
    January 31, 2024

    As a further update: another article I have been reading suggests the ARM Cortex does not itself save anything apart from the stack, frame and instruction pointer registers; it appears to be left up to the C compiler to implement the prolog/epilog to save/restore registers.  The ABI used by the compiler seems to dictate which registers are/are not saved/restored.  So the answer would appear to be - consult the "C" compiler... 

    Fair enough...

    gerrysweeney_1-1706661670667.png

     

    Super User
    January 31, 2024

    > It would seem reasonable to assume that the ARM core does not have some magical "save and restore all registers automatically when entering an ISR and restore them all back on exit"

    The Cortex-M architecture with its NVIC interrupt controller is relatively new. They took lessons from most things you've mentioned and the result is very efficient and elegant. Yes, it saves and restores the important  registers automatically. In short, interrupt handling works mostly intuitively. It does what you want for small embedded systems, and in the way you want it. Yes there is an ABI (few variants for different FPUs), provided by ARM.
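
    To make "the important registers" concrete, here is a sketch of the basic frame that Cortex-M hardware pushes onto the stack on exception entry when no floating-point context is active. The struct and function names are made up for illustration; the register list and order follow the ARMv7-M exception entry behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Basic exception frame, lowest stack address first.
   Pushed by hardware on entry and popped by hardware on return. */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;     /* link register of the interrupted code */
    uint32_t pc;     /* address the interrupted code will resume at */
    uint32_t xpsr;   /* program status register */
} cortexm_exception_frame_t;

_Static_assert(sizeof(cortexm_exception_frame_t) == 32, "basic frame is 8 words");
_Static_assert(offsetof(cortexm_exception_frame_t, pc) == 24, "PC is the 7th word");

/* Hypothetical helper, e.g. for a fault handler that wants to report
   where the interrupted code was executing. */
uint32_t frame_return_address(const cortexm_exception_frame_t *frame)
{
    assert(frame != NULL);
    return frame->pc;
}
```

    These are exactly the registers the ARM procedure call standard lets a called function overwrite. The remaining ones (r4 to r11) must be preserved by any C function that uses them, so the compiler's ordinary prologue/epilogue covers them, which is why an ISR can be written as a plain C function with no special save/restore code. On parts with an FPU, such as the Cortex-M4F in the STM32F429, a longer frame that also holds s0 to s15 and FPSCR is used when floating-point context is active.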

    > I have not found any documentation either way, which is why I was asking the question

    Come on, Mr. gerrysweeney. Documentation abounds, and now there is ChatGPT too.

    Please take your time, read the Programmer's manuals, play with the code in the debugger. There are some books, but who reads (and writes) books these days. Enjoy the simplicity, power and elegance of Cortex-M before it got messed up with ugly complications - TZ, security and so on ))

    Also, interesting reading is on memory access ordering, memory barriers, "atomic" support in C and C++ (stdatomic.h and std::atomic). And the use of "volatile" /* old folks who believe they know all about volatile are in for a surprise! */