Explorer
July 16, 2025
Question

Interrupt curiosity (STM32G491)

  • 14 replies
  • 1319 views

Hello friends : )

I've recently tested the NUCLEO-G491RE board for an upcoming redesign.

The current design depends on quite tight interrupt latency, so that is where I began.

As I understand it, with no FPU usage in the ISR (ASPEN = LSPEN = 0), one can expect up to 12 SYSCLK cycles of latency.

In this case that is 75 ns at 160 MHz, which sounds pretty decent (today it's about 73 ns).
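The 75 ns figure is just the cycle count divided by the clock frequency. A minimal sketch of that conversion, assuming only the 12-cycle and 160 MHz values quoted above (the helper name `cycles_to_ns` is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Convert a number of core clock cycles to nanoseconds.
 * sysclk_hz is the core clock in Hz (160000000 in this post). */
static double cycles_to_ns(uint32_t cycles, uint32_t sysclk_hz)
{
    assert(sysclk_hz > 0u);               /* avoid division by zero */
    return (double)cycles * 1e9 / (double)sysclk_hz;
}
```

Calling `cycles_to_ns(12u, 160000000u)` gives the 75 ns mentioned above; each SYSCLK cycle is 6.25 ns at this clock.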

 

I started off by creating two very similar interrupt service routines that I intended to toggle between.

Each one pulses an output pin and triggers the other one:

Attributes: 'interrupt' + 'optimize("-O2")' + 'section(".RamFunc")' + 'aligned(8)' + 'naked'

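static void onTimeStampEvent(void)
{
 GPIOA->BSRR = 1u<<12;            // set PA12 (start of pulse)
 NVIC->ISPR[1] = 1u<<(39-1*32);   // pend USART3 (IRQ 39)
 GPIOA->BRR = 1u<<12;             // reset PA12 (end of pulse)
#ifdef NAKED
 __ASM volatile ("BX LR":::);     // naked: no compiler epilogue, return by hand
#endif
 return;
}
 
static void onReceiveEvent(void)
{
 GPIOB->BSRR = 1u<<14;            // set PB14 (start of pulse)
 NVIC->ISPR[0] = 1u<<(11-0*32);   // pend DMA1_CH1 (IRQ 11)
 GPIOB->BRR = 1u<<14;             // reset PB14 (end of pulse)
#ifdef NAKED
 __ASM volatile ("BX LR":::);     // naked: no compiler epilogue, return by hand
#endif
 return;
}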
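The hand-written index and shift in the `NVIC->ISPR[...]` lines follow one rule: IRQ number n lives in word n / 32, bit n % 32. A minimal host-side sketch of that rule, assuming a plain array standing in for `NVIC->ISPR` (the names `ispr_index`, `ispr_mask` and `pend_irq` are hypothetical; on the target, CMSIS `NVIC_SetPendingIRQ()` does the same job):

```c
#include <assert.h>
#include <stdint.h>

#define IRQ_DMA1_CH1 11u   /* IRQ numbers as used in the post */
#define IRQ_USART3   39u

/* Word index into the ISPR register array for a given IRQ number. */
static uint32_t ispr_index(uint32_t irqn) { return irqn >> 5; }

/* Bit mask inside that word for a given IRQ number. */
static uint32_t ispr_mask(uint32_t irqn) { return 1u << (irqn & 31u); }

/* Pend an interrupt. 'ispr' stands in for NVIC->ISPR here, so the
 * sketch can run on a PC; ISPR is write-1-to-set, so a plain
 * assignment of the mask is enough. */
static void pend_irq(volatile uint32_t *ispr, uint32_t irqn)
{
    assert(irqn < 240u);   /* ISPR[0..7] covers 240 external interrupts */
    ispr[ispr_index(irqn)] = ispr_mask(irqn);
}
```

For IRQ 39 this yields word 1, bit 7, and for IRQ 11 word 0, bit 11, which matches the two expressions in the handlers.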

In main the usual suspects:

  • HAL and system initialization
  • Peripheral initialization
  • Setting up interrupts

Finally, the main loop:

 u = 0;
 while(1)
 {
 NVIC->ISPR[0] = 1<<(11-0*32); // DMA1_CH1
 u++;
 }

 

This works splendidly, sort of...

<PicoScope shot 1>

The total time for a complete round trip is about 381 ns, or 61 SYSCLK.

Variable "u" in main never changes from 0, which suggests the interrupts run back to back, as expected.

The thing is, shouldn't tail-chaining occur?

 

Next I tried different priority levels, with yellow being the less important one:

<PicoScope shot 2>

Priority in action, indeed; however, a round trip now takes 562 ns (90 SYSCLK).

Still no tail-chaining, and worse, lots of extra time for the same amount of work.
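To put a number on the extra time, the two scope readings can be converted back to cycles and subtracted. A minimal sketch, assuming the 381 ns and 562 ns readings and the 160 MHz clock from this post (the helper name `ns_to_cycles` is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Convert a measured time in nanoseconds to the nearest whole number
 * of core clock cycles at sysclk_hz. */
static uint32_t ns_to_cycles(double ns, uint32_t sysclk_hz)
{
    assert(ns >= 0.0);
    return (uint32_t)(ns * (double)sysclk_hz / 1e9 + 0.5);  /* round to nearest */
}
```

With these inputs, `ns_to_cycles(381.0, 160000000u)` is 61 and `ns_to_cycles(562.0, 160000000u)` is 90, so the mixed-priority case costs about 29 SYSCLK (roughly 181 ns) more per round trip.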

 

Where have I gone wrong?

Any help appreciated = )

/Hen

 


    Explorer
    July 30, 2025

    I finally got some in-system test results: just above 30 ns of spread.

    That is excellent, however the full implementation may knock that smile off later.

    Soon the holiday season ends, and we may start this endeavor.

    Thank you for all the input.

    Explorer
    August 3, 2025

    Still finding answers...

    I think I measured or recorded something else in item 4 above (my last post, from July 26th).

    I cannot reproduce it and now get 52ck vs 69ck, which removes the "???".

    4. stack in 0x2xxx code in 0x1xxx --- 52ck vs 69ck.

    I swapped the priorities, i.e. letting the less important task take the lead.

    5. stack in 0x2xxx code in 0x1xxx --- 52ck vs 72ck.

    That would mean different totals depending on the sources.

     

    Having escalating interrupts still carries a hefty penalty compared with not having them.

    Well, it depends on perspective, I suppose, but 14 to 20 SYSCLK is notable in my book.

     

    BTW, somewhere in this rabbit hole I was not thinking clearly when I talked about tail-chaining.

    It cannot happen in the examples above; only interrupts that are already waiting can tail-chain. Sorry for that.

    Explorer
    August 6, 2025

    I think the penny has finally dropped.

     

    With same priority, i.e. no escalation, only one "enter/leave handler state" occurs (per source).

    Escalation yields the same for the more prioritized source, but not the other one.

    During escalation, the "enter/leave part" halts the other one, hence doubling its penalty.

    Am I on to something or am I trapped in this mist (as usual)?

     

    BTW, this would minimize the spread on "The One", which I seem to recognize in the real system.

    Note that the code running in the real system is not the same, so there's room for differences.

    Super User
    August 7, 2025

    I don't quite understand what you are trying to say here, but as I've said above, the best strategy with Cortex-Mx (and modern 32-bitters in general), when it comes to hard real-time operation, is to strive to use purely hardware as much as possible. Factors which potentially impact interrupt latencies and jitter are just too many, and they tend to be poorly documented.

    JW

    Explorer
    August 10, 2025

    Yes, I agree on that point; relying on instruction execution to meet real-time requirements is problematic.

    In our case the DMA takes the hard part, but the result has to be analyzed and ready before the next event.

    The 30ns spread is what the DMA usage offers.