Explorer
February 22, 2024
Question

DMA Transfer Complete Flag strange behavior


Hi, 

I am using 3 channels of DMA1 on an STM32G4* to transfer data from the 3 ADCs to memory. At a certain point I want to make sure that all transfers are done, so I enter a while loop:

 

 

uint32_t timeoutCnt = 0;
while (!LL_DMA_IsActiveFlag_TC1(DMA1) || !LL_DMA_IsActiveFlag_TC2(DMA1) || !LL_DMA_IsActiveFlag_TC3(DMA1))
{
    // up to 3 load-and-compare operations per loop iteration, roughly guessed at 70 ns; waits for at most 28 µs
    if (++timeoutCnt > 400)
    {
        // error handling
    }
}
LL_DMA_ClearFlag_TC1(DMA1);
LL_DMA_ClearFlag_TC2(DMA1);
LL_DMA_ClearFlag_TC3(DMA1);
// go on with work...

 

 

Usually the DMA transfers should be done by the time execution gets here; in rare cases they may take a few hundred ns more to complete.

This works most of the time. Sometimes the while loop gets stuck, that is, one of the TC flags never gets set. There is no straightforward way to trigger the failure. Some software builds seem more susceptible, and some hardware boards are more prone to show the error. Most builds and most boards are totally immune and run for days (>>10^9 passes) without a problem. This suggests that some subtle timing problem may be involved. (A different SW build moves the code around, and different HW means that the timing of external interrupts and the xtal frequency are slightly different.)

I can think of 2 reasons:

  1. Some other event clears the TC bit before I do my test. As mentioned, in most cases the DMAs are long completed when I check them, so there would be ample time for "something else" to clear them. I just have no idea what "something else" could be. 
  2. The fast polling of the TC flag (and the bus traffic on AHB1 resulting from this polling) stalls the DMA transfer and actually keeps the TC from becoming set. Some sort of bus deadlock. But ADC is on AHB2 and the bus matrix uses a Round Robin arbitration, so I see no reason why the DMA should become stuck. 

Any ideas or hints would be highly appreciated!

Update: I am aware of the global clear bit (CGIFx) in DMA_IFCR, and I am very sure that I never use this bit on the DMAs in my code.


    3 replies

    Super User
    February 22, 2024

    Probably a subtle code bug. Unlikely to be a silicon issue. Perhaps the transfer never gets started. Consider using a real timeout timer instead of a software-based one, although I think the way you've written it is okay. Consider increasing the timeout to see if it eventually passes. Consider logging the start of each ADC transfer.

    Loops of the form "while (!flag);" are used all the time, so the polling itself is probably not going to cause hardware issues. How much bandwidth are the ADCs using?
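A minimal sketch of the "real timeout timer" idea, written so that it also runs on a host. It assumes a free-running 32-bit counter is available; on a Cortex-M4 target `read_cycles` could return `DWT->CYCCNT` and `cond` could test the three TC flags. The function name, the hook types and the fake counter are illustrative assumptions, not part of the original code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hooks: a free-running counter and the condition to wait for. */
typedef uint32_t (*cycle_source_fn)(void);
typedef bool (*condition_fn)(void);

/* Returns true if cond() became true before timeout_cycles elapsed. */
static bool wait_with_timeout(condition_fn cond, cycle_source_fn read_cycles,
                              uint32_t timeout_cycles)
{
    assert(cond != NULL && read_cycles != NULL);
    const uint32_t start = read_cycles();
    while (!cond()) {
        /* Unsigned subtraction stays correct across counter wrap-around. */
        if ((uint32_t)(read_cycles() - start) >= timeout_cycles) {
            return cond(); /* one last check before giving up */
        }
    }
    return true;
}

/* Host-side stand-ins so the sketch can be exercised off-target. */
static uint32_t fake_cycles;
static uint32_t fake_read_cycles(void) { fake_cycles += 10u; return fake_cycles; }
static bool never_done(void) { return false; }
static bool always_done(void) { return true; }
```

Unlike a loop counter, the elapsed time here does not depend on how the compiler schedules the flag reads, so the timeout stays the same from build to build.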

    stryga (Author)
    Explorer
    February 22, 2024

    Thanks for your suggestions. The ADCs are triggered every 60 to 100 µs, the ADC clock is 42 MHz and the sample time is 47.5 cycles, which translates to about 1.13 µs, so the SAR should have ample time to finish. 
    Well, in the beginning we had no timeout counter and back then the devices just stalled. So, I am quite sure that the situation never heals on its own. 

    The ADCs are triggered from a timer through TRGO. I do not modify the timer config after init, so the TRGO should be reliable. The code testing the completion is triggered through the same timer but then -> DMA (other channel but also DMA1) -> SPI -> DMA1 (again other channel) -> TC-interrupt. 

    Would you be aware of any "interference" between the different channels of DMA1? 

    Super User
    February 22, 2024

    > The code testing the completion is triggered through the same timer but then -> DMA (other channel but also DMA1) -> SPI -> DMA1 (again other channel) -> TC-interrupt. 

    Makes me wonder if you're clearing flags incorrectly in the other channel. Are you doing an improper read-modify-write to clear flags on the SPI side?

    Following @waclawek.jan's suggestion would likely show the issue.
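One hypothetical way such cross-channel interference could arise is a flag clear that is wider than intended. The host-side model below assumes that flags are cleared by writing 1 to the matching bit of DMA_IFCR and that each channel owns four consecutive flag bits (GIF, TCIF, HTIF, TEIF); both assumptions should be checked against the reference manual. `sim_isr`, `sim_write_ifcr` and the two helper functions are made up for illustration and are not taken from the poster's code.

```c
#include <assert.h>
#include <stdint.h>

#define FLAGS_PER_CH 4u
/* TCIF is assumed to be the second flag bit of each 1-based channel. */
#define TCIF_BIT(ch) (1u << (FLAGS_PER_CH * ((ch) - 1u) + 1u))

/* Simulated flag register with write-1-to-clear semantics. */
static uint32_t sim_isr;
static void sim_write_ifcr(uint32_t value) { sim_isr &= ~value; }

/* Intended behaviour: clear only the TC flag of one channel. */
static void clear_tc_only(uint32_t ch)
{
    assert(ch >= 1u && ch <= 8u);
    sim_write_ifcr(TCIF_BIT(ch));
}

/* Hypothetical bug: "clear whatever is pending" also wipes the flags
   of channels that some other piece of code is still waiting on. */
static void clear_all_pending_bug(void)
{
    sim_write_ifcr(sim_isr);
}
```

In this model, a handler for the SPI channel that calls something like `clear_all_pending_bug()` would leave a later poll on channels 1 to 3 waiting for flags that have already been cleared.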

    Super User
    February 22, 2024

    When interrupt occurs, read out and check/post content of ADC and relevant DMA/DMAMUX registers.

    JW
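A sketch of capturing register contents at the moment of failure, so they can be inspected or posted later. The struct layouts are simplified stand-ins rather than the CMSIS definitions; `CCR` and `CNDTR` are named after the DMA channel configuration and remaining-count registers, and ADC or DMAMUX registers could be added in the same way. All type and function names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the per-channel registers of interest. */
typedef struct {
    volatile uint32_t CCR;   /* channel configuration */
    volatile uint32_t CNDTR; /* number of data items left to transfer */
} chan_regs_t;

/* Plain RAM copy that can be logged or examined in a debugger. */
typedef struct {
    uint32_t isr;
    uint32_t ccr[3];
    uint32_t cndtr[3];
} dma_snapshot_t;

/* Copy the shared flag register. */
static void snapshot_flags(dma_snapshot_t *out, const volatile uint32_t *isr_reg)
{
    assert(out != NULL && isr_reg != NULL);
    out->isr = *isr_reg;
}

/* Copy one channel's registers into slot idx (0..2). */
static void snapshot_channel(dma_snapshot_t *out, unsigned idx, const chan_regs_t *ch)
{
    assert(out != NULL && ch != NULL && idx < 3u);
    out->ccr[idx] = ch->CCR;
    out->cndtr[idx] = ch->CNDTR;
}
```

Taking the copy as the first thing in the failure path keeps the values as close as possible to the state that caused the timeout.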

    Graduate II
    February 22, 2024

    What's the ADC buffer size?
    Assuming your count to 400 takes 400 * 3 cycles, that's just about 7 µs at 170 MHz.
    Maybe your check sometimes gets called directly after a new DMA transfer was started?

     

    Is it always the same DMA channel that gets "stuck"?

    Do you actually do anything where your comment "// error handling" is?

    In case of failure I would check which DMA channel is still active, set a flag, break the loop, and so on...
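The suggestion above (find out which channel is still active, record it, leave the loop) might look roughly like this. The sketch reads the flag word through a hypothetical `read_isr` hook so that it can run on a host; the TCIF bit positions for channels 1 to 3 are assumed from the usual four-flags-per-channel layout and should be checked against the reference manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed TC flag positions for channels 1..3. */
#define TCIF1 (1u << 1)
#define TCIF2 (1u << 5)
#define TCIF3 (1u << 9)

/* Result bits: which channels had not completed at timeout. */
#define CH1_PENDING (1u << 0)
#define CH2_PENDING (1u << 1)
#define CH3_PENDING (1u << 2)

typedef uint32_t (*read_isr_fn)(void);

/* Returns 0 if all three TC flags were seen within max_polls reads,
   otherwise a bitmask of the channels still pending on the last read. */
static uint32_t wait_all_tc(read_isr_fn read_isr, uint32_t max_polls)
{
    assert(read_isr != NULL);
    const uint32_t want = TCIF1 | TCIF2 | TCIF3;
    uint32_t isr = 0u;
    for (uint32_t i = 0u; i < max_polls; ++i) {
        isr = read_isr(); /* one read gives all three flags together */
        if ((isr & want) == want) {
            return 0u;
        }
    }
    uint32_t pending = 0u;
    if (!(isr & TCIF1)) { pending |= CH1_PENDING; }
    if (!(isr & TCIF2)) { pending |= CH2_PENDING; }
    if (!(isr & TCIF3)) { pending |= CH3_PENDING; }
    return pending;
}

/* Host-side stand-in for the flag register. */
static uint32_t fake_isr_value;
static uint32_t fake_read_isr(void) { return fake_isr_value; }
```

A non-zero return value can be stored or logged before entering the safe state, which answers the question of whether it is always the same channel that gets stuck.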

     

    stryga (Author)
    Explorer
    February 26, 2024

    Thank you for the input.

    Waiting for longer doesn't help. We had 2000 wait cycles for some time, with no difference. 
    The logging says that usually all 3 channels are stuck - strange. I have to double check how I log it. 
    Error handling means going to a safe state as long as we have no clear understanding of what happens here. Technically we could just go on, and the next ADC cycle has good chances to complete without error. Still, "cleverly ignoring" the error doesn't feel like a solution. 

    Graduate II
    February 26, 2024

     > "cleverly ignoring" the error doesn't feel like a solution.

    Haha, that's a good one, and I'm absolutely with you!