niklas2
Associate
July 11, 2017
Question

DMA+USART on STM32F407VG: TC Interrupt sometimes not triggered

Posted on July 11, 2017 at 17:53

Dear community,

I have run into a DMA-related issue while implementing an application for the STM32F407VG that receives data from a sensor via UART. The sensor sends a 162-byte data packet at 921600 baud every 10 ms, with pauses in between. Because the MCU is already quite busy, I want to use DMA. Since I have no way to stop or start the sensor's transmission, the MCU may start up after the sensor, and I want the whole system to be hot-pluggable, I have to find the beginning of each data packet. If I just enable the DMA for USART reception, configure it for 162 bytes and process the result in the transfer-complete interrupt, I might end up receiving the end of a packet that is currently being transmitted and the beginning of the next packet as one data block.

To solve this, I use the USART's IDLE interrupt to recognize the pause between two packets and start the new DMA transfer. I use the DMA completion interrupt to determine that a packet has been completely received (and stored in memory). If the TC interrupt has not triggered before the next IDLE interrupt, I assume the data packet to be too short and discard it.
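The bookkeeping described above can be sketched independently of the hardware. The following is only an illustration of the idea, not the code from the repository: the type and function names (rx_tracker_t, rx_on_idle, rx_on_dma_complete) are hypothetical, and in a real application the two callbacks would be called from the USART IDLE and DMA transfer-complete interrupt handlers.

```c
#include <stdbool.h>

/* Hypothetical names; the three states mirror the idea of
 * "nothing started yet", "transfer running" and "transfer complete". */
typedef enum {
    RX_WAIT_FIRST_IDLE,   /* no IDLE seen yet, no transfer started */
    RX_TRANSFER_RUNNING,  /* DMA armed at the last IDLE, not yet complete */
    RX_TRANSFER_COMPLETE  /* DMA reported transfer complete */
} rx_state_t;

typedef struct {
    rx_state_t state;
    unsigned accepted;    /* packets with TC before the next IDLE */
    unsigned discarded;   /* packets where IDLE arrived without TC */
} rx_tracker_t;

static void rx_tracker_init(rx_tracker_t *t)
{
    t->state = RX_WAIT_FIRST_IDLE;
    t->accepted = 0;
    t->discarded = 0;
}

/* To be called when the DMA signals transfer complete. */
static void rx_on_dma_complete(rx_tracker_t *t)
{
    if (t->state == RX_TRANSFER_RUNNING) {
        t->state = RX_TRANSFER_COMPLETE;
    }
}

/* To be called on every IDLE event. Returns true if the packet that
 * just ended was received completely; the caller would then re-arm
 * the DMA for the next packet. */
static bool rx_on_idle(rx_tracker_t *t)
{
    bool complete = (t->state == RX_TRANSFER_COMPLETE);

    if (complete) {
        t->accepted++;
    } else if (t->state == RX_TRANSFER_RUNNING) {
        t->discarded++;   /* IDLE without TC: packet too short */
    }
    t->state = RX_TRANSFER_RUNNING;
    return complete;
}
```

The very first IDLE only synchronizes to the packet boundary, so it counts neither as accepted nor as discarded.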

However, under certain conditions, sometimes (every few hundred packets), the DMA TC interrupt simply is not triggered. In the USART IDLE interrupt, I see that the DMA 'NDTR' register is zero (indicating a complete transfer), while 'LISR' is also zero (indicating that the TC interrupt is not pending). After resetting the DMA stream it works fine again for some time.

This behaviour seems to be influenced by the CPU load: I configured a timer to call an empty dummy ISR at a high frequency. The higher the frequency, the more TC interrupts go missing. This happens even though both the USART and the DMA interrupt have a higher priority than the timer interrupt.

Executing the code from RAM instead of flash results in many more missed interrupts too.

My suspicion is that this has something to do with a high (RAM) bus load, which would affect performance but should not lead to missing interrupts.

I have tried many variations but could not find a working combination. I have uploaded example code demonstrating the problem to https://github.com/Erlkoenig90/INSReceive. The interesting part is in the Src/main.c file (shortened):

static uint8_t rxBuffer [176] __attribute__ ((aligned (16)));
static DMA_Stream_TypeDef* const dmaStream = DMA1_Stream1;
static unsigned int state = 0;
static unsigned int printCounter = 0;

void USART3_IRQHandler (void) {
    if (USART3->SR & USART_SR_IDLE) {
        // Clear Interrupt via dummy read
        (void) USART3->DR;
        switch (state) {
            case 0:
                // First IDLE detected. Do nothing special.
                break;
            case 1:
                // IDLE has been detected without a DMA interrupt. This should not happen.
                printf ("Reception failed: NDTR = %lu, LISR = 0x%lx\n", dmaStream->NDTR, DMA1->LISR);
                printCounter = 0;
                break;
            case 2:
                // DMA Completion and IDLE has happened. A packet has been properly received.
                if (rxBuffer [0] == 0xFA && rxBuffer [160] == 0x27 && rxBuffer [161] == 0x10) {
                    if (printCounter == 99) {
                        puts ("Received 100 packets OK");
                        printCounter = 0;
                    } else {
                        ++printCounter;
                    }
                } else
                    puts ("Packet received, but is invalid");
                break;
        }
        state = 1;
        // Disable DMA stream properly
        dmaStream->CR = 0;
        while ((dmaStream->CR & DMA_SxCR_EN) != 0);
        // Clear Interrupt flags
        DMA1->LIFCR = DMA_LIFCR_CTCIF1 | DMA_LIFCR_CHTIF1 | DMA_LIFCR_CTEIF1 | DMA_LIFCR_CDMEIF1 | DMA_LIFCR_CFEIF1;
        // Make sure buffer is correctly aligned
        uint32_t mptr = (uint32_t) rxBuffer;
        assert_param (mptr % 16 == 0);
        // (Re-)Initialize DMA
        dmaStream->PAR = (uint32_t) (&USART3->DR);
        dmaStream->M0AR = mptr;
        dmaStream->NDTR = 162;
        dmaStream->FCR = DMA_SxFCR_DMDIS;
        dmaStream->CR = DMA_SxCR_CHSEL_2 | DMA_SxCR_PL_0 | DMA_SxCR_MSIZE_1 | DMA_SxCR_MINC | DMA_SxCR_EN | DMA_SxCR_TCIE;
        USART3->CR3 = USART_CR3_DMAR;
    }
}

void DMA1_Stream1_IRQHandler (void) {
    if (DMA1->LISR & DMA_LISR_TCIF1)
        state = 2;
}

// Dummy Timer ISR to simulate high workload
void TIM8_UP_TIM13_IRQHandler (void) {
    if (TIM13->SR & TIM_SR_UIF) {
        TIM13->SR = ~TIM_SR_UIF;
        __NOP ();
    }
}

int main(void) {
    // ... The usual initialization ...
    puts ("Application startup");
    // Configure interrupts
    HAL_NVIC_SetPriority (TIM8_UP_TIM13_IRQn, 1, 1);
    HAL_NVIC_EnableIRQ (TIM8_UP_TIM13_IRQn);
    HAL_NVIC_SetPriority (USART3_IRQn, 0, 1);
    HAL_NVIC_EnableIRQ (USART3_IRQn);
    HAL_NVIC_SetPriority (DMA1_Stream1_IRQn, 0, 0);
    HAL_NVIC_EnableIRQ (DMA1_Stream1_IRQn);
    // Enable peripheral clocks
    RCC->APB1ENR |= RCC_APB1ENR_TIM13EN;
    RCC->AHB1ENR |= RCC_AHB1ENR_DMA1EN;
    RCC->APB1ENR |= RCC_APB1ENR_USART3EN;
    // Initialize TIM13 to call the interrupt at 50kHz, which simulates some dummy load
    TIM13->PSC = 83;
    TIM13->DIER = TIM_DIER_UIE;
    TIM13->CR1 = 0;
    TIM13->SR = ~TIM_SR_UIF;
    TIM13->ARR = 19;
    TIM13->CR1 = TIM_CR1_URS;
    TIM13->EGR = TIM_EGR_UG;
    TIM13->CR1 = TIM_CR1_CEN;
    DBGMCU->APB1FZ |= DBGMCU_APB1_FZ_DBG_TIM13_STOP;
    // Initialize USART3 for reception
    USART3->BRR = 46; // 921600 Baud.
    USART3->CR1 = USART_CR1_UE | USART_CR1_RE | USART_CR1_IDLEIE; // Only enable IDLE interrupt
    while (1) {
        __WFI ();
    }
}

An example output is:

Application startup
Received 100 packets OK
Received 100 packets OK
Received 100 packets OK
Reception failed: NDTR = 0, LISR = 0x0
Received 100 packets OK
Received 100 packets OK
Received 100 packets OK
Reception failed: NDTR = 0, LISR = 0x0
Reception failed: NDTR = 0, LISR = 0x0
Received 100 packets OK
Received 100 packets OK

The output is different each time the code is run. The problem also occurs when I remove the (slow) printf statements. The whole thing seems to be quite erratic and elusive...

Does anyone have an idea as to what I am doing wrong, or maybe a workaround that still allows robust operation when the sensor and MCU are randomly hot-plugged?

Thank you very much in advance!

#interrupt #issue #stm32f4 #dma #usart

3 replies

waclawek.jan
Super User
July 11, 2017
Posted on July 12, 2017 at 00:18

I don't have an explanation for the behaviour you are experiencing, but I wonder how come it doesn't choke forever in DMA1_Stream1_IRQHandler(), as you don't clear the interrupt-triggering flag there and this is the highest-priority interrupt...?

JW

niklas2 (Author)
Associate
July 12, 2017
Posted on July 12, 2017 at 01:29

Thanks for the reply! Good point, I don't know. Changing the ISR to

void DMA1_Stream1_IRQHandler (void) {
    if (DMA1->LISR & DMA_LISR_TCIF1) {
        state = 2;
        // Disable DMA stream properly
        dmaStream->CR = 0;
        while ((dmaStream->CR & DMA_SxCR_EN) != 0);
        // Clear Interrupt flags
        DMA1->LIFCR = DMA_LIFCR_CTCIF1 | DMA_LIFCR_CHTIF1 | DMA_LIFCR_CTEIF1 | DMA_LIFCR_CDMEIF1 | DMA_LIFCR_CFEIF1;
    }
}

doesn't change the behaviour, though.

I also tried removing the DMA ISR (and the DMA_SxCR_TCIE flag) altogether and just checking for DMA_LISR_TCIF1 in the USART ISR. That doesn't help either: it works most of the time, but sometimes TCIF1 just stays at 0 even though NDTR is 0 as well.

waclawek.jan
Super User
July 17, 2017
Posted on July 17, 2017 at 13:18

Good point, I don't know.

Then revert and find out. When a substantial question arises, changing code in the hope that things get resolved magically usually won't help, or worse, results in 'works for me' kinds of solutions.

If the code is not stuck in that ISR, then the ISRs don't have the intended priorities, or the ISR in question is not fired at all for some reason; or there is some other code with higher polling/nesting priority which clears/resets the DMA (a fault handler, perhaps, or some zealous debugger), or something else I can't guess.

Cut the code to the bare minimum; avoid libraries; check all relevant registers by reading them back; use pin toggles and a logic analyzer to follow the actual code flow.

JW

waclawek.jan
Super User
July 17, 2017
Posted on July 17, 2017 at 14:54

Humm.

It would never occur to me to use the FIFO with a transfer size that is not a multiple of the memory-side data width (you have 162 bytes, i.e. 2 outstanding bytes). This sounds much like a silicon bug, but proving that would need a cleaner, self-contained example...

Meantime you might want to ask for support through official channels - distri/FAE, web contact form.

JW

niklas2 (Author)
Associate
July 17, 2017
Posted on July 17, 2017 at 21:54

Oh. You're right. According to p. 315 of the reference manual, transfer sizes must be a multiple of MSIZE. I configured my sensor to send 164 bytes instead of 162 and everything works fine even with FIFO enabled. Kind of evil that problems occur only rarely if that condition is not satisfied.
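A check along these lines could catch such a mismatch before the stream is enabled. This is only a sketch: the helper names are hypothetical and not part of any ST library, and unit_bytes stands for the memory-side data width in bytes (4 when MSIZE selects 32-bit words, as in the code posted above).

```c
#include <stdbool.h>
#include <stdint.h>

/* True if n_bytes can be moved in whole units of unit_bytes. */
static bool dma_length_fits(uint32_t n_bytes, uint32_t unit_bytes)
{
    return unit_bytes != 0u && (n_bytes % unit_bytes) == 0u;
}

/* Smallest multiple of unit_bytes that is >= n_bytes
 * (unit_bytes must be non-zero). */
static uint32_t dma_length_round_up(uint32_t n_bytes, uint32_t unit_bytes)
{
    return ((n_bytes + unit_bytes - 1u) / unit_bytes) * unit_bytes;
}
```

With the numbers from this thread, 162 bytes at a 4-byte memory width leaves 2 bytes over, while 164 bytes divides evenly.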

Anyways, thanks a lot, I wasted way too much time on this...

waclawek.jan
Super User
July 17, 2017
Posted on July 17, 2017 at 22:15

Please try to set the FIFO threshold to 1/2 (i.e. 2 words = 8 bytes), while still having 164 bytes (i.e. not-an-integer-multiple-of-8) to transfer.

I am willing to bet that the problem reoccurs.

JW

Vangelis Fortounas
Associate II
July 22, 2017
Posted on July 22, 2017 at 21:22

Hello!!

The whole situation 'smells' like a framing error.

Did you check for this?

At 921.6 kbaud with fPCLK = 8 MHz, the actual speed is 888.88 kbaud.

Add to that the 10 ppm from ordinary crystals plus the jitter of a high-speed main PLL.

Take a look at  RM  page 984, ....

Tesla DeLorean
Guru
July 22, 2017
Posted on July 22, 2017 at 21:54

The APB clock in question is 42 MHz, so the serial clock error isn't nearly that bad.
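The clock argument can be checked with a few lines of arithmetic. The sketch below assumes 16x oversampling (OVER8 = 0), where the baud rate is fPCLK divided by the value written to BRR; the function names are hypothetical and only serve as an illustration.

```c
#include <stdint.h>

/* Nearest BRR value for 16x oversampling: fPCLK / baud, rounded. */
static uint32_t usart_brr(uint32_t pclk_hz, uint32_t baud)
{
    return (pclk_hz + baud / 2u) / baud;
}

/* Baud rate actually generated for a given BRR value (integer part). */
static uint32_t usart_actual_baud(uint32_t pclk_hz, uint32_t brr)
{
    return pclk_hz / brr;
}

/* Signed deviation from the target in parts per thousand,
 * truncated toward zero. */
static int32_t baud_error_permille(uint32_t actual, uint32_t target)
{
    int64_t diff = (int64_t)actual - (int64_t)target;
    return (int32_t)(diff * 1000 / (int64_t)target);
}
```

For a 42 MHz APB clock this gives BRR = 46 (the value used in the posted code) and about 913043 baud, roughly 0.9 % below 921600. For an 8 MHz clock it gives BRR = 9 and about 888888 baud, roughly 3.5 % low.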

waclawek.jan
Super User
July 24, 2017
Posted on July 24, 2017 at 11:38

... and it really does not matter for DMA...

JW

(PS: in my test, as it was a loopback within the same USART, the actual baud rate is not in the least relevant, so FE can't occur.)