Super User
September 27, 2018
Question

[Synopsys OTG] FIFO clash


So where exactly does the documentation say that different FIFOs can't be accessed simultaneously?

In device mode, an IN endpoint was configured to transmit one 16-byte packet, and then 16 bytes (i.e. 4 words: 72 43 70 A4 D9 99 55 2E 19 7F 32 37 13 EB B6 7E) were being written to the endpoint's Tx FIFO in "main" (instrumentation and logging confirmed that all 4 words were indeed written to the FIFO), when the USB interrupt kicked in and read from the Rx FIFO. The result on the bus was:

[Image: bus analyzer capture, 0690X000006C5xoQAC.png]

So what do we see here? The USB core decided to transmit the first 5 bytes, 72 43 70 A4 D9, and appended the CRC calculated from the first 4 of those 5 bytes. Naturally, the host refused to ACK this. The core retried 2 more times and then apparently gave up, inserting a zero-length DATA0 packet at 475 607 517 ns (not expanded in that view, sorry). Then the USB core skipped 3 bytes (I'd say that in the first case it took 2 words but used only one byte of the second word) and then transmitted one byte, with the CRC of 0 bytes (i.e. 0000). Rinse and repeat. Then the core threw the transfer complete interrupt and everything proceeded as usual.

I've seen similar patterns with the data packet containing 13 bytes, with CRC of the first 12 of them.
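To check a capture like this one by hand, it can help to compute the data-packet CRC over every prefix of the intended payload and see which prefix the CRC on the wire belongs to. The following is a minimal sketch, assuming the standard USB CRC16 parameters (polynomial x^16 + x^15 + x^2 + 1, bits processed LSB first, initial value 0xFFFF, result inverted). The names `usb_crc16` and `crc_matching_prefix` are made up for illustration, and the byte order in which an analyzer displays the CRC may differ from the value returned here.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC16 over a USB data payload: reflected polynomial constant 0xA001,
 * initial value 0xFFFF, final result inverted. */
uint16_t usb_crc16(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (uint16_t)((crc >> 1) ^ 0xA001u);
            else
                crc = (uint16_t)(crc >> 1);
        }
    }
    return (uint16_t)~crc;
}

/* Hypothetical helper: returns the length of the shortest prefix of `data`
 * whose CRC equals `wire_crc`, or -1 if no prefix (0..len bytes) matches. */
int crc_matching_prefix(const uint8_t *data, size_t len, uint16_t wire_crc)
{
    for (size_t n = 0; n <= len; n++) {
        if (usb_crc16(data, n) == wire_crc)
            return (int)n;
    }
    return -1;
}
```

With these parameters the CRC over zero bytes comes out as 0000, which is consistent with the one-byte packets described above carrying the CRC of an empty payload.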

Preventing the USB interrupt from kicking in solved the problem entirely; selectively disabling the particular interrupt that results in reading from the Rx FIFO (and not running any other IN endpoint, i.e. not writing to some other Tx FIFO) also helped.

The "factory" firmware fills FIFOs only in the interrupt, so it won't be hit by this. This is nice, but IMO it works more by accident than by deliberate design, given that the documentation doesn't appear to talk about conflicting FIFO accesses. In fact, there's nothing that would indicate that the individual endpoints are not completely independent of each other.

I don't say this wasn't fun to find, but I am supposed to work and not have fun the whole day.

JW

    This topic has been closed for replies.

    6 replies

    Explorer II
    January 22, 2025

    @waclawek.jan wrote:

    Preventing the USB interrupt from kicking in solved the problem entirely; selectively disabling the particular interrupt that results in reading from the Rx FIFO (and not running any other IN endpoint, i.e. not writing to some other Tx FIFO) also helped.



    Can you expand a bit on this? I am debugging an issue where we are putting 6 bytes into TXFIFO and I see on the wire 5 bytes with a bad CRC.

    Super User
    January 22, 2025

    I was writing TxFIFO in "main" (i.e. low interrupt priority context) and reading RxFIFO in the USB interrupt context.

    What happened was that, with the TxFIFO only partially written from "main", the USB interrupt kicked in with Rx data available and I read them out; this resulted in the faulty pattern (CRC) when the partially written data were subsequently transmitted from the TxFIFO onto the bus.

    My solution was to disable the interrupts that cause the RxFIFO to be read while writing the TxFIFO.
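    A minimal sketch of that workaround follows. It is written so that it compiles on a host: `irq_save_and_disable`, `irq_restore`, `mock_primask` and `mock_fifo` are stand-ins invented for this example, not part of any vendor library. On a Cortex-M target the two IRQ helpers would typically map to the CMSIS intrinsics `__get_PRIMASK()`, `__disable_irq()` and `__set_PRIMASK()`, and the store into `mock_fifo` would be a write to the endpoint's FIFO address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_FIFO_WORDS 16u

/* Host-side stand-ins (assumptions for this sketch only). */
uint32_t mock_primask;                 /* 1 = interrupts masked */
uint32_t mock_fifo[MOCK_FIFO_WORDS];   /* pretend Tx FIFO       */
size_t   mock_fifo_count;

static uint32_t irq_save_and_disable(void)
{
    uint32_t old = mock_primask;
    mock_primask = 1;
    return old;
}

static void irq_restore(uint32_t old)
{
    mock_primask = old;
}

/* Write a whole packet to the Tx FIFO with no interrupt in between.
 * The last word is zero-padded when len is not a multiple of 4.
 * Returns the number of words written, or 0 if the packet does not fit. */
size_t txfifo_write_atomic(const uint8_t *data, size_t len)
{
    size_t words = (len + 3u) / 4u;

    if (words > MOCK_FIFO_WORDS)
        return 0;

    uint32_t saved = irq_save_and_disable();

    mock_fifo_count = 0;
    for (size_t i = 0; i < len; i += 4) {
        uint32_t word = 0;
        size_t n = (len - i < 4u) ? (len - i) : 4u;

        memcpy(&word, data + i, n);    /* avoids unaligned access */
        mock_fifo[mock_fifo_count++] = word;
    }

    irq_restore(saved);
    return words;
}
```

    Saving and restoring the previous mask state, rather than unconditionally re-enabling, keeps the function safe to call from code that already runs with interrupts disabled.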

    Another possible solution (which may be the one used by Cube/HAL, I don't know, I don't use Cube/HAL) is to access all FIFOs only within the USB interrupt context; in that way each individual FIFO is always written/read entirely, with no interruption from other FIFO accesses.
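    The "FIFOs only from the USB interrupt" approach could be sketched as below: "main" only copies the packet into RAM and sets a flag, and the FIFO itself is touched solely by a routine called from the USB ISR. All names (`ep_in_submit`, `usb_isr_service_tx`, `mock_txfifo`) are hypothetical and the FIFO is a plain array so that the example runs on a host. This is not taken from Cube/HAL. On real hardware something must also cause the ISR to run after a submit (for example an unmasked Tx-FIFO-empty interrupt), which this sketch leaves out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TX_MAX 64u

static uint8_t       tx_buf[TX_MAX];   /* staging buffer in RAM        */
static size_t        tx_len;
static volatile bool tx_pending;       /* set by main, cleared by ISR  */

uint32_t mock_txfifo[TX_MAX / 4u];     /* pretend Tx FIFO (assumption) */
size_t   mock_txfifo_words;

/* Called from "main": touches RAM only, never the FIFO. */
bool ep_in_submit(const uint8_t *data, size_t len)
{
    if (tx_pending || len > TX_MAX)
        return false;

    memcpy(tx_buf, data, len);
    tx_len = len;
    tx_pending = true;                 /* publish after the data is in place */
    return true;
}

/* Called from the USB ISR only: the single place that writes the FIFO,
 * so a write sequence is never interleaved with other FIFO accesses. */
void usb_isr_service_tx(void)
{
    if (!tx_pending)
        return;

    mock_txfifo_words = 0;
    for (size_t i = 0; i < tx_len; i += 4) {
        uint32_t word = 0;
        size_t n = (tx_len - i < 4u) ? (tx_len - i) : 4u;

        memcpy(&word, &tx_buf[i], n);
        mock_txfifo[mock_txfifo_words++] = word;
    }
    tx_pending = false;
}
```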

    This issue may be a consequence of the erratum "Transmit data FIFO is corrupted when a write sequence to the FIFO is interrupted with accesses to certain OTG_FS registers", which appeared in the 'F407 errata on 21-Feb-2023 (note the date of my original post above). That is a slightly different, albeit related, cause; nonetheless the corollary is the same.

    Note that what I wrote above pertains to the OTG in 'F4xx (I used both 'F407/427 and 'F446 and I don't remember which one was the one which threw me). The OTG cores in individual STM32 models differ in version, and thus in the details of their behaviour too.

    Note, that I am not ST (although that's probably obvious).

    JW

     

    Explorer II
    January 22, 2025

    Thanks for the reply. I am using F723. I am writing to the tx fifo in main and believe the SOF interrupt may be the culprit. I am writing two 32 bit words with 6 bytes of data and two bytes of padding. I am seeing 5 bytes and a bad CRC. Sounds very much the same issue. As soon as I wrapped the two FIFO word writes in __disable_irq() and __enable_irq() the problem disappeared.

    Super User
    January 22, 2025

    SOF sounds to me like an improbable candidate, but of course it depends on details.

    Do you access any of the endpoint registers in the SOF interrupt?

    You can try, instead of global interrupt disable/enable, to selectively disable only the SOF interrupt in GINTMSK (and perhaps reading it back just to be sure it's disabled before accessing FIFO) during the FIFO writes; if the problem is removed solely by this step, then it's indeed likely to be SOF-related.
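    That experiment could look roughly like the sketch below. The register is passed in as a pointer so the example runs on a host against a plain variable (`mock_gintmsk`); on target it would be the OTG core's GINTMSK. The SOF mask bit position used here (bit 3) is an assumption to be verified against the reference manual for the part in use, and the function names are invented for illustration.

```c
#include <stdint.h>

/* Assumed position of the SOF mask bit in GINTMSK; verify for your part. */
#define GINTMSK_SOFM (1u << 3)

/* Host-side stand-in for the real register (assumption for this sketch). */
volatile uint32_t mock_gintmsk;

/* Mask only the SOF interrupt. Returns the previous state of the bit so
 * the caller can restore it after the FIFO writes. */
uint32_t sof_irq_mask(volatile uint32_t *gintmsk)
{
    uint32_t was = *gintmsk & GINTMSK_SOFM;

    *gintmsk &= ~GINTMSK_SOFM;
    (void)*gintmsk;    /* read back so the write has landed before the FIFO access */
    return was;
}

/* Put the SOF mask bit back to what sof_irq_mask() found. */
void sof_irq_restore(volatile uint32_t *gintmsk, uint32_t was)
{
    *gintmsk |= was;
}
```

    Usage would be `was = sof_irq_mask(...)`, then the FIFO word writes, then `sof_irq_restore(..., was)`. Since this is a read-modify-write of GINTMSK from "main", it presumably should not be combined with an ISR that changes GINTMSK itself.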

    JW

    Graduate
    January 22, 2025

    The rule of thumb is: Do NOT do anything in "main". ;)

    Explorer II
    January 22, 2025

    Not sure what is causing it yet but it seems to line up with SOF. We aren't doing work based on an interrupt currently.

    Super User
    January 22, 2025

    > Is there even an ISR for IN Token on STM32F7?

    The Synopsys OTG aims at more hardware support than other USB modules traditionally in the microcontrollers. In other words, the core tries to hide the physical transactions (such as tokens) from you and provide a "smooth" buffered data-stream experience. With more or less success that is...

    JW

    Explorer II
    March 22, 2025

    I am still battling with this. I don't quite 100% understand what is going on, but it seems like this erratum is worse than I first understood. I am seeing corrupt packets because of the EP1 and EP0 FIFOs clashing. The USB stack may put something into the EP0 FIFO in anticipation of, say, a string descriptor that Windows may only ask for a lot later. It seems almost like you would need to lock access across endpoints until a transfer is complete. Only can have one endpoint active at a time?

    Even if all the calls are in the same ISR: once the FIFO has been written, isn't it possible that something happens to it during a subsequent register access before it is transmitted? For example, while the data sits in the buffer waiting for the IN token.

    Super User
    March 23, 2025

    Only can have one endpoint active at a time?

    This can actually make sense, because USB is serial and everything is serialized on two wires: RX or TX, for any endpoint. Nothing actually runs in parallel.

    Explorer II
    March 23, 2025

    This can actually make sense, because USB is serial and everything is serialized on two wires: RX or TX, for any endpoint. Nothing actually runs in parallel.

    This is conceptually different from serializing the FIFO access though, isn't it? I don't see STM doing so in their codebase. Are you saying that the STM32 serializes different endpoints between FIFO write and transfer complete?

    Explorer II
    March 23, 2025

    I rewrote that area of the firmware so that the TxFIFO write for EP1 is in the same IRQ handler and there is no re-entry. Still the data in the EP0 TxFIFO is corrupted! Do I need to serialize the endpoint FIFO writes? Seems nuts. Maybe either of you has an idea, or is this erratum worse than declared?