More USB Shenanigans: The Futility of EPENA.
This is a long one, so sit tight around the campfire, and let me tell you the tale of The Futility of EPENA.
Several days ago, I began to implement a Mass Storage interface for my own USB stack, equipped with a Bulk IN and a Bulk OUT endpoint. The first thing I did was wait for the OTEPDIS interrupt ("OUT token received when endpoint disabled"); this lets me know that the host wants to send me a love letter, that is, a Command Block Wrapper. This OTEPDIS interrupt is a little strange, however: RM0351 (Rev. 10) states that it only applies to Control OUT endpoints, yet it nonetheless seems to take effect for non-Control EPs (at least on the STM32L476RG).
Anyways, on this interrupt, I then configure the Bulk OUT endpoint to receive a single 8-byte packet in order to get the first 8 bytes of the 31-byte CBW. Of course, I could just have a 64-byte EP so the CBW could be received all in one go, but I wanted to ensure my USB stack was flexible enough to handle multi-packet transfers. After configuring OTG_DOEPTSIZx with the packet count (PKTCNT of 1) and byte count (XFRSIZ of 8), I set CNAK and EPENA for the Bulk OUT endpoint in OTG_DOEPCTLx.
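A minimal sketch of that arming sequence, for anyone following along. The bit positions are taken from the OTG_DOEPTSIZx and OTG_DOEPCTLx register descriptions in the RM; the struct and the helper name are hypothetical, and the registers are modeled as a plain struct so the snippet also compiles off-target.

```c
#include <stdint.h>

/* Field layout per the RM's register descriptions. */
#define DOEPTSIZ_XFRSIZ_Pos  0u          /* XFRSIZ: bits 18:0  */
#define DOEPTSIZ_XFRSIZ_Msk  0x7FFFFu
#define DOEPTSIZ_PKTCNT_Pos  19u         /* PKTCNT: bits 28:19 */
#define DOEPTSIZ_PKTCNT_Msk  0x3FFu
#define DOEPCTL_CNAK         (1u << 26)  /* clear NAK          */
#define DOEPCTL_EPENA        (1u << 31)  /* endpoint enable    */

/* Hypothetical stand-in for one OUT endpoint's register pair.
 * On real hardware these would be the memory-mapped registers. */
typedef struct {
    volatile uint32_t DOEPCTL;
    volatile uint32_t DOEPTSIZ;
} out_ep_regs_t;

/* Program the transfer size first, then clear NAK and enable the endpoint.
 * Writing DOEPTSIZ as a whole zeroes its remaining fields, which is
 * assumed to be acceptable for a Bulk OUT endpoint. */
static void out_ep_arm(out_ep_regs_t *ep, uint32_t pktcnt, uint32_t xfrsiz)
{
    ep->DOEPTSIZ = ((pktcnt & DOEPTSIZ_PKTCNT_Msk) << DOEPTSIZ_PKTCNT_Pos)
                 | ((xfrsiz & DOEPTSIZ_XFRSIZ_Msk) << DOEPTSIZ_XFRSIZ_Pos);
    ep->DOEPCTL |= DOEPCTL_CNAK | DOEPCTL_EPENA;
}
```

For the first chunk of the CBW, the call would be out_ep_arm(ep, 1, 8): one packet, eight bytes.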
Shortly thereafter, RXFLVL gets asserted, indicating there's a packet available to read, and it's indeed the 8-byte packet for the Bulk OUT endpoint as expected. Repeat this process a couple more times and I end up with the whole 31-byte CBW. It was all smooth sailing, and I was ready to clock out for the day, until I realized... I forgot to actually re-enable the Bulk OUT endpoint after the first packet!
I became dumbfounded -- struck by a tempest of confusion! How is it possible for the whole transfer of 31 bytes to be done when the endpoint would've become disabled after receiving the first packet?
Well, after some consideration, I realized what should've happened is that the "OUT data transfer completed" pattern gets pushed into the RX-FIFO after reading in the first 8-byte packet. What ended up happening in reality was a series of "Data OUT packet" patterns (altogether making up the 31-byte CBW) and then finally the actual "OUT data transfer completed" word. But I only ever configured the endpoint for a single 8-byte packet, not this whole parade!
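To tell those patterns apart, each status word popped from the RX-FIFO (via OTG_GRXSTSP) has to be decoded. Here is a rough sketch of that decoding, assuming the device-mode field layout and PKTSTS encodings listed in the RM; the type and function names are made up for illustration.

```c
#include <stdint.h>

/* PKTSTS encodings in device mode, per the RM. */
enum {
    PKTSTS_GLOBAL_OUT_NAK = 1,  /* global OUT NAK                */
    PKTSTS_OUT_DATA       = 2,  /* OUT data packet received      */
    PKTSTS_OUT_XFER_DONE  = 3,  /* OUT data transfer completed   */
    PKTSTS_SETUP_DONE     = 4,  /* SETUP transaction completed   */
    PKTSTS_SETUP_DATA     = 6   /* SETUP data packet received    */
};

/* Hypothetical decoded view of one receive status word. */
typedef struct {
    uint8_t  epnum;   /* endpoint the entry belongs to      */
    uint16_t bcnt;    /* byte count of the received packet  */
    uint8_t  pktsts;  /* one of the PKTSTS_* values above   */
} rx_status_t;

static rx_status_t decode_rx_status(uint32_t grxstsp)
{
    rx_status_t s;
    s.epnum  = (uint8_t)(grxstsp & 0xFu);             /* EPNUM:  bits 3:0   */
    s.bcnt   = (uint16_t)((grxstsp >> 4) & 0x7FFu);   /* BCNT:   bits 14:4  */
    s.pktsts = (uint8_t)((grxstsp >> 17) & 0xFu);     /* PKTSTS: bits 20:17 */
    return s;
}
```

In the scenario described here, the handler would see several PKTSTS_OUT_DATA entries for the Bulk OUT endpoint before a single PKTSTS_OUT_XFER_DONE entry shows up.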
So I put my thinking cap on: okay, so what exactly are the values of PKTCNT and XFRSIZ of the OTG_DOEPTSIZx register after reading in each packet? Well, after the very first 8-byte packet, PKTCNT and XFRSIZ became zero (as expected), but according to the RM (pg. 1793):
The OUT data transfer completed pattern for an OUT endpoint is written to the receive FIFO on one of the following conditions:
– The transfer size is 0 and the packet count is 0
– The last OUT data packet written to the receive FIFO is a short packet (0 ≤ packet size < maximum packet size).
So the very next thing in the RX-FIFO should be the "OUT data transfer completed" pattern, right? Well, as I already said, this isn't the case; the Bulk OUT endpoint continues on receiving more data from the host and it is only at the last packet (a short packet in fact) that the pattern actually gets pushed into the RX-FIFO. As for what happens to the PKTCNT and XFRSIZ fields, PKTCNT stays at 0 while XFRSIZ underflows.
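The mismatch is easier to see with the RM's two conditions written out as a predicate. This is only an illustration of the quoted rule and of how a 31-byte CBW splits into 8-byte packets; it does not model the hardware, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* The two completion conditions as quoted from the RM:
 * counters exhausted, or the last packet was a short packet. */
static bool rm_expects_completion(uint32_t xfrsiz, uint32_t pktcnt,
                                  uint32_t last_pkt_size, uint32_t max_pkt_size)
{
    bool counters_exhausted = (xfrsiz == 0u) && (pktcnt == 0u);
    bool short_packet       = last_pkt_size < max_pkt_size;
    return counters_exhausted || short_packet;
}

/* Size of the n-th packet (0-based) when `total` bytes are sent in
 * chunks of at most `max_pkt_size` bytes; 0 once the data is exhausted. */
static uint32_t packet_size(uint32_t total, uint32_t max_pkt_size, uint32_t n)
{
    uint32_t offset = n * max_pkt_size;
    if (offset >= total) {
        return 0u;
    }
    uint32_t left = total - offset;
    return (left < max_pkt_size) ? left : max_pkt_size;
}
```

With a total of 31 bytes and a maximum packet size of 8, the packets come out as 8, 8, 8 and 7 bytes. By the RM's wording, completion is already expected after the first packet (XFRSIZ and PKTCNT both zero), whereas the observed behavior matches only the short-packet condition, which is first met by the 7-byte fourth packet.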
Of course, I could be wrong. I've never really run into anything that'd straight up contradict the RM like this -- usually it's some awkward wording that was a little ambiguous -- but from what I'm witnessing, this is not what is happening at all in my own painstakingly handwritten code!
To make matters worse, I decided to do something strange: what if I never configure and enable the Bulk OUT endpoint for the transfer at all? Would I still receive data packets in the RX-FIFO? As it turns out: yes!
So what gives?
After some tinkering, I came to the conclusion that the whole shebang of configuring OTG_DOEPTSIZx and related registers for OUT transfers is complete baloney(?). The only actually important thing that needs to be done is clearing the NAK status of the OUT endpoint via CNAK. In other words, EPENA doesn't seem to be what determines whether or not the OUT endpoint receives packets.
I verified this by replacing all procedure calls that would've configured and enabled the OUT endpoint for transfers with a single line that just sets the CNAK bit, and things seem to run perfectly fine. No issues with enumeration or the CDC side of things. So it seems like the whole operational procedure of configuring OUT transfers can just be skipped...
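As a sketch of what such a replacement could look like (not the exact line from the stack described here), assuming CNAK is bit 26 of OTG_DOEPCTLx as documented in the RM; the helper name is hypothetical:

```c
#include <stdint.h>

#define DOEPCTL_CNAK (1u << 26)  /* clear-NAK bit of OTG_DOEPCTLx */

/* Clear the endpoint's NAK status without touching OTG_DOEPTSIZx
 * and without setting EPENA. `doepctl` points at the endpoint's
 * control register (or at a plain variable when testing off-target). */
static void out_ep_clear_nak(volatile uint32_t *doepctl)
{
    *doepctl |= DOEPCTL_CNAK;
}
```

Note that this only reflects the behavior observed on the STM32L476RG; the RM's documented procedure still sets the transfer size and EPENA.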
Now it should be noted that this is just what I'm observing on the STM32L476RG. I don't have my hands on other MCUs with the USB OTG FS core, but I'm confident that things will be different due to different configurations of the Synopsys IP core, and that even varies within the same line of MCUs. This probably explains why the RM is the way it is: it's just trying to cover all bases of the weird quirks of the USB OTG FS core.
Regardless, I thought I'd share these findings for any masochists out there who want to write their own stack. If I continue working on the BOT interface and find out there's an important detail I'm misunderstanding (like the EPENA bit actually being important in some very specific edge case), then I'll make an update post with that information.
