Skip to main content
Graduate
August 30, 2022
Question

32F417: CDC_Receive_FS() or USBD_CDC_ReceivePacket() - any known bugs?

  • August 30, 2022
  • 23 replies
  • 4562 views

I am using USB for both MSC (removable drive) and CDC (virtual COM port).

MSC works perfectly.

CDC out has been working perfectly for ~2 years.

CDC in had never really been tested, and ~ 1 byte every 20 is corrupted.

I am running CDC to Teraterm on the PC, and the out mode has been used for debugs.

The in mode gets activated of you press keys on the keyboard.

I have a simple RTOS task which copies in data to output. That is where I see the corruption.

It is entirely fixable by disabling certain RTOS tasks! But the ones which "fix" it don't do anything related to USB. I am thus suspecting there is a critical region which is unprotected and RTOS related task switching messes it up.

These functions should be well known because it is generated by Cube MX. I don't use MX (the code generated tends to work but is bloated junk and often works by luck because the cloated code covers up timing issues) but it was used when this project was being created ~ 3 years ago by someone else.

    This topic has been closed for replies.

    23 replies

    PHolt.1Author
    Graduate
    September 1, 2022

    Tracing the TX code it indeed looks like the TX path just sets up some memory values, and does not do any obvious hardware interaction, so something is monitoring those values (struct members) and kicks off the TX.

    It is like ETH, where the 32F4 ETH controller is continually monitoring some pointers in a linked list of packets, to see if any new ones have been left there. I spent a whole load of time there too but that now seems to be all working fine. (Actually I have some questions on that which I will post separately).

    The TX path is

    CDC_Transmit_FS

    USBD_CDC_TransmitPacket

    USBD_LL_Transmit

    HAL_PCD_EP_Transmit

    and in there, just some hpcd-> value gets set

    0693W00000SuIhlQAF.png 

    so "something else" is monitoring the memory. And presumably enabling the interrupt you mention.

    I would happily test out suggestions for any low hanging fruit but I am out of ideas : - )

    I doubt the RTOS is at all relevant, except to the extent of exposing an issue via timing being "just right".

    Super User
    September 1, 2022

    As I've said, the lowest I see hanging is USB_ReadPacket().

    JW

    PHolt.1Author
    Graduate
    September 5, 2022

    It is hard to debug this one because MSC shares half the functions in the function stack with CDC, and windows is polling the MSC device at about 1Hz.

    If there was a way to disable the MSC polling then I could trace the CDC data through.

    Super User
    September 5, 2022

    Unfortunately no way to suspend poling in Windows. For this you need your own host which you can modify (Linux or another MCU with USB). Try to make a faster trace, to ITM or just write to RAM buffer.

    Graduate II
    September 5, 2022

    Even a single fact that the ST's USB stack does everything in interrupt context is enough for me to never use it. And the malloc() call is the best of it all! Knowing all of it and the great ST's code quality and team's competence level, when I needed an USB stack, I just went strait to TinyUSB. Integrating it in my project was a breeze and it works reliably. Of course, I cannot stop you from enjoying the fun of fighting against another broken bloatware from ST, but just in case:

    https://github.com/hathach/tinyusb/tree/master/examples/device/cdc_msc_freertos

    PHolt.1Author
    Graduate
    September 6, 2022

    I don't have any heap usage (stupid idea in embedded work) in my project, except

    • optional features where a block is allocated and stays allocated until power-down
    • MbedTLS runs a private heap within a static block which you give it (this is unavoidable)

    I don't know why TinyUSB was not used for this project when it was started (not by me) but probably because it was started c. 2018 and TinyUSB wasn't around then, looking at the project dates. The guy who did it used the Cube MX "code generator" to generate the skeleton code. Like many others, he expected ST code to work!

    MSC is rock solid and fast. It is just CDC "out" (PC to target) where I have a problem.

    I will try ITM debug. Should be fast enough; use it already. Thanks. Problem is I don't really understand where to look. The structure (no pun intended) of that stack is right on the limit of my C competence. ST just love a struct for everything. Probably packet buffers in those two IRQ handlers.

    PHolt.1Author
    Graduate
    September 6, 2022

    The 1st keystroke, always corrupted in CDC_get_rx_bytes, is delivered correctly into fred1234[0] here. I trigger the breakpoint, line 984, with "3" on the terminal, and this enables it to isolate the issue

    0693W00000Sub4xQAB.png 

    This is CDC_get_rx_bytes:

    0693W00000Sub65QAB.png 

    It is all the stuff in between where I struggle to find the data. Various buffers get passed by address.

    0693W00000SubLKQAZ.png 

    AFAICT the data (the byte, and count=1) is right oin here (called from an ISR) and USBD_CDC_DataOut returns the right data also.

    0693W00000SubV5QAJ.png

    PHolt.1Author
    Graduate
    September 6, 2022

    OK, found it and fixed it. It was junk in CDC_Receive_FS().

    Various junk, but most notably it was enabling reception of next packet before reading out the previous one. Most of the time this would work (and did) unless the RTOS got in just at the wrong point and delayed things just long enough.

    Final code:

    0693W00000SubtvQAB.png 

    One ISR loads data into a 64 byte linear buffer and another ISR calls (eventually) CDC_Receive_FS() where in the above example this buffer gets copied into a 1024 byte circular buffer. This is the flow:

    0693W00000SudBHQAZ.png 

    "Even a single fact that the ST's USB stack does everything in interrupt context is enough for me to never use it."

    On the one hand you have a go at me for polling ETH RX and on the other you say the above.

    In the end it turned out that the USB CDC "OUT" ISR chain was delivering the right data all along.

    My criticism of "HAL" is that it has poor granularity. All the code is basically bloated. But it shows up especially in the ISRs, which people rarely look at because by their nature they are convoluted, and interrupt generation (IP bits) behaviour is often nontrivial. ISR-driven software tends to be convoluted anyway. It is only when one starts to step through one of the ex-MX generated ISRs that one sees just how bad it is. It is stepping through a large number of possible int sources (most of which are not enabled anyway). They get away with this at 168MHz and 1 clock per cycle... You would never write ISRs that way yourself; it would do only what is needed and then gets out of there.

    Also a lot of the code comments are simply wrong - probably due to the coder not speaking English.

    Super User
    September 6, 2022

    CDC_Receive_FS() is apparently your code. If it originated from CubeMX-generated, it would be interesting to see what exactly did CubeMX generate.

    JW

    PHolt.1Author
    Graduate
    September 6, 2022

    0693W00000SudfRQAR.png 

    I think USBD_CDC_ReceivePacket(&hUsbDeviceFS) is supposed to go after the copy loop.

    This is the init code

    0693W00000SudgKQAR.pngand that buffer was 2k, not 64 bytes, so not sure what the intention was there.

    I just looked up some code from a year ago, which may be the original.

    Super User
    September 6, 2022

    Isn't code between /* USER CODE BEGIN x*/ and /* USER CODE END x*/ user code?

    OK we probably won't know unless somebody tries to generate this in CubeMX (and that in the edition which was used to generate this).

    Not that interested, I see little point in wasting much time with Cube/CubeMX generally.

    JW