STM32F7xx/STM32F4xx: SPI3 RXNE is erroneously set after enabling the peripheral
We encountered an issue with the SPI3 peripheral on F4xx and F7xx series chips: after re-configuring the peripheral, the RXNE bit of the status register is set as soon as the peripheral is enabled again.
This should not be possible, since no transaction has been performed at that point, and reading DR confirms it: the data read is random. We have reason to believe this is a hardware bug. No other SPI peripheral exhibits this behavior; it only happens on SPI3.
We always use the SPI peripheral in 8-bit frame mode, with software slave management and with various clock divider settings (which do not affect the behavior).
We have always assumed that RXNE should be `0` at the beginning of a transaction, all the more so because we empty the RX queue at the end of each transaction. To enforce this assumption, we placed an `assert((SPIx->SR & SPI_SR_RXNE) == 0)` at the beginning of the transaction code.
We started to notice random reboots of our boards when running code compiled in debug mode. We later realized that the above assertion was being triggered by our sensors' code. That's when we decided to take the issue seriously and look into it.
We have found a reliable way to trigger the erroneous behavior on F7xx chips:
- Configure the SPI3 peripheral to CPOL = 0, CPHA = 0 (mode 0)
- Perform a transaction (write followed by read). E.g. retrieve a sample from a sensor
- Re-configure the SPI3 peripheral to CPOL = 1, CPHA = 1 (mode 3)
- After enabling the SPI3 peripheral, the SR register holds the value `0x203` (RXNE and TXE flags set, FRLVL set to 1/4)
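The steps above can be sketched roughly as follows. This is only an illustration, not our actual driver and not the attached entrypoint: `spi_regs_t` and `spi_set_mode` are hypothetical names, the struct is a stand-in for the CMSIS `SPI_TypeDef` so the snippet compiles on its own, and the frame size setup, the transaction itself and the wait for the peripheral to become idle before disabling it are all left out.

```c
#include <assert.h>
#include <stdint.h>

/* SPI_CR1 and SPI_SR bit positions used in this sketch. */
#define CR1_CPHA (1u << 0)
#define CR1_CPOL (1u << 1)
#define CR1_MSTR (1u << 2)
#define CR1_SPE  (1u << 6)
#define CR1_SSI  (1u << 8)
#define CR1_SSM  (1u << 9)
#define SR_RXNE  (1u << 0)

/* Hypothetical stand-in for the CMSIS SPI_TypeDef: only the two
 * registers touched below. On target, use SPI3 from the device header. */
typedef struct {
    volatile uint32_t CR1;
    volatile uint32_t SR;
} spi_regs_t;

/* Disable the peripheral, select SPI mode 0..3 (bit 1 = CPOL,
 * bit 0 = CPHA), enable it again and return SR as read right after
 * enabling. Master mode with software slave management is assumed.
 * A real driver must wait for the bus to be idle before clearing SPE. */
static uint32_t spi_set_mode(spi_regs_t *spi, unsigned mode)
{
    spi->CR1 &= ~CR1_SPE;

    uint32_t cr1 = spi->CR1 & ~(CR1_CPOL | CR1_CPHA);
    cr1 |= CR1_MSTR | CR1_SSM | CR1_SSI;
    if (mode & 1u) cr1 |= CR1_CPHA;
    if (mode & 2u) cr1 |= CR1_CPOL;
    spi->CR1 = cr1;

    spi->CR1 |= CR1_SPE;
    return spi->SR;
}
```

On the board, the sequence would be `spi_set_mode(spi3, 0)`, one transaction, then `uint32_t sr = spi_set_mode(spi3, 3);`, where `spi3` is a hypothetical pointer to the SPI3 register block. Checking `sr & SR_RXNE` at that point is where we see the flag unexpectedly set.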
We've had the assertion trigger on F4xx chips as well, although rarely. The above steps do not seem to reproduce the issue on F4xx chips.
We have ensured that the steps we perform when reconfiguring the SPI peripheral follow the directions from the user manual and the programming manual.
We have been able to reproduce the issue with the official HAL libraries, and the behavior is exactly the same. This effectively rules out a bad implementation of the SPI driver on our side.
Stepping through the HAL code with a debugger shows that the SR register holds the value `0x203` after enabling the SPI peripheral, just like in our driver. Because of how the HAL functions are implemented, the RXNE flag is never checked when transmitting and the RX queue is flushed afterwards, so the error is never checked for or caught.
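For anyone reading the `0x203` value in a debugger, here is a small helper that splits an SR value into its fields. It is only a sketch: `spi_sr_fields_t` and `spi_decode_sr` are made-up names, and the bit positions are the F7xx ones as I read them from the reference manual (the F4xx SR has no FRLVL/FTLVL fields).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the SPI_SR fields relevant to this report. */
typedef struct {
    unsigned rxne;   /* bit 0: receive buffer not empty */
    unsigned txe;    /* bit 1: transmit buffer empty */
    unsigned bsy;    /* bit 7: peripheral busy */
    unsigned frlvl;  /* bits 10:9: RX FIFO level, 1 means 1/4 full */
    unsigned ftlvl;  /* bits 12:11: TX FIFO level */
} spi_sr_fields_t;

/* Split a raw SR value into the fields above (F7xx layout assumed). */
static spi_sr_fields_t spi_decode_sr(uint32_t sr)
{
    spi_sr_fields_t f;
    f.rxne  = (sr >> 0) & 1u;
    f.txe   = (sr >> 1) & 1u;
    f.bsy   = (sr >> 7) & 1u;
    f.frlvl = (sr >> 9) & 3u;
    f.ftlvl = (sr >> 11) & 3u;
    return f;
}
```

Calling `spi_decode_sr(0x203)` gives `rxne = 1`, `txe = 1`, `bsy = 0`, `frlvl = 1` and `ftlvl = 0`, i.e. the peripheral claims to hold one received byte even though nothing has been clocked yet.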
We have opened an internal issue, where you can find our code and our proposed workaround to this issue: https://git.skywarder.eu/avn/swd/skyward-boardcore/-/merge_requests/235
I have attached a minimal entrypoint that reproduces the issue. It was developed for an STM32F767ZI Nucleo development board. Note that no sensor is required to trigger the bug; the board alone is enough.
