Delay between SPI transfers on STM32U5
I have to interface an STM32U575 microcontroller with a very specialized SPI device for which the market offers no alternatives. According to the spec sheet, the chip select must be pulsed between every 2 bytes, for at least 154 ns. I have to read 34 bytes from this chip every 50 us. The chip allows a maximum SPI clock rate of 24 MHz.
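To put the chip-select requirement in the same units as the SPI clock, the 154 ns minimum can be converted into whole SPI clock periods. The following is only a small sketch of that arithmetic; the helper name `min_idle_cycles` is made up for illustration and does not come from any ST library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: smallest whole number of SPI clock periods
 * that lasts at least `min_ns` nanoseconds at a clock of `spi_hz`. */
static uint32_t min_idle_cycles(uint32_t min_ns, uint32_t spi_hz)
{
    assert(spi_hz > 0);
    /* cycles = ceil(min_ns * spi_hz / 1e9), computed in 64 bits to avoid overflow */
    uint64_t num = (uint64_t)min_ns * spi_hz;
    return (uint32_t)((num + 999999999ULL) / 1000000000ULL);
}
```

At both 20 MHz and 24 MHz the 154 ns pulse rounds up to 4 clock periods (200 ns and roughly 167 ns respectively).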
Although I can talk to the device, the SPI controller does not seem to be fast enough.
SPI Configuration:
However, in my attempts to speed these transactions up, I have dropped the HAL and now read and write the SPI peripheral registers directly during steady-state program flow.
SPI functions:
void SPIM3_INIT(int len)
{
    SPI3->CFG2 &= ~SPI_CFG2_COMM; // Set Full-Duplex mode
    SPI3->CR2 = len >> 1;         // Set the number of data at current transfer
    SPI3->CR1 |= SPI_CR1_SPE;     // enable
}

int8_t SPIM3_XFER(uint8_t *txb, uint8_t *rxb, int len)
{
    volatile uint16_t *ptxdr_16bits = (volatile uint16_t *)&SPI3->TXDR;
    volatile uint16_t *prxdr_16bits = (volatile uint16_t *)&SPI3->RXDR;
    uint16_t tx_cnt = len >> 1; // divide by 2 for 16 bit xfer
    uint16_t rx_cnt = len >> 1; // divide by 2 for 16 bit xfer

    SPI3->CR1 |= SPI_CR1_CSTART; // Master transfer start

    while ((tx_cnt > 0UL) || (rx_cnt > 0UL))
    {
        if ((SPI3->SR & SPI_SR_TXP) && (tx_cnt > 0UL))
        {
            *ptxdr_16bits = *(const uint16_t *)txb;
            txb += 2;
            tx_cnt--;
        }
        if ((SPI3->SR & SPI_SR_RXP) && (rx_cnt > 0UL))
        {
            *((uint16_t *)rxb) = *prxdr_16bits;
            rxb += 2;
            rx_cnt--;
        }
    }

    while (!(SPI3->SR & SPI_SR_EOT))
        ;

    SPI3->CR1 &= ~SPI_CR1_CSTART;
    SPI3->IFCR |= SPI_IFCR_EOTC;  // clear end of transfer flag
    SPI3->IFCR |= SPI_IFCR_TXTFC; // clear transmission xfer filled flag
    return len;
}

void SPIM3_UNINIT()
{
    SPI3->IFCR |= SPI_IFCR_EOTC;  // clear end of transfer flag
    SPI3->IFCR |= SPI_IFCR_TXTFC; // clear transmission xfer filled flag
    SPI3->CR1 &= ~SPI_CR1_SPE;    // disable peripheral
    SPI3->IER = 0;                // disable interrupts
    // disable tx and rx dma requests
    SPI3->CFG1 &= ~(SPI_CFG1_TXDMAEN | SPI_CFG1_RXDMAEN);
}

main:
uint8_t tbuf[2] = {0};
uint8_t rbuf[2];

for (int i = 0; i < 2; i++)
    tbuf[i] = i;

SPIM3_INIT(2);
while (1)
    SPIM3_XFER(tbuf, rbuf, 2);
SPIM3_UNINIT();

The problem is that I am seeing a large period of inactivity between transfers (about 900 ns), as shown in this logic analyzer trace:
Under ideal conditions, I should be able to reach the required throughput with a 20 MHz SPI clock. Two bytes at 20 MHz take 800 ns to transfer; adding the 154 ns chip-select pulse gives 954 ns. Multiplying by 34 gives about 32 us, which meets my 50 us deadline. In fact, I could tolerate a delay of up to 670 ns between 2-byte transfers (50 us / 34 is about 1470 ns per transfer, minus 800 ns of clocking), which is not ideal, but 900 ns will certainly not work for me.
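The budget arithmetic above can be checked with a few lines of C. This is a minimal sketch with made-up helper names, not code from the project. One caveat: the figures in the post (about 32 us in total, 670 ns of tolerable delay) correspond to 34 two-byte transfers per 50 us window. If "34 bytes" is meant literally, that is only 17 transfers and the budget is considerably looser, so the transfer count is left as a parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Time to clock `bits` bits at `spi_hz`, in nanoseconds (rounded up). */
static uint32_t clock_time_ns(uint32_t bits, uint32_t spi_hz)
{
    assert(spi_hz > 0);
    return (uint32_t)(((uint64_t)bits * 1000000000ULL + spi_hz - 1) / spi_hz);
}

/* Total time for `transfers` 16-bit transfers, each followed by `gap_ns` of idle time. */
static uint32_t total_time_ns(uint32_t transfers, uint32_t spi_hz, uint32_t gap_ns)
{
    return transfers * (clock_time_ns(16, spi_hz) + gap_ns);
}

/* Largest per-transfer gap that still fits `transfers` transfers into `deadline_ns`. */
static uint32_t max_gap_ns(uint32_t transfers, uint32_t spi_hz, uint32_t deadline_ns)
{
    assert(transfers > 0);
    uint32_t per_transfer = deadline_ns / transfers;  /* time slot per transfer */
    uint32_t clocking = clock_time_ns(16, spi_hz);    /* time spent shifting bits */
    return per_transfer > clocking ? per_transfer - clocking : 0;
}
```

With 34 transfers at 20 MHz, `total_time_ns(34, 20000000, 154)` gives 32436 ns and `max_gap_ns(34, 20000000, 50000)` gives 670 ns, matching the figures in the post, while the observed 900 ns gap gives 57800 ns and misses the deadline. With 17 transfers the same 900 ns gap gives 28900 ns.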
Where is this delay coming from? Is it a function of the SPI peripheral architecture? Can my code be optimized further in some way? Can I double buffer the data? Where in the spec sheet can I find reference to these limitations? I am willing to make memory and power trade-offs to increase the bandwidth. Is there some workaround I can use to achieve higher SPI throughput given the limitations of the device I am interfacing with?
