Visitor II
February 9, 2024
Solved

STM32U5xx: OCTALSPI (as QSPI) fails for faster speed, with 1V8 it fails more drastically


This is more of a bug report than a question.

Background:

I want to use OCTOSPI as Quad-SPI (QSPI), especially with VDD set to 1V8, and get at least 30 MHz SCLK working on QSPI.

But I cannot go faster than 12.195 MHz, which corresponds to an OCTOSPI clock divider of 13.
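For reference, the relation between divider and SCLK can be sketched as follows. This is a minimal sketch, assuming a 160 MHz OCTOSPI kernel clock (the SYSCLK value given later in the thread) and that "clock divider" means the plain division ratio; the function names are made up for illustration. The frequencies quoted in the post look like scope readings and differ slightly from these nominal values.

```c
#include <assert.h>
#include <stdint.h>

/* Nominal SCLK in Hz for a given kernel clock and division ratio.
 * Assumption: SCLK = kernel clock / divider, with divider >= 1. */
static uint32_t ospi_sclk_hz(uint32_t kernel_hz, uint32_t divider)
{
    assert(divider >= 1u);
    return kernel_hz / divider;
}

/* Smallest divider whose SCLK does not exceed max_sclk_hz
 * (ceiling division, never smaller than 1). */
static uint32_t ospi_min_divider(uint32_t kernel_hz, uint32_t max_sclk_hz)
{
    assert(max_sclk_hz >= 1u);
    uint32_t d = (kernel_hz + max_sclk_hz - 1u) / max_sclk_hz;
    return (d == 0u) ? 1u : d;
}
```

With a 160 MHz kernel clock, a divider of 13 gives roughly 12.3 MHz and a divider of 6 roughly 26.7 MHz. Exactly 30 MHz is not reachable with an integer divider from 160 MHz: 6 stays below it, 5 gives 32 MHz.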

Symptoms:

VDD = 3V3:

  • down to clock divider = 13: all looks fine
  • lower clock divider (for faster speed): SCLK looks different but might be OK (still reasonable, even with gaps)
  • the minimum clock divider with VDD = 3V3 is 6 (resulting in 27.778 MHz; I want to see 30 MHz working)
  • anything faster (smaller clock divider) FAILS: there is no SCLK anymore

VDD = 1V8:

  • all fine down to clock divider = 13, the same as with 3V3
  • clock divider = 12: FAILS completely! No SCLK anymore!
  • with VDD = 1V8 I cannot even reach the same speed as with 3V3

Details on FW:

I use the standard HAL functions. My QSPI transfer ends up calling "HAL_OSPI_Transmit()", which runs in polling mode: the data is written to the FIFO register in a loop that waits on the FIFO threshold flag before each write (no INT, no DMA):

   status = OSPI_WaitFlagStateUntilTimeout(hospi, HAL_OSPI_FLAG_FT, SET, tickstart, Timeout);

The OCTALSPI clock source is:

PeriphClkInit.PeriphClockSelection = RCC_PERIPHCLK_OSPI;

PeriphClkInit.OspiClockSelection = RCC_OSPICLKSOURCE_SYSCLK;

Waveforms:

STM32U5A5_QSPI_issues3.png

STM32U5A5_QSPI_issues4.png

STM32U5A5_QSPI_issues1.png

STM32U5A5_QSPI_issues2.png

STM32U5A5_QSPI_issues5.png

VDD = 1V8:
It fails immediately with CLKDIV = 12, where it was still working (somewhat) with 3V3!

What is wrong?

The goal is:

  • QSPI with VDD = 1V8 and 30 MHz
  • the SCLK should look constant (no gaps, continuous)
  • if the gaps on SCLK are caused by the "SW polling mode", that is OK, but then: how do I change to the DMA-based functions?


    Technical Moderator
    February 9, 2024

    Hello @tjaekel 

    Could you provide more details about your hardware setup, including the full path of the selected clock source (the source of SYSCLK)? Also, for 1V8, did you enable High-Speed Low-Voltage (HSLV) mode?

     

    Technical Moderator
    February 9, 2024

    Hello @tjaekel ,

    Could you please give more details about the issue:

    - Which STM32U5 device are you using?

    - Are you using an ST board or a custom board?

    Note that the achievable OCTOSPI frequency depends on the load capacitance (CL). Please refer to the OCTOSPI characteristics table in the device datasheet and check all constraints.

    KDJEM1_0-1707473455055.png

    Taking a look at an OCTOSPI example may also help you.

    Thank you.

    Kaouthar

    tjaekel (Author)
    Visitor II
    February 9, 2024

    Thank you.

    It is a NUCLEO-U5A5ZJ-Q board.
    Only a scope is connected, no chip (no load).

    I will check the manual today to see what this HSLV is.

    I do not change the voltage range (no dynamic voltage scaling used).
    QSPI is in SDR mode (not DDR/DTR, no Hyperbus).

    tjaekel (Author)
    Visitor II
    February 10, 2024

    It still FAILS, the same way as before (I have studied the datasheet and RM and tried several things).

    BTW: it fails a bit earlier on an STM32U5A5 MCU than on an STM32U575 MCU:

    • on the STM32U575, with VDD = 3V3, I can lower the OCTOSPI divider one step further: the U575 is a bit better than the U5A5
    • but at VDD = 1V8 both fail in the same way (the same OCTOSPI divider fails)

    The STM32U575 is slightly better in terms of OCTOSPI speed.

    I tried:

    1. use SYSCLK or PLLQ, both set to 160 MHz - no difference (see Remark-1 below)
    2. use HSLV - no difference
    3. use LL_SYSCFG_EnableVddCompensationCell(); - no difference

    Remark-1

    I see in the RM that the U5A5 and U575 also have PLL2 and PLL3, and STM32CubeMX lets me configure PLL2 and PLL3.

    The RM says PLLxQ (for QSPI only) can also be 200 MHz. But it is not possible to set 200 MHz on the NUCLEO board with the 16 MHz OSC. I would need to use PLL2Q (with 200 MHz).

    But the HAL drivers for the U5A5 and U575 (U5xx) support only ONE PLL (nothing for PLL2 or PLL3). Is it another issue that the HAL drivers for U5xx do not support PLL2 and PLL3?

    I can confirm:

    • SYS clock is 160 MHz (or PLLQ) and is used as the clock source for QSPI
    • I run voltage range 1 (I have checked by reading it back via HAL_PWREx_GetVoltageRange() >> 16)
    • I am not using DVS (dynamic voltage scaling)
    • "overclocking" the MCU, e.g. to 192 MHz: the MCU still works (USB as well), but now the OCTOSPI already fails on the "just working" DIV setting (13 now fails, which was working with 160 MHz; the issue seems to be in OCTOSPI)
    • running at VDD = 1V8 makes it worse (OCTOSPI already fails at lower speeds, I cannot use a smaller DIV);
      at VDD = 1V8 it fails immediately at faster speeds
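The ">> 16" in the voltage range read-back above can be decoded as in the following sketch. It assumes that the two VOS bits sit at bits 17:16 of the returned value and that the value 3 selects range 1 (highest performance) down to 0 for range 4. That encoding and the helper name are assumptions and should be checked against the reference manual.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed position of the VOS field in the read-back value (bits 17:16). */
#define VOS_SHIFT 16u
#define VOS_MASK  0x3u

/* Map the raw read-back to a voltage range number 1..4.
 * Assumption: VOS = 3 -> range 1, 2 -> range 2, 1 -> range 3, 0 -> range 4. */
static unsigned vos_to_range(uint32_t raw)
{
    uint32_t vos = (raw >> VOS_SHIFT) & VOS_MASK;
    unsigned range = 4u - (unsigned)vos;
    assert(range >= 1u && range <= 4u);
    return range;
}
```

Under these assumptions, a read-back that yields 3 after the shift corresponds to voltage range 1.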

    HSLV config

    As mentioned in the ST documentation, HSLV is very risky! It can damage the chip, e.g. if HSLV is enabled but VDD = 3V3.
    The NUCLEO boards have a VDD jumper, so the FW has to "follow" whatever VDD is configured. I do this via the ADC by measuring Vrefint, and only if VDD is below 1.9 V do I enable HSLV for the QSPI pins. But there is no difference in speed.
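The VDD check described above can be sketched like this. It is a minimal sketch, not the code from the project: it assumes the ADC reference follows VDD on the board, that the factory VREFINT calibration value and the current reading use the same ADC resolution, and that the calibration voltage is passed in by the caller. The function names are hypothetical; the 1.9 V threshold is the one named in the post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Supply voltage in mV derived from a VREFINT conversion.
 * cal_raw:     factory VREFINT reading taken at cal_vref_mv
 * adc_raw:     current VREFINT reading at the same ADC resolution
 * cal_vref_mv: supply voltage used during factory calibration */
static uint32_t supply_mv_from_vrefint(uint32_t cal_raw, uint32_t adc_raw,
                                       uint32_t cal_vref_mv)
{
    assert(adc_raw != 0u);
    return (uint32_t)(((uint64_t)cal_vref_mv * cal_raw) / adc_raw);
}

/* HSLV is only treated as safe below the 1.9 V threshold used in the post. */
static bool hslv_allowed(uint32_t supply_mv)
{
    return supply_mv < 1900u;
}
```

The point of the guard is that HSLV is never enabled when the jumper selects 3V3, whatever the firmware was built for.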

    Project and details

    Find the details in my project, on GitHub:

    https://github.com/tjaekel/NUCLEO-U5A5JZ-Q_QSPI 

    Other Remark

    The fact that the data words are now spread out (and I see gaps in SCLK) might be obvious: I transfer data via OSPI_WriteReadTransaction(), which sends everything in indirect mode. All of it is based on SW polling (checking the FIFO status), and the FW can be slow and cause these gaps (which is fine).

    But:
    I have checked whether I can use OCTOSPI (as QSPI) with DMA (in indirect mode). It looks to me as if it is not possible (at least it is not mentioned/documented whether OCTOSPI can generate a DMA event).

    Conclusion

    I am frustrated. I changed to the STM32U5xx because of its QSPI support (even though it lacks some features, like "regular SPI"). Now I realize that it does not work as specified (in the datasheet), e.g. reaching 93 MHz SCLK in voltage range 1, even at VDD = 1V8. I get just 27 MHz maximum (and only with 3V3, not 1V8).

    It does not work the way I need (1V8 and 30 MHz).

    And the errata document is already pretty long for OCTOSPI (10 entries already). Maybe you have to add a new one and correct the datasheet and RM for a speed limitation on OCTOSPI.   ;) LOL

    More joking

    I guess you have a timing constraint violation in your MCU RTL. Send me your RTL and I could debug it.
    Or does STM solder "slow corner" (yield) parts on NUCLEO boards?

     

    Graduate II
    February 12, 2024

    Following this with interest but @tjaekel definitely wins the award for "Most Creative Font and Color Use".  8)

    tjaekel (Author, Answer)
    Visitor II
    February 13, 2024

    OK, I am closing this ticket.

    Observations:

    1. the "byte bursts" depend on Debug code vs. Release code:
      if I set optimization to -g0 and -O3, they happen later.
      But they do happen: the longer the QSPI transaction, or the faster the speed, the more the following data words come out as "byte bursts" (not as word bursts, no matter what the FIFO setting is).
    2. The GPIO speed setting has FOR SURE a dramatic impact on whether it works or not (setting the speed to "fastest" makes it work at faster QSPI speeds; a slower GPIO speed makes the QSPI fail).

    I am closing this ticket, even though I have no clue how to avoid the "byte bursts" or why the GPIO speed setting matters so much for seeing an SCK signal on the scope.

    Technical Moderator
    February 13, 2024

    Hello @tjaekel ,

    Thank you for these interesting details and explanations.

    It is mentioned in the datasheet that the Octo-SPI pins support 'very high' and 'high' speed. I think AN5050, specifically section 6.2.3 "OCTOSPI GPIOs and clocks configuration", can help you configure the OCTOSPI GPIO pins.

    For the byte burst issue, could you please try disabling the optimization? Note that it is recommended to use compiler optimization level -O0 when building a project that must be debugged. Debugging with optimization level -Og may work, but higher optimization levels are hard to debug because of compiler code optimization. For more details, please refer to UM2609 "STM32CubeIDE user guide", section 3 "Debug".

    For STM32U5, any of four different clock sources (SYSCLK, MSIK, pll1_q_ck, pll2_q_ck) can be used as the OCTOSPI clock source. So PLL3 can't be used as the OCTOSPI clock source.

    Thank you for your contribution in STCommunity :).

    Kaouthar

     

    tjaekel (Author)
    Visitor II
    February 15, 2024

    Thank you.
    Yes, AN5050 clearly says on page 27:

    Note: All GPIOs have to be configured in very high-speed configuration.

    This is the answer (and confirms what I have realized).
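What "very high-speed configuration" means at register level can be sketched as follows. The sketch assumes the usual STM32 GPIO layout with one 2-bit speed field per pin in OSPEEDR, where 0b11 selects very high speed; the helper name is hypothetical, and the actual OCTOSPI pins of the board are not listed here. With the HAL, the equivalent is selecting GPIO_SPEED_FREQ_VERY_HIGH in the Speed field of GPIO_InitTypeDef for each OCTOSPI pin.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 2-bit speed encoding per pin:
 * 0 = low, 1 = medium, 2 = high, 3 = very high. */
#define SPEED_VERY_HIGH 0x3u

/* Return a new OSPEEDR-style value in which every pin selected in
 * pin_mask (bit n = pin n) is set to very high speed. */
static uint32_t ospeedr_set_very_high(uint32_t ospeedr, uint16_t pin_mask)
{
    for (unsigned pin = 0u; pin < 16u; ++pin) {
        if (pin_mask & (1u << pin)) {
            ospeedr &= ~(0x3u << (pin * 2u));          /* clear the pin's field */
            ospeedr |= SPEED_VERY_HIGH << (pin * 2u);  /* write 0b11 */
        }
    }
    return ospeedr;
}
```

Pins outside the mask keep their previous speed setting, so the helper can be applied to a port that also carries unrelated signals.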

    I am aware of the debug and optimization flags (for debug I use -g3 and -Og).
    I was trying to get rid of the byte bursts on QSPI, and the optimization setting has a small influence: the byte bursts now happen later (when set to debug level None (-g0) and -O3).

    I have just not yet succeeded in getting a "gap-less" stream of words on QSPI. After a while it turns into "byte bursts" (two clock cycles for 8 bits, but with a gap between the bytes, even though everything is written as 32-bit words). Using the FIFO (other thresholds) does not help.
    But OK: it still works (the waveform is correct). All fine for now.

    Graduate II
    February 15, 2024

    DMA.  The Core can't keep up at the higher QSPI clock rates.
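A rough cycle budget illustrates the point about the core. This is a back-of-the-envelope sketch, assuming SDR quad mode (4 data lines) and the 160 MHz core clock mentioned in the thread; the function name is made up, and whether the polling loop really is the cause of the gaps is not confirmed in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Core clock cycles available per transferred byte, scaled by 100 and
 * truncated (e.g. 1066 means about 10.66 cycles per byte).
 * lanes: number of data lines (4 for quad mode), SDR assumed. */
static uint32_t cycles_per_byte_x100(uint32_t core_hz, uint32_t sclk_hz,
                                     uint32_t lanes)
{
    assert(sclk_hz != 0u && lanes != 0u);
    /* Bytes per second on the bus = sclk_hz * lanes / 8. */
    return (uint32_t)(((uint64_t)core_hz * 8u * 100u) /
                      ((uint64_t)sclk_hz * lanes));
}
```

At 30 MHz SCLK in quad mode this leaves under 11 core cycles per byte, or about 43 per 32-bit word, for the flag check, the FIFO write and the loop overhead. The STM32U5 HAL does provide HAL_OSPI_Transmit_DMA() and HAL_OSPI_Receive_DMA() for indirect mode, which is probably what the reply refers to; the details are worth checking against the HAL version in use.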