Explorer
April 4, 2025
Solved

STM32U5 OSPI BusFault in memory mapped mode on unaligned writes


Good day,

I'm experiencing an unexplained bus fault when the MCU writes to unaligned (not aligned to 32-bit) addresses in the OSPI memory region. I've written a minimal implementation that shows where the issue occurs. See below.

I'm using the STM32U585 with two IS66WVS4M8BLL quad SPI modules. The OctoSPI peripheral is configured in Dual-Quad Mode. See the image for the settings as configured in STM32CubeMX. I use memory mapped mode. I've enabled DQS for writes as per 2.6.1 of the errata. I've tried different MPU cache configurations for this area: no MPU, MPU_DEVICE_nGnRnE, and (MPU_WRITE_THROUGH | MPU_NON_TRANSIENT | MPU_W_ALLOCATE | MPU_R_ALLOCATE).

burn__0-1743753407398.png

Here's the minimal implementation. Note that 8-bit writes work if the previous write was aligned. See comments surrounded by ***, focussing on Test 2.

// Test 1: Unaligned 8-bit access
// *** This first test works without issues ***
volatile uint32_t sramByteLoopIndex;
volatile __IO uint8_t *mem_addr_byte;
volatile uint8_t sramTestValByte;

// Writing Sequence (8-bit, unaligned pattern)
mem_addr_byte = (uint8_t *)(OCTOSPI1_BASE);
for (sramByteLoopIndex = 0; sramByteLoopIndex < OSPI_MEM_SIZE_BYTES;
     sramByteLoopIndex++)
{
    // Use lower 8 bits of index as pattern
    sramTestValByte = (uint8_t)(sramByteLoopIndex & 0xFF);
    *mem_addr_byte = sramTestValByte;
    mem_addr_byte += 1;
}

// Reading Sequence (8-bit, unaligned check)
mem_addr_byte = (uint8_t *)(OCTOSPI1_BASE);
for (sramByteLoopIndex = 0; sramByteLoopIndex < OSPI_MEM_SIZE_BYTES;
     sramByteLoopIndex++)
{
    sramTestValByte = (uint8_t)(sramByteLoopIndex & 0xFF);
    assert_param(*mem_addr_byte == sramTestValByte);
    mem_addr_byte += 1;
}

// Test 2: Unaligned Copy within SRAM
// *** Here things are not well ***
volatile __IO uint8_t *src_addr_byte;
volatile __IO uint8_t *dst_addr_byte;
volatile uint8_t test_pattern_byte;
const uint32_t copy_size_bytes = 512; // Size of data to copy
// Ensure source and destination are unaligned and distinct
const uint32_t src_offset = 1; // Unaligned source start
const uint32_t dst_offset =
    (OSPI_MEM_SIZE_BYTES / 2) + 3; // Unaligned destination start

// Ensure offsets are valid and don't cause wrap-around issues with copy size
if ((src_offset + copy_size_bytes < OSPI_MEM_SIZE_BYTES) &&
    (dst_offset + copy_size_bytes < OSPI_MEM_SIZE_BYTES))
{
    // Dummy aligned write first
    // *** Removing this line causes BusFault on the write in code block 1 below ***
    *((volatile __IO uint8_t *)OCTOSPI1_BASE) = 0;

    // 1. Fill source region with a pattern
    src_addr_byte = (uint8_t *)(OCTOSPI1_BASE + src_offset);
    for (sramByteLoopIndex = 0; sramByteLoopIndex < copy_size_bytes;
         sramByteLoopIndex++)
    {
        test_pattern_byte =
            (uint8_t)((sramByteLoopIndex + 0xAA) & 0xFF); // Arbitrary pattern
        *src_addr_byte = test_pattern_byte; // *** BusFault here on first iteration if the dummy write is commented out, otherwise no issues here ***
        src_addr_byte++;
    }

    // 2. Perform byte-by-byte unaligned copy (Isolated Read/Write)
    src_addr_byte = (uint8_t *)(OCTOSPI1_BASE + src_offset);
    dst_addr_byte = (uint8_t *)(OCTOSPI1_BASE + dst_offset);
    volatile uint8_t temp_byte; // Temporary variable
    for (sramByteLoopIndex = 0; sramByteLoopIndex < copy_size_bytes;
         sramByteLoopIndex++)
    {
        temp_byte = *src_addr_byte;
        __asm__ __volatile__ ("nop"); // Tried adding NOP as per 2.6.10 of errata, even though we are not using DTR
        *dst_addr_byte = temp_byte; // *** BusFault here on first iteration if the dummy aligned write is present ***
        src_addr_byte++;
        dst_addr_byte++;
    }

    // 3. Verify destination region
    // *** We never get here ***
    dst_addr_byte = (uint8_t *)(OCTOSPI1_BASE + dst_offset);
    for (sramByteLoopIndex = 0; sramByteLoopIndex < copy_size_bytes;
         sramByteLoopIndex++)
    {
        test_pattern_byte =
            (uint8_t)((sramByteLoopIndex + 0xAA) & 0xFF); // Expected pattern
        assert_param(*dst_addr_byte == test_pattern_byte);
        dst_addr_byte++;
    }
}

I can post my memory mapped initialisation code, if needed.

Any suggestions of what I might be missing, or how to move forward?
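One possible way to narrow down a BusFault like this is to look at the fault status the core records. The sketch below decodes the BusFault bits of the Cortex-M Configurable Fault Status Register (CFSR) together with the BusFault Address Register (BFAR). The type `busfault_info_t` and the function `decode_busfault` are hypothetical names chosen for illustration and are not part of the original post or of any ST library. The bit positions are the architectural ones for Armv7-M and Armv8-M Mainline cores.

```c
#include <stdbool.h>
#include <stdint.h>

/* BusFault-related bits inside the Configurable Fault Status Register. */
#define CFSR_PRECISERR    (1u << 9)   /* precise data bus error            */
#define CFSR_IMPRECISERR  (1u << 10)  /* imprecise data bus error          */
#define CFSR_BFARVALID    (1u << 15)  /* BFAR holds the faulting address   */

/* Hypothetical container for the decoded fault information. */
typedef struct {
    bool     precise;        /* fault is tied to a specific instruction     */
    bool     imprecise;      /* fault was reported after the fact           */
    bool     address_valid;  /* true when 'address' can be trusted          */
    uint32_t address;        /* faulting address, or 0 when not valid       */
} busfault_info_t;

/* Decode raw CFSR and BFAR values into a readable structure. */
static busfault_info_t decode_busfault(uint32_t cfsr, uint32_t bfar)
{
    busfault_info_t info;
    info.precise       = (cfsr & CFSR_PRECISERR) != 0u;
    info.imprecise     = (cfsr & CFSR_IMPRECISERR) != 0u;
    info.address_valid = (cfsr & CFSR_BFARVALID) != 0u;
    info.address       = info.address_valid ? bfar : 0u;
    return info;
}
```

On target, this could be called from the fault handler as `decode_busfault(SCB->CFSR, SCB->BFAR)` using the CMSIS register definitions. If the fault is reported as imprecise, the stacked program counter may not point at the store that actually caused it, so the reported faulting line should be treated with some caution.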

@Alex - APMemory have you ever seen something like this?

    This topic has been closed for replies.
    Best answer by Alex - APMemory


    4 replies

    Explorer
    April 4, 2025

    Hi,

    I'm not sure how to help.

    I have no idea whether the ISSI QSPI parts work.

    I'm also not sure I understand your configuration correctly. You could use our QSPI (APS1604M..., APS6404L..., SOP8/USON8) or our OPI (APS6408L, ...BGA24) parts, and all the setup should be available in Cube.

    If you use 2 QSPI devices for some reason, I guess you need a QSPI setup with 2 CE lines to activate one memory or the other.

    Regards

    Alex

    PS: Looking at Mouser, it seems the APMemory 64Mb QSPI is half the price of the ISSI 32Mb part, at twice the density.

    https://eu.mouser.com/c/?q=IS66WVS4M8BLL

    https://eu.mouser.com/c/?q=aps6404L-3SQ

     

    burn_ (Author)
    Explorer
    April 7, 2025

    Hi @Alex - APMemory , thanks for your thoughts. 
    Wish I had known about the listed AP parts at hardware design time - will definitely consider them if we do a hardware revision.

    As for the 2 QSPI modules: Dual-quad SPI splits up a byte and writes 4 bits to each RAM module.

    Explorer
    April 7, 2025

    You are right about dual quad, this should work.

    Let me share an overview of the supported devices/SoCs:

    https://wiki.st.com/stm32mcu/index.php?title=Introduction_to_external_serial_memories_XSPI_interoperability_for_STM32&oldid=63515

    AlexAPMemory_0-1744011169802.png

    APMemory IoT RAM Solution

    STM32 MCU family | HPI/OPI | OPI | QSPI SDR | QSPI DDR
    STM32L4Rx | - | ✓* | - | -
    STM32L5
    STM32L4P5/Q5
    STM32U575/585
    STM32H5
    -
    STM32H7A3/B3
    STM32H72x/3x
    - ✓*
    STM32U59x/U5Ax, STM32U5Fx/U5Gx
    STM32H7Rx/Sx
    STM32N6
    All STM32 supporting NOR QSPI | - | - | ✓* | -

    APMemory device, by interface:
    HPI/OPI: 256Mb~512Mb, 1.8V, BGA24/WLCSP; APS256XXN-OBR/OB9-..., APS512XXN-OBR/OB9-…; 20 pins, up to 1GB/s
    OPI: 64Mb~512Mb, 1.8V~3V, BGA24/WLCSP; APS6408L-xOBM-..., APS12808L-xOBM-BA, APS12808O-OBR-WB, APS25608N-OBR-BD, APS51208N-OBR-BD; 11 pins, up to 400MB/s
    QSPI SDR: 16Mb~128Mb, 1.8V~3V, SOP8/USON8/WLCSP; APS1604M-xSQR-…, APS6404L-xSQR-..., APS12808O-SQRH-WA; 6 pins, up to 72MB/s
    QSPI DDR: 128Mb, 1.8V, WLCSP; APS12808O-DQ-WA; 7 pins, up to 166MB/s

    burn_ (Author)
    Explorer
    April 7, 2025

    Thanks for this, @Alex - APMemory . Helpful for when revision time rolls around.

    burn_ (Author)
    Explorer
    April 7, 2025

    As for the original issue I'm facing, perhaps someone from ST can weigh in? Based on other threads, perhaps @KDJEM.1 or  @mƎALLEm has seen something like this before?

    @BDoon.1, did you perhaps come across something like this while working with the OSPI peripheral? (based on this thread)

    Explorer
    April 7, 2025

    Not sure if this helps.

    It seems something is being done incorrectly on the STM32 MCU side. Below are comments from our team:

    1. Since Test 1 passes, basic read/write in the dual-quad SPI configuration is working. Effectively, it's OPI with DMM = 1 (dual-memory mode).

    2. According to the data alignment constraints, it becomes halfword-addressable memory, not byte-addressable.

    3. Test 2 doesn't seem to honor these constraints, as it touches an odd address in the very first PSRAM read. I suspect the MCU cache shadows the unaligned accesses in Test 1, where the cache is always filled by the previous aligned access. In other words, the cache absorbs the unaligned accesses, so the PSRAM sees no accesses at odd addresses. Once this assumption is broken (because the cache is disabled, or an odd address is accessed while the cache is cold), unaligned accesses reach the PSRAM.

     

    Relevant constraints:

    AlexAPMemory_0-1744038729529.jpeg

    (default write-through) MCU cache is preheated by write:

    AlexAPMemory_1-1744038748465.jpeg
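If the region really is halfword-addressable as described in point 2, one way to respect that from software is to route single-byte accesses through aligned 16-bit reads and writes. The following is a minimal sketch under stated assumptions, not a confirmed fix for this hardware: `psram_store_u8` and `psram_load_u8` are hypothetical helper names, `base` is assumed to point at a halfword-aligned start of the memory-mapped region, and byte order is assumed to be little-endian, as on Cortex-M33.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store one byte at byte_offset using only aligned 16-bit accesses.
 * The containing halfword is read, one of its bytes is replaced,
 * and the whole halfword is written back (read-modify-write). */
static void psram_store_u8(volatile uint16_t *base, size_t byte_offset,
                           uint8_t value)
{
    assert(((uintptr_t)base & 1u) == 0u);   /* base must be halfword-aligned */

    volatile uint16_t *slot = &base[byte_offset / 2u];
    uint16_t halfword = *slot;              /* aligned 16-bit read */

    if ((byte_offset & 1u) != 0u) {
        /* Odd byte address: replace the high byte (little-endian). */
        halfword = (uint16_t)((halfword & 0x00FFu) | ((uint16_t)value << 8));
    } else {
        /* Even byte address: replace the low byte. */
        halfword = (uint16_t)((halfword & 0xFF00u) | value);
    }

    *slot = halfword;                       /* aligned 16-bit write */
}

/* Load one byte at byte_offset using a single aligned 16-bit read. */
static uint8_t psram_load_u8(const volatile uint16_t *base, size_t byte_offset)
{
    uint16_t halfword = base[byte_offset / 2u];
    return (uint8_t)(((byte_offset & 1u) != 0u) ? (halfword >> 8)
                                                : (halfword & 0xFFu));
}
```

On target, `base` would be something like `(volatile uint16_t *)OCTOSPI1_BASE`. The read-modify-write is not atomic, so an interrupt or DMA transfer touching the neighbouring byte between the read and the write could be overwritten. Whether the bus actually sees only 16-bit transfers also depends on the cache and MPU attributes in use.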

     

     

    burn_ (Author)
    Explorer
    April 9, 2025

    Hi @Alex - APMemory ,

    Thanks for reaching out to your team. Your description of the problem makes perfect sense. The implication is that I'm also likely losing data when an odd number of bytes is written.

    Another helpful clarification is that it is halfword-addressable memory (I assumed word-addressable would be the solution).

    I'll construct an example with caching disabled and see if test 1 then also fails.
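On the concern about odd-length writes, a buffer copy can be arranged so that only the first and last bytes need special handling. The sketch below copies from an ordinary RAM buffer into a halfword-only region: an odd starting offset and an odd tail are merged into their halfwords by read-modify-write, and everything in between is written as whole aligned halfwords. `psram_write_bytes` is a hypothetical name, little-endian byte order is assumed, and it has not been verified on the STM32U585 setup from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy len bytes from src into a halfword-only region, starting at byte
 * offset dst_off, using only aligned 16-bit accesses to the region. */
static void psram_write_bytes(volatile uint16_t *base, size_t dst_off,
                              const uint8_t *src, size_t len)
{
    assert(((uintptr_t)base & 1u) == 0u);   /* base must be halfword-aligned */

    size_t i = 0;

    /* Odd start: merge the first byte into the high half of its halfword. */
    if ((len > 0u) && ((dst_off & 1u) != 0u)) {
        volatile uint16_t *slot = &base[dst_off / 2u];
        *slot = (uint16_t)((*slot & 0x00FFu) | ((uint16_t)src[0] << 8));
        i = 1u;
    }

    /* Aligned middle: dst_off + i is even here, so write whole halfwords. */
    for (; (i + 1u) < len; i += 2u) {
        base[(dst_off + i) / 2u] =
            (uint16_t)(src[i] | ((uint16_t)src[i + 1u] << 8));
    }

    /* Odd tail: merge the last byte into the low half of its halfword. */
    if (i < len) {
        volatile uint16_t *slot = &base[(dst_off + i) / 2u];
        *slot = (uint16_t)((*slot & 0xFF00u) | src[i]);
    }
}
```

With this layout the neighbouring byte at each edge is preserved rather than dropped or overwritten, which is the case the odd-length concern is about. The edge merges are not atomic with respect to interrupts or DMA.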

    Explorer II
    April 8, 2025

    @burn_ I did experience something like this.  If you look in the STM32U585 Errata, in the OCTOSPI section, there are a number of items related to 4-byte boundaries.  2.6.7 "Read data corruption after a few bytes are skipped when crossing a four-byte boundary" in particular seems bad, but 2.6.5 and 2.6.10 are also concerning for an application that just wants to treat the memory like it's directly addressable.

    Almost all of the workarounds listed for these issues involve enabling the DCache.  Which is what I did, and it did resolve the issues I was seeing.

    burn_ (Author)
    Explorer
    April 9, 2025

    Hi @BDoon.1 ,

    Thanks for your input.

    I was convinced that the issues listed in the errata were the cause of my problem as well. However, in my case the problem persisted despite extensive experimentation with the cache settings.