Graduate
April 16, 2024
Question

Memory mapped mode write sizes for PSRAM APS6408L on B-U585I-IOT02A Discovery Kit


@Alex - APMemory I am trying to interface the APS6408L on the B-U585I-IOT02A Discovery Kit in OSPI DTR memory-mapped mode for both reads and writes. The datasheet for the APS6408L says that its page size is 1 KB.

 

In memory-mapped mode, is it possible to write to the RAM in a chunk larger than 1 KB? I read this in the datasheet for the memory:

"Burst Type & Length
Read and write operations are default Hybrid Wrap 32 mode. Other burst lengths of 16, 32, 64 or 1K bytes in
standard or Hybrid wrap modes are register configurable(see Table 20). The device also includes command for
Linear Bursting. Bursts can start on any even address. Write burst length has a minimum of 2 bytes. Read has no
minimum length. Both write and read have no restriction on maximum burst length as long as tCEM is met."

 

I am using memcpy to write the data in memory-mapped mode. My question is: is it possible to write blocks larger than 1 KB to this memory?

 

-Rikesh

 

 

 

    This topic has been closed for replies.

    12 replies

    Explorer
    April 16, 2024

    Hi Rikesh,

    The maximum write burst is one page (1024 bytes for the 64 Mb device, 2048 bytes for the 256 Mb device). Reads are not limited (RBX), as long as tCEM is met.

     

    AlexAPMemory_0-1713300465984.png

     

    I hope this answers your question.

    Alex

     

    Graduate
    April 16, 2024

    @Alex - APMemory , Thanks for your prompt response. I was assuming that the maximum write burst length of 1024 is only applicable to linear burst write (A0h). Is this true for sync write (command 80h) as well?

    Untitled.png

    I was using memory-mapped mode and trying to write data using etl::copy:

                    volatile uint8_t* flash_ptr = reinterpret_cast<volatile uint8_t*>(OCTOSPI1_BASE + address);
                    etl::copy(data, data + size, flash_ptr);
     
    Does the maximum write burst length limitation of 1 KB apply to all write modes,
    both indirect and memory-mapped?
    -Rikesh
    Explorer
    April 23, 2024

    To your question: 

    1. I was assuming that the maximum write burst length of 1024 is only applicable to linear burst write (A0h). Is this true for sync write (command 80h) as well?   => Yes, the operation is allowed.
    2. Does the maximum write burst length limitation of 1 KB apply to all write modes, both indirect and memory-mapped?   => Yes, that's correct.

    Alex

    Explorer
    April 17, 2024

    Could you confirm these details, or explain how your implementation differs from them, please?

    • Your PSRAM is memory-mapped and cached, and you're copying from internal SRAM to the PSRAM using the CPU (not DMA). How are you ensuring that the entire 1k remains in the cache for the duration of the copy? E.g. if your code's multi-threaded, how are you ensuring that parts of the 1k aren't prematurely flushed from cache by the accesses of other threads?
    • You'd start the write to the PSRAM by manually flushing the cache, so you're expecting the cache/AXI bus to burst the entire 1k from its start address?
    Graduate
    April 17, 2024

    @alister , Below are the details of my implementation,

     

    I am doing this for just RAM testing at present,

    * I am using the BSP driver for the APS6408L provided by ST for the B-U585I-IOT02A Discovery Kit, which uses OSPI DTR in memory-mapped mode.

    * I have been testing the RAM with the cache disabled (although I did try enabling it and did not see any difference).

    * For testing I have PSRAM read and write functions that read and write 32-bit aligned words on the PSRAM. I am only copying test data from SRAM to PSRAM (without using the DMA). My read and write functions are below:

     uint32_t ReadMemory(uint32_t address, uint8_t* data, uint32_t size)
            {
                // address and size should be 32bit aligned
                if (address % 4 != 0 || size % 4 != 0)
                {
                    return 1;
                }
     
                if ((address + size) > (max_ram_addr + 1))
                {
                    error_status = 2;
                }
                else if (error_status == MemErrorStatus::eNoError)
                {
                    volatile uint8_t* ram_ptr = reinterpret_cast<volatile uint8_t*>(OCTOSPI1_BASE + address);
                    etl::copy(ram_ptr, ram_ptr + size, data);
                }

                return 0;
            }

            uint32_t ExtRam::WriteMemory(uint32_t address, const uint8_t* data, uint32_t size)
            {
                // address and size should be 32bit aligned
                if (address % 4 != 0 || size % 4 != 0)
                {
                    return 1;
                }
     
                // Check memory size out of bound
                if ((address + size) > (max_ram_addr + 1))
                {
                    return 1;
                }
                else if (error_status == MemErrorStatus::eNoError)
                {
                    volatile uint8_t* ram_ptr = reinterpret_cast<volatile uint8_t*>(OCTOSPI1_BASE + address);
                    etl::copy(data, data + size, ram_ptr );
                }

                return 0;
            }
     
    The read and write tests appear to work when the data size is >= 1024 bytes. I am confused about one thing, though: does the write need to be page-boundary aligned? E.g. is it possible to write 1024 bytes at address OCTOSPI1_BASE + 4?
     
    * If I do not use Dcache, my assumption is that the cache flushing does not matter. 
     
    --Rikesh
    Explorer
    April 17, 2024

    No.

    You can't verify in software that you're performing 1k-byte bursts, except perhaps by measuring the CPU cycles, allowing for test overheads and applying some inference. You could, however, test using an oscilloscope or logic analyser.

    You've mentioned you'd experimented with disabling the cache, but that is puzzling: where would the 1k bytes of test data be queued except in cache? And with cache disabled, how do you expect the CPU wouldn't stall on each access while it completes an individual access to the flash? Are you expecting the AXI to start a 1k burst while your code is accessing the 1k in the internal address space, pausing the QSPI CLK while the CPU is slower and stalling the CPU while the QSPI is slower? Can the STM32U5 do that? You mention you "did not see any difference". Are you only comparing the data values? You need to measure the time.

    Don't list your software. Is your "memory mapped" the same as RM0456 rev 5 section 28 describes? Please explain the engineering of reading/writing the OSPI flash as 1k byte bursts while it's in memory-mapped mode.

    Graduate
    April 17, 2024

    @Alex - APMemory @alister One additional question, how does Row Boundary Crossing affect the memory mapped mode?

    Explorer
    April 17, 2024

    Hi Rikesh, 

    Linear Burst commands are required to support RBX reads.

    To write 1024 bytes you need to write one full page, without crossing the row boundary (there is no RBX write).

    This applies to the memory itself, whether it is accessed in memory-mapped mode or any other mode.

    Alex
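
A minimal sketch of how a caller might split a larger copy so that no single piece crosses a page boundary, following the rule in the reply above. The names `Chunk` and `split_at_page_boundaries` are hypothetical, and the 1024-byte default is the page size quoted for the 64 Mb part. Whether the OCTOSPI in memory-mapped mode actually issues each piece as one burst is not established in this thread, so treat this only as address arithmetic.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One contiguous piece of a larger transfer (hypothetical helper type).
struct Chunk {
    std::uint32_t offset;  // byte offset from the start of the PSRAM
    std::uint32_t length;  // number of bytes in this piece
};

// Split [offset, offset + size) so that no piece crosses a page boundary.
// page_size defaults to 1024 bytes (assumed from the reply above) and must be non-zero.
std::vector<Chunk> split_at_page_boundaries(std::uint32_t offset,
                                            std::uint32_t size,
                                            std::uint32_t page_size = 1024U)
{
    assert(page_size != 0U);
    std::vector<Chunk> chunks;
    while (size > 0U) {
        // Bytes left between the current offset and the end of its page.
        const std::uint32_t room = page_size - (offset % page_size);
        const std::uint32_t length = (size < room) ? size : room;
        chunks.push_back({offset, length});
        offset += length;
        size -= length;
    }
    return chunks;
}
```

For example, a 1024-byte write starting at offset 4 would be split into a 1020-byte piece at offset 4 and a 4-byte piece at offset 1024. Each piece could then be passed to a routine such as the WriteMemory function posted earlier; pieces stay 4-byte aligned as long as the original offset and size are.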

    Graduate
    April 17, 2024

    @Alex - APMemory Does this mean I cannot do operations like memcpy for sizes greater than 1 KB?

    How can I handle sizes greater than 1 KB in memory-mapped mode? I would appreciate it if you could elaborate.

     

    -Rikesh

    Graduate II
    April 17, 2024

    I'm linking to this other thread for completeness, and honestly to find / navigate later.  https://community.st.com/t5/stm32-mcus-products/stm32u585-memory-mapped-read-write-on-b-u585i-iot02a/td-p/663439

    Explorer
    April 18, 2024

    I'm sorry I can't study the circumstances in detail. My understanding is:

    • You've memory-mapped the APS6408L PSRAM. That places it in an internal address space of the STM32U5. Your code may access any address of the APS6408L PSRAM via that space. For each transfer on the OSPI bus, the address phase value will be the offset of the data from the base of that space, which will be its address in the APS6408L PSRAM.
    • If D-cache is disabled, when the CPU accesses that internal address space, the STM32U5's CPU will stall (wait) while the STM32U5 completes its transfer to/from the APS6408L PSRAM.
    • If D-cache is enabled, when the CPU accesses that address space and finds the data isn't in cache, the STM32U5 will first read a cache line of data containing the addressed data from the APS6408L PSRAM into its cache and perform the CPU's access against that, and the line will remain in the cache. If the access was a write and it changed the data in cache, the cache line will remember it's been changed. When the CPU next accesses somewhere in the internal address space that's uncached and that cache line must be freed for it, the CPU will stall for slightly longer while the line is first flushed to the APS6408L PSRAM and the new cache line is then read as described above. If your code manually flushes the cache, the STM32U5 does this for each cache line within the addresses being flushed whose data has been changed. There are several other cache circumstances that behave much like these, and you can find the details on ARM's website or in ST's applicable programmer's manual. A cache line is 32 bytes. Each cache line address is 32-byte aligned.
    • For most code, running with D-cache enabled will be faster.
    • Neither the CPU nor any STM32U5 bus can perform an access at an address that isn't divisible by its access size. So memory-mapped mode should never care about row-boundary crossings because they can't occur.

    Is your enquiry only technical curiosity? If you've observed any errors with memory-mapped mode, you should probably check whether the OCTOSPI's configured to wait long enough for the erase/program to complete.
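
The alignment argument in the last bullet can be checked with a little arithmetic. The sketch below uses a hypothetical helper, `crosses_row`, and assumes a 1024-byte row as quoted earlier in the thread. It only covers a single naturally aligned bus access; it says nothing about whether the OCTOSPI merges consecutive accesses into one longer burst.

```cpp
#include <cstdint>

// True if an access of access_size bytes starting at addr touches two different rows.
constexpr bool crosses_row(std::uint32_t addr,
                           std::uint32_t access_size,
                           std::uint32_t row_size)
{
    return (addr / row_size) != ((addr + access_size - 1U) / row_size);
}

// Naturally aligned accesses (addr divisible by the access size) stay inside one row.
static_assert(!crosses_row(1020U, 4U, 1024U), "aligned word at the end of a row");
static_assert(!crosses_row(1022U, 2U, 1024U), "aligned half-word at the end of a row");
static_assert(!crosses_row(1023U, 1U, 1024U), "single byte at the end of a row");

// A hypothetical unaligned word at the same place would span two rows.
static_assert(crosses_row(1022U, 4U, 1024U), "unaligned word spans rows 0 and 1");
```

Because the row size is a multiple of every access size used here (1, 2 or 4 bytes), an access that starts on a multiple of its own size always ends within the row it started in.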

    Not expert. If I've said anything incorrect, please advise. Thanks.

    Graduate
    April 18, 2024

    @Tesla DeLorean @Ilex , alister and I had some discussion on this topic. Do you have any thoughts on this?

    Explorer
    April 18, 2024

    The RM says both

    • "memory-mapped mode: the external memory is memory mapped and it is seen by the
      system as if it was an internal memory, supporting both read and write operations." and
    • "It is not recommended to program the flash memory using memory-mapped writes, as the internal flags for erase or programming status have to be polled. The indirect-write mode fulfills this operation, possibly in conjunction with the automatic status-polling mode."

    I'm describing stuff I'm unfamiliar with. To do writes in memory-mapped mode, I expect that before changing any data in a page on the APS6408L PSRAM you might:

    • save any of the page you're not modifying to RAM somewhere
    • invalidate the cache of the page in the internal address space
    • switch the OCTOSPI to status-polling mode, erase the page and either wait a safe time or use status-polling mode
    • switch the OCTOSPI back to memory-mapped mode
    • copy back into the internal address space anything of the page you'd saved plus the data you're changing
    • flush the page in cache as it's all changed

    Wear-levelling, and preventing the loss of that page (its state prior to the change) if there's an unexpected power loss during the erase/program, are outside the scope of this post.

    Graduate
    April 18, 2024

    @alister , I am working with a PSRAM, so I believe no erase is necessary here. My initial assumption was that with memory-mapped mode, I would not need to worry about row boundary crossing. However, I am unsure about it.

    Explorer
    April 18, 2024

    Duh. Of course!

    Explorer
    April 18, 2024

    As I'd said earlier, neither the CPU nor any STM32U5 bus can perform an access at an address that isn't divisible by its access size. So memory-mapped mode should never care about row-boundary crossings because they can't occur.