Explorer II
January 23, 2024
Solved

STM32MP157 GPIO access

Hi team,

I have an STM32MP157 board and am using its Cortex-M4 core. My task is simple: toggle a GPIO at maximum speed with consistent timing, so that the pulse width does not vary.

The toggle timing we require is approximately 15 ns between the rising and falling edges. Using the Keil IDE, we can achieve this on the Nucleo F746ZG board, which runs at about the same clock speed as the Cortex-M4 in the STM32MP157.

I generated the code with STM32CubeIDE, but the toggle time we get on the STM32MP157 Cortex-M4 is slow, around 60 ns. The code used to toggle the GPIO in a while loop is as follows:

GPIOF->BSRR = LED_Pin << 16;
GPIOF->BSRR = LED_Pin;

I appreciate any assistance in improving the toggle speed for the STM32MP157 Cortex-M4.

Thank you.
Kundan Jha

    This topic has been closed for replies.
    Best answer by PatrickF

    4 replies

    Technical Moderator
    January 23, 2024

    Hi @kundanJha ,

    The STM32F7 uses a Cortex-M7 with L1 cache, which is much more powerful than the Cortex-M4 present in the STM32MP15.

    On the STM32MP15, you might get slightly better performance by placing code in SRAM1 (@0x10000000) and data in RETRAM (@0x00000000) or SRAM2_SBUS (@0x30020000), but I fear 15 ns toggling is not achievable in software.

     

    What is the purpose of using 100% of the Cortex-M4 CPU to toggle a simple GPIO?

    It might be better to use a timer (TIM), which is designed for that and leaves the Cortex-M4 free for other tasks.

    Regards
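
As a rough illustration of the timer suggestion above, the sketch below computes how many timer ticks best approximate a requested edge spacing, and what spacing that tick count actually produces. The function names are hypothetical helpers, not part of any ST library, and the 209 MHz timer clock used in the note after the code is an assumption that should be checked against the real clock tree.

```c
#include <stdint.h>

/* Hypothetical helper: integer number of timer ticks closest to `ns`.
 * timer_hz is the timer kernel clock in Hz (assumed, not measured). */
static uint32_t ticks_for_ns(uint32_t timer_hz, uint32_t ns)
{
    /* round(ns * timer_hz / 1e9) using 64-bit integer math */
    uint64_t num = (uint64_t)ns * timer_hz + 500000000ull;
    return (uint32_t)(num / 1000000000ull);
}

/* Hypothetical helper: interval produced by `ticks` ticks, in picoseconds
 * (truncated, not rounded). */
static uint32_t ticks_to_ps(uint32_t timer_hz, uint32_t ticks)
{
    return (uint32_t)(((uint64_t)ticks * 1000000000000ull) / timer_hz);
}
```

Under the 209 MHz assumption, a 15 ns request maps to 3 ticks, or roughly 14.35 ns per edge. For a symmetric square wave in PWM mode 1 that would suggest a compare value of 3 and an auto-reload value of 5 (a 6-tick period), but treat those numbers as a starting point to verify against the reference manual.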

     

    Technical Moderator
    January 23, 2024

    I don't know if it helps, as the code is probably already well optimized by the compiler, but you could try this:

    GPIOF->BRR = LED_Pin;
    GPIOF->BSRR = LED_Pin;
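
For readers comparing the two variants, here is a small host-side model of how the GPIO output register reacts to BSRR and BRR writes. It is only a sketch of the register semantics (low half of BSRR sets, high half resets, set has priority), not driver code, and the function names are made up for illustration.

```c
#include <stdint.h>

/* Host-side model of the GPIO output data register (ODR).
 * It only mimics how BSRR and BRR writes are decoded. */

/* BSRR: bits 0..15 set pins, bits 16..31 reset pins; set wins on conflict. */
static uint16_t apply_bsrr(uint16_t odr, uint32_t bsrr)
{
    uint16_t set_mask   = (uint16_t)(bsrr & 0xFFFFu);
    uint16_t reset_mask = (uint16_t)(bsrr >> 16);
    return (uint16_t)((odr & (uint16_t)~reset_mask) | set_mask);
}

/* BRR: bits 0..15 reset pins. */
static uint16_t apply_brr(uint16_t odr, uint32_t brr)
{
    return (uint16_t)(odr & (uint16_t)~(brr & 0xFFFFu));
}
```

In this model, writing the pin mask to BRR has the same effect as writing the mask shifted left by 16 to BSRR. Both are a single 32-bit store to a peripheral register, so the two variants would be expected to take about the same time.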

    kundanJha (Author)
    Explorer II
    January 23, 2024

    Hi, I have tried this, but nothing has changed. The output remains the same.

    kundanJha (Author)
    Explorer II
    January 23, 2024

    What is the purpose of using 100% of the Cortex-M4 CPU to toggle a simple GPIO?

    Ans: We are developing a device that has to drive multiple signals at the same time to get a proper result. I am sharing a screenshot of our signals; please see the attached file.

    kundanJha (Author)
    Explorer II
    January 29, 2024

    Hi  @PatrickF ,

    I'm looking for a board with a Cortex-M7 processor and multi-core support. The signal requirements are provided in the attached file ("printer_signal.png").

    Thanks and regards

    Kundan Jha

    Technical Moderator
    January 29, 2024

    Hi @kundanJha 

    There is no Cortex-A+Cortex-M7 product, but you could have a look at:

    - STM32H7 series : Cortex-M7+Cortex-M4 https://www.st.com/en/microcontrollers-microprocessors/stm32h7-series.html

    - STM32MP25 series: Cortex-A35(Linux)+Cortex-M33 (sampling now, available in second half of 2024): STM32MP2 MPU series 64-bit microprocessors with neural processing unit

    The Cortex-M33 is not as powerful as the Cortex-M7, but compared to the Cortex-M4 in the STM32MP15, it runs in the STM32MP25 at twice the frequency and has instruction and data caches (so it could probably achieve the same or faster GPIO toggling than the STM32F7).

    Anyway, I guess that the GPIO sequence you mention could certainly be achieved with DMA on the existing STM32MP15 Cortex-M4 product (but you would probably need to rework your SW concept).

    Regards

    PatrickF (Answer)
    Technical Moderator
    January 23, 2024

    Apart from code optimization (though the Cortex-M4 will definitely not match a Cortex-M7 with caches), for GPIO bit-banging (quite unusual), instead of doing it with the CPU only, you might use DMA to write a prepared data table from SRAMx to GPIOF->BSRR. That could run fast, but it needs a little preparation time for the CPU to build the table and program the DMA.

    Regards.
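
To make the "prepared data table" idea more concrete, here is a minimal sketch of how such a table could be built on the CPU side. The helper name and the `levels` input format are assumptions made for illustration; the DMA and trigger configuration is deliberately not shown because it depends on the controller and request source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill `table` with one BSRR word per output sample.
 * levels[i] != 0 means "pin high" for sample i, 0 means "pin low".
 * pin_mask is the GPIO pin bit (for example the LED_Pin value). */
static void build_bsrr_table(uint32_t *table, const uint8_t *levels,
                             size_t count, uint16_t pin_mask)
{
    for (size_t i = 0; i < count; ++i) {
        table[i] = levels[i]
                 ? (uint32_t)pin_mask            /* set:   bits 0..15  */
                 : ((uint32_t)pin_mask << 16);   /* reset: bits 16..31 */
    }
}
```

The DMA would then typically copy one 32-bit word per request from this table (source address incrementing) to the fixed address of GPIOF->BSRR, with the request rate setting the edge spacing. Several pins of the same port can share one table by OR-ing their set and reset bits into each word.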

    Super User
    January 29, 2024

    Hi,

    Just a question: have you enabled the optimizer? In my simple speed tests on an F4 core (F411 at 100 MHz), I got a 60 ns pin toggle at -O0, but 10 ns with -O2.

    And on the H563 (M33 core at 250 MHz), 4 ns. :)

    See:

    https://community.st.com/t5/stm32-mcus-products/which-is-best-stm32-mcu-for-fast-driving-of-gpio/m-p/632945/highlight/true#M233793

    Also, the "better way" to get a pattern onto the port pins is to use DMA, as @PatrickF mentioned.
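
One way to compare the timings quoted in this thread is to convert them into CPU cycles per edge. The helper below is a hypothetical sketch that does this in tenths of a cycle with integer math; the 209 MHz figure mentioned afterwards for the STM32MP15 Cortex-M4 is an assumption, not a value stated in the thread.

```c
#include <stdint.h>

/* Hypothetical helper: CPU cycles available in `ns` nanoseconds,
 * returned in tenths of a cycle (truncated). */
static uint32_t cycles_x10(uint32_t cpu_hz, uint32_t ns)
{
    /* cycles = cpu_hz * ns / 1e9, so tenths of a cycle = cpu_hz * ns / 1e8 */
    return (uint32_t)(((uint64_t)cpu_hz * ns) / 100000000ull);
}
```

With these numbers, 60 ns at 100 MHz is 6 cycles and 10 ns is 1 cycle, while 4 ns at 250 MHz is also 1 cycle. If the Cortex-M4 in the STM32MP15 runs at about 209 MHz, a 15 ns edge spacing leaves roughly 3 cycles per store, and the observed 60 ns corresponds to about 12.5 cycles.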