Visitor II
January 9, 2025
Question

How to use the ART Accelerator on STM32F7xx


Hello,

We are trying to make use of the ART accelerator of the STM32F7.
As a start, we placed only a single part of the code in the address range starting at 0x00200000.
We also enabled the ART accelerator and code prefetch in CubeMX.
However, when we start the code with the debugger, it does not even reach the main(void) function. Stepping from the reset entry, the call to SystemInit works, but the branch to __main immediately results in a hard fault.
This is strange because this part of the code still resides at the standard flash address of 0x08000000. Single-stepping was not possible even in the disassembler window. (Normally you can see the scatter loader working here …)
The fault no longer occurs once we remove the code at 0x00200000 from the scatter file.
So just a simple question:
Could you please provide an example for any STM32 CPU that shows how to use the ART accelerator?

Thank you very much for your help

Andreas

    This topic has been closed for replies.

    1 reply

    Technical Moderator
    January 9, 2025

    Hello,

    The issue is not very clear, but you can refer to the X-CUBE-32F7PERF package that accompanies application note AN4667 "STM32F7 Series system architecture and performance".

    See the projects 1 FlashITCM-RAM_DTCM and FlashITCM-RAM_SRAM1. There, the code executes from the FlashITCM, whose address range starts at 0x00200000.

    The ART is enabled in HAL_Init():

    HAL_StatusTypeDef HAL_Init(void)
    {
     /* Configure Flash prefetch and Instruction cache through ART accelerator */ 
    #if (ART_ACCLERATOR_ENABLE != 0)
     __HAL_FLASH_ART_ENABLE();
    #endif /* ART_ACCLERATOR_ENABLE */
    
     /* Set Interrupt Group Priority */
     HAL_NVIC_SetPriorityGrouping(NVIC_PRIORITYGROUP_4);
    
     /* Use systick as time base source and configure 1ms tick (default clock after Reset is HSI) */
     HAL_InitTick(TICK_INT_PRIORITY);
     
     /* Init the low level hardware */
     HAL_MspInit();
     
     /* Return function status */
     return HAL_OK;
    }

    Note that this is an old application note. STM32CubeIDE was not available at that time; only IAR, Keil and System Workbench (Eclipse-based) were used.

    If you are executing from the FlashITCM, you need to enable the ART to increase performance. Enabling or disabling the ART has no impact on execution from the FlashAXI at 0x08000000.
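As a rough illustration of what enabling the ART and prefetch amounts to at register level, here is a minimal sketch. It assumes the FLASH_ACR bit positions from the STM32F7 CMSIS device header (PRFTEN at bit 8, ARTEN at bit 9); please verify them against the reference manual for your part. The helper names `art_enable` and `art_is_enabled` are hypothetical, and the register is passed as a pointer so the logic can also be checked on a host.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed FLASH_ACR bit masks (check your device header / reference manual). */
#define ACR_PRFTEN  (1u << 8)   /* prefetch enable */
#define ACR_ARTEN   (1u << 9)   /* ART accelerator enable */

/* Hypothetical helper: set the ART and prefetch enable bits.
 * All other bits (for example the latency field) are left untouched.
 * On target, the argument would be &FLASH->ACR. */
static inline void art_enable(volatile uint32_t *acr)
{
    *acr |= ACR_ARTEN | ACR_PRFTEN;
}

/* Hypothetical helper: report whether the ART enable bit is set. */
static inline bool art_is_enabled(uint32_t acr)
{
    return (acr & ACR_ARTEN) != 0u;
}
```

In a HAL-based project the same effect should come from the existing macros `__HAL_FLASH_ART_ENABLE()` and `__HAL_FLASH_PREFETCH_BUFFER_ENABLE()`, so a helper like this is only useful for understanding or for checking the register in the debugger.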

    andywild2 (Author)
    Visitor II
    January 9, 2025

    Hello SofLit,

    Thank you for the X-CUBE-32F7PERF example.

    I saw in the example that the scatter file locates ALL of the flash code at addresses starting from 0x00200000, which is the TCM Flash.
    The loader also loaded it to address 0x00200000. But how is this possible? According to the reference manual, Flash writing is only possible in the 0x08000000 domain.

    Is the loader internally using 0x08000000 and just pretending to use 0x00200000?

    Anyhow, in this example the debugger reaches _main().

    I tried the same scatter file in my project, but unfortunately, even with ART + Prefetch enabled, the TCM Flash is slower than FlashAXI with the instruction cache. So I do not understand the purpose of the ART at all...

    Thank you very much for your kind answer

    Andreas

    Technical Moderator
    January 9, 2025

    There are two addresses for the Flash: 0x00200000 over ITCM and 0x08000000 over AXI.

    Please read AN4667 "STM32F7 Series system architecture and performance", especially section 1.5.1 "Embedded Flash memory".

    It is the same physical Flash, but it can be accessed over two address ranges.
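Since both ranges map the same physical Flash, an address in one range has a counterpart at a fixed offset in the other. The sketch below shows that arithmetic only; it says nothing about what a particular flash loader does internally. The function names and the `flash_size` parameter are hypothetical, and the caller must supply the actual Flash size of the device (1 MB is used in the usage note purely as an example).

```c
#include <stdint.h>
#include <stdbool.h>

#define FLASH_AXI_BASE   0x08000000u  /* Flash seen over the AXI interface  */
#define FLASH_ITCM_BASE  0x00200000u  /* same Flash seen over the ITCM bus */

/* Convert an AXI-range Flash address to its ITCM alias.
 * Returns false if the address is outside [FLASH_AXI_BASE, +flash_size). */
static inline bool flash_axi_to_itcm(uint32_t axi_addr, uint32_t flash_size,
                                     uint32_t *itcm_addr)
{
    if (axi_addr < FLASH_AXI_BASE || (axi_addr - FLASH_AXI_BASE) >= flash_size) {
        return false;
    }
    *itcm_addr = (axi_addr - FLASH_AXI_BASE) + FLASH_ITCM_BASE;
    return true;
}

/* Convert an ITCM-range Flash address to its AXI alias.
 * Returns false if the address is outside [FLASH_ITCM_BASE, +flash_size). */
static inline bool flash_itcm_to_axi(uint32_t itcm_addr, uint32_t flash_size,
                                     uint32_t *axi_addr)
{
    if (itcm_addr < FLASH_ITCM_BASE || (itcm_addr - FLASH_ITCM_BASE) >= flash_size) {
        return false;
    }
    *axi_addr = (itcm_addr - FLASH_ITCM_BASE) + FLASH_AXI_BASE;
    return true;
}
```

For example, with an assumed 1 MB Flash (`flash_size = 0x100000`), the AXI address 0x08001234 corresponds to the ITCM address 0x00201234.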

     


    @andywild2 wrote:

    I tried the same scatter file in my project, but unfortunately, even with ART + Prefetch enabled, the TCM Flash is slower than FlashAXI with the instruction cache. So I do not understand the purpose of the ART at all...

    Thank you very much for your kind answer

    Andreas


    Again, please read AN4667: it describes the product architecture and provides performance results obtained with X-CUBE-32F7PERF.