Visitor II
April 3, 2025
Solved

Execution delay different for NUCLEO-F207 and NUCLEO-F767 boards for the same set of lines

  • 3 replies

Hi, I wrote a standard discrete Fourier transform (DFT) routine (the code itself is not the issue) on a NUCLEO-F207 board using Zephyr RTOS. I built the same program for the NUCLEO-F767 board and flashed it.

The time taken to execute the DFT code on the NUCLEO-F207 was 8700 microseconds.

The time taken to execute the same code on the NUCLEO-F767 is 65000 microseconds.

The FPU (floating-point unit) is enabled for the NUCLEO-F767 board.

Hardware and software are completely the same for both boards.

How is this happening? What is the potential cause?

    Best answer by mƎALLEm

    Hello,

    I'm not familiar with the Zephyr environment.

    But you need to refer to our documentation, especially AN4667, "STM32F7 Series system architecture and performance".

    In the HAL examples, the data and instruction caches are enabled like this:

    static void CPU_CACHE_Enable(void)
    {
     /* Enable I-Cache */
     SCB_EnableICache();
    
     /* Enable D-Cache */
     SCB_EnableDCache();
    }

    The ART accelerator is enabled in HAL_Init():

    HAL_StatusTypeDef HAL_Init(void)
    {
     /* Configure Instruction cache through ART accelerator */ 
    #if (ART_ACCELERATOR_ENABLE != 0)
     __HAL_FLASH_ART_ENABLE();
    #endif /* ART_ACCELERATOR_ENABLE */

    So you need to check where the Zephyr environment enables each of these features (I-cache, D-cache and the ART accelerator).
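One way to confirm what the Zephyr build actually enabled is to read the relevant control registers at run time instead of searching the startup code. The following is a minimal sketch, not Zephyr or HAL code: `decode_accel_state` and `struct accel_state` are hypothetical names introduced here, and the bit positions are assumptions taken from the Cortex-M7 `SCB->CCR` layout (DC = bit 16, IC = bit 17) and the STM32F7 `FLASH->ACR` layout (ARTEN = bit 9), so check them against the reference manual for your part.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions: Cortex-M7 SCB->CCR and STM32F7 FLASH->ACR. */
#define CCR_DC_BIT          (1UL << 16) /* D-cache enable */
#define CCR_IC_BIT          (1UL << 17) /* I-cache enable */
#define FLASH_ACR_ARTEN_BIT (1UL << 9)  /* ART accelerator enable */

/* Hypothetical helper type: which accelerators are switched on. */
struct accel_state {
    bool icache;
    bool dcache;
    bool art;
};

/* Decode raw register values into flags. Taking the values as
 * parameters keeps the function free of hardware accesses. */
struct accel_state decode_accel_state(uint32_t ccr, uint32_t flash_acr)
{
    struct accel_state s = {
        .icache = (ccr & CCR_IC_BIT) != 0,
        .dcache = (ccr & CCR_DC_BIT) != 0,
        .art    = (flash_acr & FLASH_ACR_ARTEN_BIT) != 0,
    };
    return s;
}
```

On the target, the call would be `decode_accel_state(SCB->CCR, FLASH->ACR)`, assuming the CMSIS device header for the F767 is included, with the three flags printed next to the timing result.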


    Super User
    April 3, 2025

    @Shubhamkeshari wrote:

    Hardware and software are completely the same for both boards.


    Hardly: they are different MCUs, a Cortex-M3 and a Cortex-M7!

    Are you sure that the optimisation level is the same on both?

    What about clocking?

     


    @Shubhamkeshari wrote:

    The FPU (floating-point unit) is enabled for the NUCLEO-F767 board.


    So what happens if you disable it?

    Visitor II
    April 3, 2025

    Are you sure that the optimisation level is the same on both?

    - Yes, the same optimization level is used on both.

     

    What about clocking?

    - NUCLEO-F767

    &clk_hse {
        hse-bypass;
        clock-frequency = <DT_FREQ_M(8)>; /* STLink 8MHz clock */
        status = "okay";
    };

    &pll {
        div-m = <4>;
        mul-n = <216>;
        div-p = <2>;
        div-q = <9>;
        clocks = <&clk_hse>;
        status = "okay";
    };

    &rcc {
        clocks = <&pll>;
        clock-frequency = <DT_FREQ_M(216)>;
        ahb-prescaler = <1>;
        apb1-prescaler = <4>;
        apb2-prescaler = <2>;
    };

    - NUCLEO-F207

    &clk_hse {
        hse-bypass;
        clock-frequency = <DT_FREQ_M(8)>; /* STLink 8MHz clock */
        status = "okay";
    };

    &pll {
        div-m = <8>;
        mul-n = <240>;
        div-p = <2>;
        div-q = <5>;
        clocks = <&clk_hse>;
        status = "okay";
    };

    &rcc {
        clocks = <&pll>;
        clock-frequency = <DT_FREQ_M(120)>;
        ahb-prescaler = <1>;
        apb1-prescaler = <4>;
        apb2-prescaler = <2>;
    };

     

    If I disable the floating-point unit on the NUCLEO-F767, the execution time increases.
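The PLL values quoted in this post can be checked against the `clock-frequency` entries with a little arithmetic. The sketch below assumes the usual STM32 main-PLL relation, SYSCLK = HSE / div-m × mul-n / div-p, and uses a hypothetical helper name, `pll_sysclk_hz`; it is not part of Zephyr or the HAL.

```c
#include <stdint.h>

/* SYSCLK produced by the main PLL, assuming
 * f_sysclk = f_hse / div_m * mul_n / div_p. */
uint32_t pll_sysclk_hz(uint32_t hse_hz, uint32_t div_m,
                       uint32_t mul_n, uint32_t div_p)
{
    /* 64-bit intermediate so the VCO frequency cannot overflow. */
    uint64_t vco_hz = (uint64_t)hse_hz / div_m * mul_n;
    return (uint32_t)(vco_hz / div_p);
}
```

With the 8 MHz ST-Link clock, `pll_sysclk_hz(8000000, 4, 216, 2)` gives 216 MHz for the F767 settings and `pll_sysclk_hz(8000000, 8, 240, 2)` gives 120 MHz for the F207 settings, so both devicetree fragments are self-consistent. Whether the cores really run at those rates still has to be confirmed on the hardware.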

    Technical Moderator
    April 3, 2025

    Hello,

    As you are measuring the execution duration in units of time, you need to check the system frequency first.

    Did you enable the cache/ART accelerator on the F7 product?
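Because the two boards run at different clock rates, it can help to compare core cycles instead of microseconds. This is a rough sketch that assumes the cores really run at the 120 MHz and 216 MHz configured in the devicetree; `elapsed_cycles` is a hypothetical helper name.

```c
#include <stdint.h>

/* Convert a measured duration to core clock cycles:
 * microseconds * (cycles per microsecond, i.e. the clock in MHz). */
uint64_t elapsed_cycles(uint32_t elapsed_us, uint32_t sysclk_mhz)
{
    return (uint64_t)elapsed_us * sysclk_mhz;
}
```

Under that assumption the reported timings correspond to `elapsed_cycles(8700, 120)` = 1,044,000 cycles on the F207 and `elapsed_cycles(65000, 216)` = 14,040,000 cycles on the F767, roughly 13 times as many cycles for the same code. A gap that large points at memory access or configuration, not at the clock setting alone.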

    Visitor II
    April 3, 2025

    No, I have not enabled the cache/ART accelerator. How do I enable it?

    Technical Moderator
    April 3, 2025

    What environment are you using? It seems it's not an ST environment!

    Visitor II
    April 3, 2025
    /*NUCLEO-F207*/
    &clk_hse {
    	hse-bypass;
    	clock-frequency = <DT_FREQ_M(8)>; /* STLink 8MHz clock */
    	status = "okay";
    };
    
    &pll {
    	div-m = <8>;
    	mul-n = <240>;
    	div-p = <2>;
    	div-q = <5>;
    	clocks = <&clk_hse>;
    	status = "okay";
    };
    
    &rcc {
    	clocks = <&pll>;
    	clock-frequency = <DT_FREQ_M(120)>;
    	ahb-prescaler = <1>;
    	apb1-prescaler = <4>;
    	apb2-prescaler = <2>;
    };
    
    /*NUCLEO-F767*/
    &clk_hse {
    	hse-bypass;
    	clock-frequency = <DT_FREQ_M(8)>; /* STLink 8MHz clock */
    	status = "okay";
    };
    
    &pll {
    	div-m = <4>;
    	mul-n = <216>;
    	div-p = <2>;
    	div-q = <9>;
    	clocks = <&clk_hse>;
    	status = "okay";
    };
    
    &rcc {
    	clocks = <&pll>;
    	clock-frequency = <DT_FREQ_M(216)>;
    	ahb-prescaler = <1>;
    	apb1-prescaler = <4>;
    	apb2-prescaler = <2>;
    };