STM32F767 Execution time is more compared with STM32F429

January 14, 2020
11 replies
2032 views

I created an OS task 100 times in the FreeRTOS example code generated by STM32CubeMX, on both the F429 and the F767, and observed the following.

F429 - 6 Ticks

F767 - 16 Ticks

Difference - 10 Ticks

What is the reason for the delay, and is there any way to speed it up?

Uwe Bonnes

Graduate II

Different number of wait states, Code in RAM? Show relevant parts!

RBG (Author)

Visitor II

Below is the code snippet used for testing both the F429 and F767 boards. The tick difference was measured across the highlighted part.

RBG (Author)

Visitor II

Even a simple malloc shows a tick difference between the F429 and the F767.

For 10000 memory allocations:

F429 - 68 ticks

F767 - 78 ticks

Tick difference - 10 ticks

Below is the relevant code.

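void StartDefaultTask(void const * argument)
{
  int *ptr;
  /* USER CODE BEGIN 5 */
  /* Infinite loop */
  for(;;)
  {
    /* Tick count before the allocations; the cast matches the format specifier */
    printf( "Tick_test_1:%lu\n", (unsigned long) xTaskGetTickCount() );
    for(long i=0;i<10000;i++)
    {
      /* Note: the block is never freed, so malloc() returns NULL once the heap is exhausted */
      ptr = (int*) malloc(5*sizeof(int));
    }
    /* Tick count after the allocations */
    printf( "Tick_test_2:%lu\n", (unsigned long) xTaskGetTickCount() );
    osDelay(1);
  }
}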

Piranha

Graduate II

Do you understand that the first printf(), and (I guess) the UART transmission underneath it, is included in your measurement? Also, xTaskCreate() and malloc() both use dynamic memory and are not deterministic in terms of either processing time or success of the result.

RBG (Author)

Visitor II

Yes, I tried another approach. Is this a better method to check the performance?

I incremented a variable for the duration of one tick; the results are below.

F429 - a=976

F767 - a=691

The F767 does not get through the code as many times as the F429 does within a given tick.

The only task running is this default task, and the code base is the default simple example code taken from STM32CubeMX.
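
Because the two boards may run at different core clocks, iterations per tick are easier to compare once they are converted to core cycles per iteration. A minimal sketch of that arithmetic follows. The helper name is made up for illustration, and the thread does not state the core clocks or the tick rate, so the inputs have to come from your own clock configuration and configTICK_RATE_HZ.

```c
#include <stdint.h>

/* Hypothetical helper: average core cycles spent per loop iteration,
 * given the core clock, the RTOS tick rate, and the number of
 * iterations counted during one tick. Returns 0 on invalid input. */
static double cycles_per_iteration(uint32_t core_hz, uint32_t tick_hz,
                                   uint32_t iterations)
{
    if (tick_hz == 0u || iterations == 0u) {
        return 0.0;
    }
    double cycles_per_tick = (double)core_hz / (double)tick_hz;
    return cycles_per_tick / (double)iterations;
}
```

For example, if the boards were running at 180 MHz and 216 MHz with a 1000 Hz tick (assumed values, not confirmed anywhere in this thread), the calls would be cycles_per_iteration(180000000u, 1000u, 976u) and cycles_per_iteration(216000000u, 1000u, 691u).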

Piranha

Graduate II

Disable all interrupts (__disable_irq()/__enable_irq()) and use DWT->CYCCNT for precise measurement.

How are clocks, PLL, buses, flash and cache configured?
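
A rough sketch of the measurement suggested above, assuming the CMSIS device headers are available on the target. ON_TARGET is a hypothetical build flag and the function names are illustrative, not part of any ST or CMSIS API. The unlock write to DWT->LAR is an assumption that applies to some Cortex-M7 parts; check it against your device before relying on it.

```c
#include <stdint.h>

/* Wraparound-safe difference between two 32-bit CYCCNT samples.
 * Valid as long as the measured region is shorter than 2^32 cycles. */
static uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;   /* unsigned arithmetic wraps modulo 2^32 */
}

#ifdef ON_TARGET          /* hypothetical flag: define it only in the MCU build */
#include "stm32f7xx.h"    /* use "stm32f4xx.h" for the F429 */

/* Enable the DWT cycle counter. */
static void cyccnt_init(void)
{
    CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk;  /* enable the trace blocks */
#if (__CORTEX_M == 7U)
    DWT->LAR = 0xC5ACCE55;  /* assumption: unlock needed on some Cortex-M7 parts */
#endif
    DWT->CYCCNT = 0;
    DWT->CTRL |= DWT_CTRL_CYCCNTENA_Msk;             /* start counting */
}

/* Run fn() with interrupts masked and return the cycles it took.
 * fn() must not block or depend on interrupts while it runs. */
static uint32_t measure_cycles(void (*fn)(void))
{
    __disable_irq();
    uint32_t start = DWT->CYCCNT;
    fn();
    uint32_t end = DWT->CYCCNT;
    __enable_irq();
    return cycles_elapsed(start, end);
}
#endif /* ON_TARGET */
```

Call cyccnt_init() once after the clocks are configured, then wrap the code under test in a function and pass it to measure_cycles(). Keeping printf() outside the measured region avoids counting the UART transmission.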

RBG (Author)

Visitor II

I tried attaching the complete code, but that is not allowed here. I am attaching snapshots of the main function and the system clock config functions instead.

Code is taken from STM32CubeMX V 4.24

Firmware package versions

F429 - STM32Cube_FW_F4_V1.9.0

F767 - STM32Cube_FW_F7_V1.15.0

Nothing else is changed in that example.

Results for the code below, with the variable a kept in Live Watch:

F429 - a=998

F767 - a=661

Piranha

Graduate II

Compare how HAL_Init() configures FLASH_ACR in both cases.
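
One way to make that comparison concrete is to read FLASH->ACR on each board after HAL_Init() and decode the fields. The sketch below only decodes a raw register value. The type and function names are hypothetical, and the bit positions are an assumption based on the reference manuals for these two families, so verify them for your exact device.

```c
#include <stdint.h>
#include <stdbool.h>

/* Fields assumed to sit at the same position in FLASH_ACR on both families. */
typedef struct {
    uint32_t latency;   /* flash wait states, bits [3:0] */
    bool prefetch;      /* PRFTEN, bit 8 */
    bool bit9;          /* ICEN on the F429, ARTEN on the F767 (assumed) */
} flash_acr_info;

/* Hypothetical helper: split a raw FLASH_ACR value into its fields. */
static flash_acr_info decode_flash_acr(uint32_t acr)
{
    flash_acr_info info;
    info.latency  = acr & 0xFu;
    info.prefetch = ((acr >> 8) & 1u) != 0u;
    info.bit9     = ((acr >> 9) & 1u) != 0u;
    return info;
}
```

On the target, pass FLASH->ACR from the device header and print or watch the result on both boards. A difference in latency means a different number of flash wait states.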

RBG (Author)

Visitor II

@Piranha @Uwe Bonnes

The major difference in HAL_Init() is the data cache, instruction cache, and prefetch configuration.

I tried the combinations and didn't find much difference.

F429 with cache and prefetch enabled - a=998

F429 with cache and prefetch disabled - a=997

F767 with cache and prefetch enabled - a=661

F767 with cache and prefetch disabled - a=661

Is the F767 slow because there is no data caching?

Hal_init comparison F767-F429

F429_Flash_register_status

F767_Flash_register_status

Danish1

Graduate

One reason the 'F7 can be slower is that it has a longer pipeline. On any branch (function call, if, goto, loop), any partly executed instructions in the pipeline have to be abandoned and the new instruction sequence has to be loaded. (Inlining a function call eliminates this.)

Why do this? Because that means the processor can be clocked at a higher frequency - if you choose to do so.

The F7 has an advantage that it can sometimes execute two instructions simultaneously, which the 'F4 cannot. This very much depends on the data dependencies between successive instructions, and it takes a clever compiler run at high optimisation-level to take full advantage of this.

What optimisation-level were you compiling at? F7 is likely to optimise better.

You will find examples where F4 wins over F7 in terms of cycle count. And you'll find examples where F7 wins.

Hope this helps,

Danish