Visitor II
February 16, 2021
Question

What's the correct way to write THUMB/ARM instructions mixed with LL Drivers for the STM32F334?


I'm writing a very time-critical application for the STM32F334, where I have to control 4 power converters (mostly 3; one is time-shared with another power converter). I recently found some ways to optimize the code, like this:

//*********** 1360ns ****************

LL_HRTIM_TIM_SetCompare3(HRTIM1, LL_HRTIM_TIMER_A, DEF_TMRA_STATIC_MAX_PWM_CTE+i_CalcDutyA);

LL_HRTIM_TIM_SetCompare1(HRTIM1, LL_HRTIM_TIMER_A, DEF_TMRA_STATIC_MAX_PWM_CTE-i_CalcDutyA);

LL_HRTIM_TIM_SetCompare3(HRTIM1, LL_HRTIM_TIMER_B, DEF_TMRA_STATIC_MAX_PWM_CTE+i_CalcDutyB);

LL_HRTIM_TIM_SetCompare1(HRTIM1, LL_HRTIM_TIMER_B, DEF_TMRA_STATIC_MAX_PWM_CTE-i_CalcDutyB);

//********* Optimized version **** 500ns or less

HRTIM1->sTimerxRegs[HRTIM_TIMERINDEX_TIMER_A].CMP3xR = DEF_TMRA_STATIC_MAX_PWM_CTE+i_CalcDutyA;

HRTIM1->sTimerxRegs[HRTIM_TIMERINDEX_TIMER_A].CMP1xR = DEF_TMRA_STATIC_MAX_PWM_CTE-i_CalcDutyA;

HRTIM1->sTimerxRegs[HRTIM_TIMERINDEX_TIMER_B].CMP3xR = DEF_TMRA_STATIC_MAX_PWM_CTE+i_CalcDutyB;

HRTIM1->sTimerxRegs[HRTIM_TIMERINDEX_TIMER_B].CMP1xR = DEF_TMRA_STATIC_MAX_PWM_CTE-i_CalcDutyB;

Now I'm looking for better/other optimization strategies (like writing Thumb/ARM assembly directly, as is already done on most, if not all, DSPs that I've worked with), so the question is: what is the correct way to mix it with C/LL code (C syntax)? Is there any documentation available?

(For example, a fastPID routine on this STM32F334 (72 MHz) takes about 1000 ns using 16-bit Q15 fixed-point arithmetic; on a dsPIC33 the same routine takes *only* 500 ns while running at a theoretically lower speed of 40 MIPS.)
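For readers unfamiliar with Q15, the kind of routine being timed can be sketched in portable C as below. This is only an illustration, not the fastPID routine that was measured: the struct, the function names and the simple clamped integrator are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical Q15 PID state; names are illustrative, not from any ST library. */
typedef struct {
    int16_t kp, ki, kd;   /* gains in Q15 (32768 represents 1.0) */
    int32_t integ;        /* running sum of errors, clamped to the Q15 range */
    int16_t prev_err;     /* previous error sample, Q15 */
} pid_q15_t;

/* Clamp a 64-bit intermediate value into [lo, hi]. */
static int32_t clamp_i64(int64_t x, int32_t lo, int32_t hi)
{
    if (x > hi) return hi;
    if (x < lo) return lo;
    return (int32_t)x;
}

/* One PID update: out = Kp*e + Ki*sum(e) + Kd*(e - e_prev), saturated to Q15. */
static int16_t pid_q15_step(pid_q15_t *s, int16_t err)
{
    /* Integrate with clamping so the accumulator cannot wind up. */
    s->integ = clamp_i64((int64_t)s->integ + err, INT16_MIN, INT16_MAX);

    int32_t diff = (int32_t)err - s->prev_err;
    s->prev_err = err;

    /* Q15 * Q15 products are Q30; accumulate in 64 bits to avoid overflow. */
    int64_t acc = (int64_t)s->kp * err
                + (int64_t)s->ki * s->integ
                + (int64_t)s->kd * diff;

    /* Scale Q30 back to Q15 (division keeps the rounding well defined),
       then saturate to 16 bits. */
    return (int16_t)clamp_i64(acc / 32768, INT16_MIN, INT16_MAX);
}
```

With Kp = 0.5 (16384) and an error of 0.25 (8192), one step yields 0.125 (4096). The 64-bit accumulator keeps the sketch simple and overflow-free; a version tuned for the Cortex-M4 would more likely rely on the saturating DSP instructions.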

Thanks for any help.

    This topic has been closed for replies.

    4 replies

    Explorer
    February 16, 2021

    Both the assembler syntax and the interface between C code and assembler routines (parameter handling) depend on the toolchain you are using.

    Neither is part of the C standard.

    LS. B.1 (Author)
    Visitor II
    February 16, 2021

    I'm using Ac6's SystemWorkbench / Eclipse IDE + CubeMX.

    In the future I'll migrate to STM32CubeIDE.

    LDSB.

    Super User
    February 16, 2021

    As @Ozone​ says, the syntax for inline/embedded assembler within 'C' code is entirely compiler dependent.

    However ARM do define the ABI (Application Binary Interface) - so that should be compatible across compliant toolchains.

    https://developer.arm.com/architectures/system-architectures/software-standards/abi

    The syntax, rules, restrictions, etc for inline/embedded assembler (even the name varies!) tend to be very arcane - therefore I would strongly suggest that you make a separately-built assembler module that you call from 'C'. Then at least the 'C' remains standard & portable.
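As a rough sketch of the "separate assembler module" approach, the C side only needs an ordinary prototype; the procedure call standard in the ABI then fixes where the arguments and the result live. The function name `sat_add32`, the file name `sat_add32.s` and the `USE_ASM_SAT_ADD32` macro are hypothetical, and the listing in the comment assumes the GNU assembler with a Cortex-M4 target selected on the command line (for example `-mcpu=cortex-m4`).

```c
#include <stdint.h>

/*
 * sat_add32.s (GNU assembler syntax), built as its own object file:
 *
 *         .syntax unified
 *         .thumb
 *         .text
 *         .global sat_add32
 *         .thumb_func
 *     sat_add32:
 *         qadd    r0, r0, r1      @ AAPCS: first two arguments arrive in r0, r1
 *         bx      lr              @ AAPCS: 32-bit result is returned in r0
 */

#if defined(__arm__) && defined(USE_ASM_SAT_ADD32)
/* Target build: the body comes from sat_add32.s at link time. */
int32_t sat_add32(int32_t a, int32_t b);
#else
/* Portable reference version, useful for host-side testing. */
int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}
#endif
```

The callers stay plain C either way, so the assembler file can be swapped in or out without touching them.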

    https://www.avrfreaks.net/comment/2800826#comment-2800826

    Explorer
    February 16, 2021

    Basically all toolchain suppliers stick to ARM's ABI, that is correct.

    The biggest problem is usually the core registers you use in the assembler code.

    As already noted, the syntax to convey your intended usage to the compiler is rather arcane.

    I personally have not done very much assembler coding, and I found it quite hard to beat the compiler on efficiency, at least at higher optimisation levels.

    LS. B.1 (Author)
    Visitor II
    February 16, 2021

    All right, I did find some interesting stuff (based on Andrew Neil's links):

    Releases · ARM-software/abi-aa · GitHub

    • Procedure Call Standard for the Arm Architecture (pdf, html)
    • Run-time ABI for the Arm Architecture (pdf, html)

    Writing a separate ".s" file for a very specific part of the code can sometimes be useful, however painful it is to do...

    So I'd like to use ST's/ARM's C-style syntax for this job, like this one here ... (asm functions located in cmsis_gcc.h)

    ...

     /* Derived coefficient A0 */

     S->A0 = __QADD16(__QADD16(S->Kp, S->Ki), S->Kd);

    ...
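For reference, `__QADD16` in cmsis_gcc.h wraps the Cortex-M4 `QADD16` instruction, which performs two independent signed saturating 16-bit additions on the halves of a 32-bit word. The portable model below is only a sketch of that behaviour for checking results on a host; `qadd16_model` and `sat16` are made-up names, and on the target the real intrinsic would be used instead.

```c
#include <stdint.h>

/* Saturate a 32-bit value to the signed 16-bit range. */
static int16_t sat16(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Hypothetical host-side model of QADD16: each 16-bit half of x is added
   to the matching half of y with signed saturation.
   The narrowing casts to int16_t assume the usual two's-complement
   wrap-around, as implemented by GCC and Clang. */
static uint32_t qadd16_model(uint32_t x, uint32_t y)
{
    int16_t lo = sat16((int32_t)(int16_t)(x & 0xFFFFu) + (int16_t)(y & 0xFFFFu));
    int16_t hi = sat16((int32_t)(int16_t)(x >> 16)     + (int16_t)(y >> 16));
    return ((uint32_t)(uint16_t)hi << 16) | (uint16_t)lo;
}
```

For example, adding 0x00010001 to 0x7FFF0001 saturates the upper half at 0x7FFF while the lower half becomes 0x0002.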

    Thanks for the help!

    LDSB

    Graduate II
    February 16, 2021

    Look at the code the compiler is generating.

    Make sure it is in-lining properly, and look for algorithmic shortcuts.
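One hedged illustration of the in-lining point: GCC does not inline ordinary `static inline` functions when optimization is off, so small accessor functions can end up as real calls in a debug build. The sketch below forces inlining with a GCC attribute. The register struct is a mock stand-in rather than the real HRTIM layout, and `FORCE_INLINE` and `set_compare_pair` are names invented for the example.

```c
#include <stdint.h>

/* Mock stand-in for a timer register block (NOT the real HRTIM layout). */
typedef struct {
    volatile uint32_t CMP1xR;
    volatile uint32_t CMP3xR;
} mock_timer_regs_t;

/* GCC/Clang: always_inline forces inlining even when optimization is off. */
#if defined(__GNUC__)
#define FORCE_INLINE static inline __attribute__((always_inline))
#else
#define FORCE_INLINE static inline
#endif

/* Write both compare registers symmetrically around a centre value. */
FORCE_INLINE void set_compare_pair(mock_timer_regs_t *t,
                                   uint32_t centre, uint32_t duty)
{
    t->CMP3xR = centre + duty;
    t->CMP1xR = centre - duty;
}
```

Whether the call really disappeared can be checked in the disassembly, for instance with `arm-none-eabi-objdump -d` on the built ELF file: the caller should contain only the two stores and no branch to the helper.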

    Doing embedded-inline assembler is always a bane to portability, and compiler developers keep changing the rules to accommodate themselves.

    Where possible get your critical code into a .s file where you can control the alignments, branching, unrolling, and literal pools.

    Compilers can handle complex register juggling and finding the queen, but when it comes to balancing critical code and algorithms they are pretty shallow.