Graduate II
May 12, 2025
Question

M0 in-line assembler moving from bootloader to main app

  • May 12, 2025
  • 11 replies
  • 1542 views

Folks,

I have a bootloader for an M0 which works without using interrupts. It works in that it loads the new app, the CRC in the new app matches and I then try and jump to 0x8004000.

In the past I have used a couple of different methods with M3/4 to boot to the app from bootloader dependent on what compiler/IDE I was using, but the M0 is slightly different due to how the vector table is not relocatable. So we've come up with a method to get round that and "all" I need to do is jump to 0x8004000.

I think I need something like this:

void jump_to_normal_application(void)
{

 u32 jump_address;

 jump_address = *(volatile u32*) (MAIN_PROG_START_ADDRESS + 4);
 jump_to_application = (pFunction) jump_address;

 asm("ldr sp, =0x20002000");

 jump_to_application();

}


I'm getting the error:

Error: lo register required -- `ldr sp,=0x20002000'

I've searched for this and, although I can find many hits, I can't see anything vaguely relevant to my situation.

Also, I would like the 0x20002000 in the asm line to come from a #define or be loaded from a C variable.


Can anyone help me with this please? __set_MSP is not available to me due to the, errr, "individual" setup I am forced to use.

What is the way to do this and are any of my assumptions above even vaguely correct? Am using gcc.
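For reference, a Cortex-M image starts with its vector table: word 0 is the initial stack pointer and word 1 (offset 4, the one read above) is the reset handler address. Below is a minimal host-runnable sketch of reading both words and range-checking them before a jump. The RAM and flash bounds and all helper names are assumptions for illustration, not values from the original setup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed memory map, for illustration only; adjust to the actual part. */
#define RAM_START  0x20000000u
#define RAM_END    0x20002000u  /* one past the last RAM byte (8 KB assumed) */
#define APP_START  0x08004000u  /* application base used in this thread */
#define FLASH_END  0x08010000u  /* one past the last flash byte (64 KB assumed) */

typedef struct {
    uint32_t initial_sp;     /* vector table word 0 */
    uint32_t reset_handler;  /* vector table word 1 */
} app_vectors_t;

/* Read the first two vector table words of an image. */
static app_vectors_t read_app_vectors(const volatile uint32_t *image)
{
    assert(image != NULL);
    app_vectors_t v = { image[0], image[1] };
    return v;
}

/* Range checks that reject, for example, erased flash (0xFFFFFFFF). */
static bool app_vectors_plausible(app_vectors_t v)
{
    bool sp_ok = v.initial_sp > RAM_START
              && v.initial_sp <= RAM_END
              && (v.initial_sp & 3u) == 0u;   /* word-aligned stack top */
    bool entry_ok = v.reset_handler >= APP_START
                 && v.reset_handler < FLASH_END;
    return sp_ok && entry_ok;
}
```

On the target, the image pointer would be something like (const volatile uint32_t *)MAIN_PROG_START_ADDRESS, and the stack top could then be taken from the application's own table instead of being hard-coded.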


    Technical Moderator
    May 20, 2025

    Hi @DiBosco ,

    In the article "How to share an API between a bootloader and an application", STM32G0 is used.

    I assume that you can follow the same steps, as F0 and G0 are based on Cortex-M0 and Cortex-M0+ respectively.

    Try it and let us know if it was helpful, or maybe it is missing something.

    -Amel

    Graduate II
    May 20, 2025

    You can't load SP directly due to the reduced ISA of the CM0

    You must load the value to R0, then MOV SP, R0

    On the F0 you need to copy the vector table to the base of RAM and remap that memory to the zero address space

    Graduate II
    May 20, 2025

    Look at the IAP examples

    You can set the SP in the receiving end Reset_Handler as it's not coming back.

     

    Graduate
    May 20, 2025

    My guess is that one more line could be enough:

     

    void jump_to_normal_application(void)
    {
    
     u32 jump_address;
    
     jump_address = *(volatile u32*) (MAIN_PROG_START_ADDRESS + 4);
     jump_to_application = (pFunction) jump_address;
    
     asm("ldr r0, =0x20002000");
     asm("mov sp, r0");
    
     jump_to_application();
    
    }

     

    Jumping to an arbitrary address, such as the start address of any procedure or function, is much easier from assembler than from C. It is also a bit easier if you create a file with a ".S" extension, where you can write without quote marks and asm directives; just look at the startupxxxx.s file in the generated project. I'm not sure how ldr r0, =0x20002000 behaves in inline assembly, so it may be better to write a dedicated procedure for it, like:

     

    // any file with ".S" extension in folder with sources
     .syntax unified
     .cpu cortex-m0 // or which one you used
     .fpu softvfp
     .thumb
    
     .thumb_func
     .global set_stack
     .type set_stack , %function
    
    
     .text
    //----------------
    
    //======================================================
    // set stack pointer procedure
    //====================================
     // INPUT:
     // R0 = new stack pointer value (first argument under the AAPCS calling convention)
    
     // OUTPUT: SP reinitialized
    //=======================================================
    //=========================
    set_stack:
     // or LDR R0, = 0x20002000
     MOV SP, R0 // set the stack
    //---
     BX LR // go back to invoked place
    //=============================================================

     

    This procedure should be invoked from C like:

    extern void set_stack( uint32_t address);
    
    set_stack(0x20002000);
    
    // now we can continue running C code, e.g. JumpToApplication(); and so on

     

    What I want to point out: some things are much, much easier to do in assembler than in C...

     

     

     

    DiBosco (Author)
    Graduate II
    May 20, 2025

    Thanks for all the sudden replies! :)

    We don't need to do any vector table remapping. We're doing a funky thing whereby we're using the bootloader vector table to simply jump to address + bootloader offset and not using interrupts in the bootloader.

    @Amel NASRI 

    That example was not helpful as it uses the __set_MSP function which, as I said, I do not have access to.

    @wegi01  I think I tried this:

     asm("ldr r0, =0x20002000");
     asm("mov sp, r0");

     

    But I shall try that again, just in case. Not sure I agree that having an extra .s file is easier, but at least that would make it easy to have a macro for the address rather than a magic number.
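On the macro point: GCC basic asm takes a string literal, so a #define can be spliced into it with the usual two-level stringification trick, without needing a separate .s file. A minimal sketch follows, assuming hypothetical macro names (APP_STACK_TOP, STR, SET_SP_ASM). The string is built at compile time, so it can be checked on any host.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical name; must be a plain token with no casts or suffixes,
 * because it is pasted into assembler source as text. */
#define APP_STACK_TOP 0x20002000

/* Two levels so APP_STACK_TOP is expanded before being stringified. */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* Adjacent string literals are concatenated by the compiler. */
#define SET_SP_ASM "ldr r0, =" STR(APP_STACK_TOP) "\n\t" "mov sp, r0"

/* On the target this would be used as: asm volatile (SET_SP_ASM); */

/* Host-side check that the macro expands to the intended text. */
static void check_set_sp_asm(void)
{
    assert(strcmp(SET_SP_ASM, "ldr r0, =0x20002000\n\tmov sp, r0") == 0);
}
```

Keeping both instructions in one asm statement stops the compiler from placing code between them. Basic asm cannot declare that r0 is clobbered, so this probably only suits a point where nothing else is live, such as just before leaving the bootloader.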

     

    Graduate II
    May 20, 2025

    Changing the SP mid-function will break auto/local variables on the stack.

    You won't get assembler errors/warnings about using lower order registers if you do it correctly. 

    Graduate
    May 20, 2025

    @Tesla DeLorean 

    "Changing the SP mid-function will break auto/local variables on the stack."

     

    That is definitely true. But I assume the user knows what he is doing, and I can see what he wants to do:
    he wants to run the other program, and before that he wants to reinitialize the stack pointer, which in this case is absolutely understandable. All variables on the stack go to hell anyway when he runs the other program.

     

    @DiBosco 

    I checked how it looks with the inline asm directive, and it appears to work correctly. The C code looks like:

     

    int main(void)
    {
    
     /* USER CODE BEGIN 1 */
    
     /* USER CODE END 1 */
    
     /* MCU Configuration--------------------------------------------------------*/
    
     /* Reset of all peripherals, Initializes the Flash interface and the Systick. */
     HAL_Init();
    
     /* USER CODE BEGIN Init */
    
     /* USER CODE END Init */
    
     /* Configure the system clock */
     SystemClock_Config();
    
     /* USER CODE BEGIN SysInit */
    
     /* USER CODE END SysInit */
    
     /* Initialize all configured peripherals */
     /* USER CODE BEGIN 2 */
     asm("ldr r0, =0x20002000");
     asm("mov sp, r0");
     /* USER CODE END 2 */
    
     /* Infinite loop */
     /* USER CODE BEGIN WHILE */
     while (1)
     {
     /* USER CODE END WHILE */
    
     /* USER CODE BEGIN 3 */
     }
     /* USER CODE END 3 */
    }

     

    After translation to assembler it looks OK:

     

     

    08000284 <main>:
     8000284:	b510 	push	{r4, lr}
     8000286:	f000 f881 	bl	800038c <HAL_Init>
     800028a:	f7ff ffc9 	bl	8000220 <SystemClock_Config>
     800028e:	4801 	ldr	r0, [pc, #4]	@ (8000294 <main+0x10>) // load the new stack pointer value into R0 from the literal pool at the end of the function
     8000290:	4685 	mov	sp, r0 // transfer R0 to SP
     8000292:	e7fe 	b.n	8000292 <main+0xe> // while(1): never-ending loop
     8000294:	20002000 	.word	0x20002000 // our new stack pointer value

     

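    Notice the word at 0x08000294: it is the literal pool entry that the assembler generates for ldr r0, =0x20002000, which is how a 32-bit constant like this gets loaded on the ARM architecture...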

     

     

     

     

     

    Super User
    May 20, 2025

    From CMSIS\Include\cmsis_gcc.h

    /**
     \brief Set Main Stack Pointer
     \details Assigns the given value to the Main Stack Pointer (MSP).
     \param [in] topOfMainStack Main Stack Pointer value to set
     */
    __STATIC_FORCEINLINE void __set_MSP(uint32_t topOfMainStack)
    {
     __ASM volatile ("MSR msp, %0" : : "r" (topOfMainStack) : );
    }

    where __STATIC_FORCEINLINE is __attribute__((always_inline)) static inline

    But the jump_to_application variable may be stored on the stack. What do you think will happen to it after change of the SP?
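One way around that concern, sketched under assumptions: pass both the new stack pointer and the entry address as register operands of a single extended asm statement, so that nothing has to be read from the old stack after SP changes. The function name and the host-side fallback (which only records its arguments so the sketch can be compiled and exercised off-target) are hypothetical. __ARM_ARCH_6M__ is the macro GCC predefines when targeting Cortex-M0/M0+.

```c
#include <assert.h>
#include <stdint.h>

#if !defined(__ARM_ARCH_6M__)
/* Host build only: arguments of the most recent call, for off-target checks. */
static uint32_t last_stack_top;
static uint32_t last_entry;
#endif

/* Switch to the application's stack and branch to its entry point.
 * On the target this does not return. */
static void jump_to_app(uint32_t stack_top, uint32_t entry)
{
    /* A BX target must have bit 0 set on a Thumb-only core. */
    assert((entry & 1u) != 0u);

#if defined(__ARM_ARCH_6M__)
    /* Both operands are already in registers when SP is written, so
     * nothing is fetched from the old stack frame afterwards. */
    __asm volatile (
        "mov sp, %0 \n\t"
        "bx  %1     \n\t"
        :
        : "r" (stack_top), "r" (entry)
        : "memory");
    for (;;) { }  /* not reached */
#else
    last_stack_top = stack_top;
    last_entry = entry;
#endif
}
```

On the target, a call might look like jump_to_app(APP_STACK_TOP, *(volatile uint32_t *)(MAIN_PROG_START_ADDRESS + 4)); where APP_STACK_TOP is a hypothetical #define. Because the operands are ordinary C expressions, either a macro or a variable works.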

     

    DiBosco (Author)
    Graduate II
    May 20, 2025

    I can confirm that I had already tried:

     

     
    static u32 jump_address;
    typedef void (*pFunction)(void);
    pFunction jump_to_application;
    
    void jump_to_normal_application(void)
    {
    
    jump_address = *(volatile u32*) (MAIN_PROG_START_ADDRESS + 4);
    
    jump_to_application = (pFunction) jump_address;
    
    
    asm("ldr r0, =0x20002000");
    
    asm("mov sp, r0");
    
    
    jump_to_application();
    
    
    }

     

    This does compile happily, but I just get a hard fault error as soon as I hit

    jump_to_application();

    I should have posted that I had tried a whole raft of other things after my original post, but simply forgot to come back here to post findings. At this point I'd just given up.

     

    Graduate II
    May 20, 2025

    >>I can confirm that I had already tried

    Then STOP trying to solve the "Problem" this way..

    The normative way of handling multiple firmware images on the F0 / CM0 is to copy an instance of the correctly built vector table (i.e. address-appropriate, bound/fixed by the linker) into the base of RAM at 0x20000000 and to use SYSCFG to remap RAM, rather than FLASH or ROM, to address 0x00000000, where the MCU loads the vectors from. This way interrupts all work properly.

    The startup.s code for the application portion starts with Reset_Handler doing the equivalent of

    Reset_Handler:
     ldr r0, =__initial_sp ; 0x20002000
     mov sp, r0
    ...
     LDR R0, =SystemInit
     BLX R0
     LDR R0, =__main
     BX R0

    STM32Cube_FW_F0_V1.10.1\Projects\STM32091C_EVAL\Applications\IAP

     /* Relocate by software the vector table to the internal SRAM at 0x20000000 ***/
    
     /* Copy the vector table from the Flash (mapped at the base of the application
     load address 0x08004000) to the base address of the SRAM at 0x20000000. */
     for(i = 0; i < 48; i++)
     {
     VectorTable[i] = *(__IO uint32_t*)(APPLICATION_ADDRESS + (i<<2));
     }
    
     /* Enable the SYSCFG peripheral clock*/
     __HAL_RCC_SYSCFG_CLK_ENABLE();
     /* Remap SRAM at 0x00000000 */
     __HAL_SYSCFG_REMAPMEMORY_SRAM();

     

    DiBosco (Author)
    Graduate II
    May 20, 2025

    Well, it's a way to do it.

    As I say, we're not using interrupts on this bootloader. At all.

     

    Graduate
    May 20, 2025

    @DiBosco 

     

    void jump_to_normal_application(void)
    {
    
     u32 jump_address;
    
     jump_address = *(volatile u32*) (MAIN_PROG_START_ADDRESS + 4);
     jump_to_application = (pFunction) jump_address;
    
     asm("ldr r0, =0x20002000");
     asm("mov sp, r0");
     asm("ldr r1, =jump_address");
     asm("ldr r0, [r1]");
     asm("push {r0}");
     asm("pop {pc}");
    
    // jump_to_application();
    }

    DiBosco (Author)
    Graduate II
    May 20, 2025

    @wegi01 

    Using

    asm("ldr r0, =0x20002000");
    
    asm("mov sp, r0");
    
    asm("ldr r1, =0x8004004");
    
    asm("ldr r0, [r1]");
    
    asm("push {r0}");
    asm("pop {pc}");


    has it compiling, and you can see that it then jumps to an address above 0x8004000. I need to get the second elf file loaded into the debugger to check what happens then, but that looks promising, thank you.

    It isn't happy with using "jump_address" in the assembly language, but I can live with that/use a .s file method you suggested. Many thanks.

    Graduate
    May 20, 2025

    If you load the value 0x08004004 as an immediate, you should do the same as before with the stack, so:

    asm("ldr r0, =0x20002000");
    asm("mov sp, r0");
    
    //asm("ldr r1, =0x8004004");
    // asm("ldr r0, [r1]"); not in this case !!
    
    asm("ldr r0, =0x8004004");
    asm("bx R0"); // without push/pop

     

    Kudos :D

    Graduate II
    May 20, 2025

    asm("ldr r0, =0x8004004");
    asm("bx R0"); // without push/pop

    That will Hard Fault for sure, even if there is code there rather than a jump table.

    EVEN addresses are assumed to be 32-bit ARM code, which the CMx MCUs cannot execute. The CM0 will also fault if you read unaligned data.
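To make the distinction concrete: 0x08004004 is the address of the reset vector slot, while the value stored in that slot is the actual entry point, and only the latter has bit 0 set as BX requires on a Thumb-only core. A small host-runnable sketch, with hypothetical helper names and a made-up handler address:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* BX/BLX targets on Cortex-M must have bit 0 set (Thumb state). */
static bool is_valid_bx_target(uint32_t addr)
{
    return (addr & 1u) != 0u;
}

/* Word loads (LDR) on Cortex-M0 must be 4-byte aligned. */
static bool is_word_aligned(uint32_t addr)
{
    return (addr & 3u) == 0u;
}

/* Fetch the entry point from the reset vector slot of an image
 * (word 1 of the vector table). */
static uint32_t entry_from_vector_table(const uint32_t *table)
{
    uint32_t entry = table[1];
    assert(is_valid_bx_target(entry));
    return entry;
}
```

So the slot address is a fine operand for a load (it is word-aligned) but not for BX; the loaded value is what gets branched to.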

    Entry points deeper into the image are probably best done in a consistent tabular form, so you're not making the loader aware of symbols in an unrelated image. And that you can add others along the way.

    DiBosco (Author)
    Graduate II
    May 20, 2025

    Again, I'll say that I am forced to use a very unusual setup, with a custom from-scratch makefile, custom headers, custom linker file, all created using emacs. I have to import the make file project into Cube so I can use a debugger with breakpoints and single stepping.

    I simply cannot use standard tools, I cannot follow the IAP examples because they use a number of things I just don't have access to.

    And yes, absolutely no interrupts. None, nada, zip, zero, zilch, rien; not even systick. All interrupts are off. 100% unused, dormant, unnecessary.

     

     

     

    Graduate
    May 20, 2025

    This is a super simple thing, starting the MCU from a required address in assembler; you tire yourself out much more by adding so many definitions and typecasts in C.

    It should work without problems, depending on what is at 0x08004004 (a pointer to the program to run, or the code itself), so you need to know beforehand how to treat it; I have shown both. Come back here and tell us "IT WORKS".