Skip to main content
Graduate
March 12, 2024
Solved

Help needed with STM32 ASM function

  • March 12, 2024
  • 2 replies
  • 1970 views

Hallo All,

I am currently stuck and need your help. The following function counts the no. of 1 in a 32 bit word on an STM32F407. Why does it fail for values which have their bit 31 set? Precisely, it ignores bit 31 and in these cases is off by one.

 .global countBits
.text
.syntax unified
countBits:
mov r3, r0
mov r2, #32
mov r0, #0
next:
lsls r3,r3,#1
it mi
addmi r0, #1
subs r2, #1
bne next
bx lr

I would be glad if anyone could point me to the error which obviously I do not see.

Thanks a lot
Herby

    This topic has been closed for replies.
    Best answer by STM_Thirty2
    lsls r3,r3,#1

     This instruction is shifting the value in R3 to the left by 1. If the 31st bit is set on the initial input number  then you are shifting it out and it is lost.

    2 replies

    ST Employee
    March 12, 2024
    lsls r3,r3,#1

     This instruction is shifting the value in R3 to the left by 1. If the 31st bit is set on the initial input number  then you are shifting it out and it is lost.

    Graduate II
    March 12, 2024

    You're using the sign bit, which is already in the high order bit at the first iteration

     .global countBits
     .text
     .syntax unified
    countBits:
     mov r3, r0
     mov r0, #0
     orrs r3, r3
    next:
     it mi
     addmi r0, #1
     lsls r3,r3,#1
     bne next
     bx lr

    There are other faster ways of doing a population count.

    popcount32:
    	mov.w	r1, #$55555555
    	and.w	r1, r1, r0, lsr #1
    	subs	r0, r0, r1
    	mov.w	r1, #$33333333
    	and.w	r1, r1, r0, lsr #2
    	and.w	r0, r0, #$33333333
    	add	r0, r1
    	add.w	r0, r0, r0, lsr #4
    	and.w	r0, r0, #$F0F0F0F
    	add.w	r0, r0, r0, lsr #8
    	add.w	r0, r0, r0, lsr #16
    	and.w	r0, r0, #$3F
    	bx	lr
    herbertAuthor
    Graduate
    March 13, 2024

    Thanks you for your input. Yes, I know and normally use the Gillies-Miller method for sideways addition. I was just drafting this other solution to make some time measuring experiments to show to my students how much faster sideways addition is.

    Herby

    Graduate II
    March 13, 2024

    If you know where the bits are going to lay you can change the shift direction so the looping method can "finish early". The loop counting wastes time in all situations.

    I'd probably have opted for the shift into carry and use ADC r0,#0 type method and avoided the conditional execution entirely.