STM32N657: how to measure execution time with cycle accuracy
The best I have achieved so far, using the DWT as well as the PMU, is accuracy to within a few cycles. My question is: how can I measure execution time so that inserting a single nop into the measured code changes the result by exactly 1 cycle (assuming interrupts and caches are disabled)?
Both functions below return 1, even though test_pmu1 has an extra nop between the two counter reads.
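    // test_pmu0: two back-to-back reads of the PMU cycle counter
    .global test_pmu0
    .type test_pmu0, %function
    .align 2
test_pmu0:
    isb sy                          // flush the pipeline
    dsb sy                          // wait for outstanding memory accesses
    ldr r3, .test_pmu0.pmu_base     // r3 = PMU base address
    ldr r0, [r3, #0x7C]             // first read of PMU->CCNTR
    ldr r1, [r3, #0x7C]             // second read of PMU->CCNTR
    sub r0, r1, r0                  // return second - first
    bx lr
    .align 2
.test_pmu0.pmu_base:
    .word 0xE0003000                // PMU base address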
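    // test_pmu1: same as test_pmu0, with one nop between the two reads
    .global test_pmu1
    .type test_pmu1, %function
    .align 2
test_pmu1:
    isb sy                          // flush the pipeline
    dsb sy                          // wait for outstanding memory accesses
    ldr r3, .test_pmu1.pmu_base     // r3 = PMU base address
    ldr r0, [r3, #0x7C]             // first read of PMU->CCNTR
    nop                             // the instruction under measurement
    ldr r1, [r3, #0x7C]             // second read of PMU->CCNTR
    sub r0, r1, r0                  // return second - first
    bx lr
    .align 2
.test_pmu1.pmu_base:
    .word 0xE0003000                // PMU base address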
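
One cross-check that might help narrow this down, offered only as a sketch: build several variants of the routine with 0, 1, 2, ... nops between the two reads and look at how the returned difference grows, rather than relying on a single 0-versus-1 comparison. If a lone nop is being absorbed by the pipeline (an assumption, not something verified here), the per-nop cost should still show up as the slope across several variants. The helper below is hypothetical and not part of any ST or CMSIS API; it just fits a least-squares line through (nop count, measured cycles) pairs collected from such variants.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: least-squares slope of measured cycles versus the
 * number of nops inserted between the two counter reads.
 *   nops[i]   - how many nops variant i contains
 *   cycles[i] - the counter difference variant i returned
 * Returns the slope in cycles per nop, or 0.0 if the slope is undefined
 * (fewer than two samples, or all nop counts identical). */
double cycles_per_nop(const uint32_t *nops, const uint32_t *cycles, size_t count)
{
    double sum_x = 0.0, sum_y = 0.0, sum_xx = 0.0, sum_xy = 0.0;

    for (size_t i = 0; i < count; i++) {
        double x = (double)nops[i];
        double y = (double)cycles[i];
        sum_x  += x;
        sum_y  += y;
        sum_xx += x * x;
        sum_xy += x * y;
    }

    double n = (double)count;
    double denom = n * sum_xx - sum_x * sum_x;
    if (count < 2 || denom == 0.0) {
        return 0.0;   /* not enough distinct points to fit a line */
    }
    return (n * sum_xy - sum_x * sum_y) / denom;
}
```

A slope close to 1.0 would suggest each additional nop does cost one cycle and the fixed offset comes from the counter reads themselves; a slope well below 1.0 would point at the core issuing nops together with neighbouring instructions.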
