I'm not sure exactly how you got it to generate the assembly code you show there.

webjorn wrote: Sat Sep 17, 2022 2:26 pm ...
------- C code -----------
for(k=0;k<maxdata;k++) { // fill array with alternating 0 and 1
if(( k & 1) == 0)
datarray[k] = 0x0;
else
datarray[k] = 0xffff;
}
t1 = micros(); // this does not serve any purpose now, it just generates a bl micros, which is easily found in the dump file
snapshot = *DWT_CYCCNT;
datptr = &datarray[0];
(*(uint16_t *) 0x40010c0c ) = *datptr++;
(*(uint16_t *) 0x40010c0c ) = *datptr++;
(*(uint16_t *) 0x40010c0c ) = *datptr++;
(*(uint16_t *) 0x40010c0c ) = *datptr++;
(*(uint16_t *) 0x40010c0c ) = *datptr++;
(*(uint16_t *) 0x40010c0c ) = *datptr++;
(*(uint16_t *) 0x40010c0c ) = *datptr++;
(*(uint16_t *) 0x40010c0c ) = *datptr++;
(*(uint16_t *) 0x40010c0c ) = *datptr++;
(*(uint16_t *) 0x40010c0c ) = *datptr++;
Serial.print("CYCCNT : ");
Serial.println(*DWT_CYCCNT - snapshot);
-----------------------------
This is a dump of the ELF of the project, taken with
.arduino15/packages/WeActStudio/tools/xpack-arm-none-eabi-gcc/11.2.1-1.2/bin/arm-none-eabi-objdump /tmp/arduino_build_865338/Blink-403a-out-1.ino.elf -d -S -l > dmp14.txt
/home/webjorn/Arduino/Blink-403a-out-1/Blink-403a-out-1.ino:158
snapshot = *DWT_CYCCNT;
80004b2: 4d2d ldr r5, [pc, #180] ; (8000568 <_Z4loopv+0x34c>)
/home/webjorn/Arduino/Blink-403a-out-1/Blink-403a-out-1.ino:157
t1 = micros();
80004b4: f000 f9de bl 8000874 <micros>
/home/webjorn/Arduino/Blink-403a-out-1/Blink-403a-out-1.ino:169
(*(uint16_t *) 0x40010c0c ) = *datptr++;
80004b8: 4b2e ldr r3, [pc, #184] ; (8000574 <_Z4loopv+0x358>)
/home/webjorn/Arduino/Blink-403a-out-1/Blink-403a-out-1.ino:158
snapshot = *DWT_CYCCNT;
80004ba: 686f ldr r7, [r5, #4]
/home/webjorn/Arduino/Blink-403a-out-1/Blink-403a-out-1.ino:169
(*(uint16_t *) 0x40010c0c ) = *datptr++;
80004bc: f8c6 3fa0 str.w r3, [r6, #4000] ; 0xfa0
80004c0: 4b1c ldr r3, [pc, #112] ; (8000534 <_Z4loopv+0x318>)
80004c2: 8a72 ldrh r2, [r6, #18]
80004c4: 819a strh r2, [r3, #12]
/home/webjorn/Arduino/Blink-403a-out-1/Blink-403a-out-1.ino:170
Serial.print("CYCCNT : ");
80004c6: 492c ldr r1, [pc, #176] ; (8000578 <_Z4loopv+0x35c>)
80004c8: 4620 mov r0, r4
80004ca: f000 fda0 bl 800100e <_ZN5Print5printEPKc>
/home/webjorn/Arduino/Blink-403a-out-1/Blink-403a-out-1.ino:171
Serial.println(*DWT_CYCCNT - snapshot);
80004ce: 6869 ldr r1, [r5, #4]
80004d0: 220a movs r2, #10
80004d2: 1bc9 subs r1, r1, r7
80004d4: 4620 mov r0, r4
80004d6: f000 fdfe bl 80010d6 <_ZN5Print7printlnEmi>
/home/webjorn/Arduino/Blink-403a-out-1/Blink-403a-out-1.ino:147
digitalWrite(PC13, HIGH); // turn the LED on (HIGH is the voltage level)
80004da: e7d1 b.n 8000480 <_Z4loopv+0x264>
--------- end of dump
...
I just tried your code, repeating this line:
(*(uint16_t *) 0x40010c0c ) = *datptr++;
It tells me that CYCCNT is 4. No, not 4 per move, 4 in total for the 10 moves! That sounds pretty impressive, until I looked at the generated code and found that all 10 moves had been optimised out. This is another hazard of rolling your own definition for GPIOB->ODR and not declaring it as volatile: without volatile, the compiler is free to treat repeated stores to the same address as redundant and drop them. Also, GPIOB->ODR is uint32_t, not uint16_t, which will influence the assembly instructions generated.
Unless you turn optimisation off completely, the compiler may re-order your statements, and might even decide some are not needed at all. All this can make it hard to follow the generated code. But then, if you do turn optimisation off, it can generate some very slow code.
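
To illustrate the point about volatile, here is a minimal sketch, not taken from your project. The helper name write_burst is made up for the example, and I am assuming the data array holds uint16_t values as the 0xffff fill suggests. The port is passed in as a pointer to volatile uint32_t, so the compiler has to perform every store, in the order written, even with optimisation turned on.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the original sketch): write n values to a
 * port register one after another. Because the pointer target is volatile,
 * each store counts as an observable side effect, so the compiler may not
 * remove, merge, or reorder these stores relative to each other. */
void write_burst(volatile uint32_t *port, const uint16_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        *port = src[i];   /* 16-bit value widened to the 32-bit register */
    }
}

/* On the target, the call might look like this, using the address from the
 * quoted code (assumed here to be GPIOB->ODR):
 *
 *   volatile uint32_t *const odr = (volatile uint32_t *)0x40010c0c;
 *   write_burst(odr, datarray, 10);
 */
```

With this, the dump should show one store per element instead of a single strh or none at all. The loop does add some overhead per write, so for timing tests you may still prefer the ten unrolled lines, just with the cast changed to a pointer to volatile.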