This was an old favorite topic on the old forum.
Uploaded again. Season's greetings!
dhrystone and whetstone benchmarks
Attachments
- whetstone.zip: whetstone benchmark (4.55 KiB)
- dhry21a_usbserial.zip: dhrystone benchmark (9.44 KiB)
Last edited by ag123 on Fri Dec 20, 2019 4:04 pm, edited 1 time in total.
Re: dhrystone and whetstone benchmarks
+1 Thanks !
Re: dhrystone and whetstone benchmarks
Here is the "original" single-precision Whetstone benchmark, modified for STM32DUINO.
It builds here with the STM core for the F407.
It is my understanding that the FPU is enabled in the STM core by default.
Not tested on real hardware yet.
Try it on the 401/411 and do report bugs, please.
It should be built with PRINTF enabled.
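If you want to double-check the FPU assumption at build time, here is a minimal sketch. It relies on the ACLE feature macro `__ARM_FP` (bit 2, value 0x04, means hardware single precision), which GCC and Clang define for ARM targets. The function name is made up for illustration, and this only tells you what the compiler was asked to generate, not whether the core has actually switched the FPU on at runtime.

```c
/* Hypothetical helper: returns 1 if this file was compiled for hardware
 * single-precision floating point, 0 otherwise.
 * __ARM_FP is the ACLE feature macro; bit 2 (0x04) = single precision.
 * On non-ARM hosts or soft-float builds the macro is absent, so we return 0. */
static int built_for_hw_single_fp(void)
{
#if defined(__ARM_FP) && (__ARM_FP & 0x04)
    return 1;
#else
    return 0;
#endif
}
```

Printing the result once from `setup()` next to the benchmark output would show whether a given run was built for hardware or software floating point.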
Attachments
- Whetstone_SP_STM32.zip (4.55 KiB)
Pukao Hats Cleaning Services Ltd.
Re: dhrystone and whetstone benchmarks
@Pito
I saw several posts about Whetstone.
Did you try the one in the STM32duino Examples library?
Re: dhrystone and whetstone benchmarks
Well, I think those single-precision results of around 60 to 150 MFLOPS for the F401 at 84 MHz and the F411 at 96 MHz are real after all.
For one thing, the FPU seems rather close to the ARM11 VFP FPU, minus the 'vector' floating-point part:
http://infocenter.arm.com/help/topic/co ... DEJJH.html
Nevertheless, FP instructions probably execute at 1 FLOP per cycle, and there are possibly several ALUs (e.g. separate ones for multiply, divide and add), plus some kind of 'speculative' (out-of-order) execution. That is the only way I can explain performance above 1 FLOP per clock cycle on the STM32F4x CPUs.
The optimization using VFP libraries may also have used things like FMA (fused multiply-add) instructions for the Whetstone benchmarks. An FMA does both a multiply and an add in a single instruction, which would make matrix-vector calculations look like they run at 2 FLOPs per clock.
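To make the "2 FLOPs per instruction" counting concrete, here is a rough sketch of a multiply-accumulate dot product. The function name and the FLOP counter are made up for illustration. Whether a compiler really fuses each multiply and add into one FMA instruction depends on the target and the flags, so treat that part as an assumption rather than something confirmed for the attached benchmarks.

```c
#include <stddef.h>

/* Hypothetical example: dot product as a multiply-accumulate loop.
 * Each iteration is one multiply plus one add, counted as 2 FLOPs.
 * On an FPU with fused multiply-add, the compiler may (or may not)
 * emit a single instruction for that pair. */
static float dot_mac(const float *x, const float *y, size_t n,
                     unsigned long *flops)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        acc += x[i] * y[i];   /* multiply + add */
        *flops += 2;          /* benchmark-style FLOP accounting */
    }
    return acc;
}
```

If one such step retires per clock, the FLOP counter grows by 2 per cycle, which is how a benchmark can report more than 1 FLOP per clock.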
So after all, we do have pretty fast single-precision floating point on our little F4 chips.
These are probably more advanced than the P4 technology of its time. After all, desktop Intel chips these days run more than a single FLOP per clock, and Intel does that at a much wider precision of 64 bits (in fact 80 bits):
https://en.wikipedia.org/wiki/Extended_precision
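For reference, the "FLOPs per clock" figure discussed above is just the MFLOPS score divided by the core clock in MHz. A quick sketch, with function names made up for illustration and the 60/150 MFLOPS and 84/96 MHz values taken from the range quoted in this post:

```c
#include <stdio.h>

/* FLOPs per clock cycle implied by a benchmark score.
 * mflops: reported score, millions of floating-point ops per second
 * mhz:    core clock in MHz */
static double flops_per_cycle(double mflops, double mhz)
{
    return mflops / mhz;
}

/* Print both ends of the quoted MFLOPS range for one clock speed. */
static void report_range(const char *chip, double mhz)
{
    printf("%s @ %.0f MHz: %.2f to %.2f FLOPs/cycle\n", chip, mhz,
           flops_per_cycle(60.0, mhz), flops_per_cycle(150.0, mhz));
}
```

Calling `report_range("F401", 84.0)` and `report_range("F411", 96.0)` shows that the top of the range works out to more than 1 FLOP per cycle on both clocks, which is the part that needs explaining.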
Last edited by ag123 on Sat Jan 25, 2020 3:17 pm, edited 2 times in total.
Re: dhrystone and whetstone benchmarks
All the Whetstone benchmarks that other people and I messed with used the one from your Examples.
That one gave suspicious results.
I would suggest removing it once it is confirmed that my results are more "real".
Try the one I uploaded above. It is version 0.1.
Re: dhrystone and whetstone benchmarks
@ag123: please be so kind as to run the one above on your F401.
Then we will at least have some results we can compare to the real world.
You should get the following table (with F401 numbers):