Re: dhrystone and whetstone benchmarks
Posted: Sun Jan 26, 2020 7:28 pm
by Bingo600
Damm ...
".a" files ==> Compiled libraries
No source ..
Well i'd prob not understand the "magic" anyway
/Bingo
Re: dhrystone and whetstone benchmarks
Posted: Sun Jan 26, 2020 7:49 pm
by Bingo600
Mispost
Re: dhrystone and whetstone benchmarks
Posted: Sun Jan 26, 2020 7:50 pm
by Bingo600
What does this flag do ??
ST Core does not specify :
GenF4.build.flags.fp=-mfpu=fpv4-sp-d16 -mfloat-abi=hard
F411 Built as cortex-m7
No -fsingle-precision-constant
-O2
Fast N1 , slow N5 , slow MWIPS
##########################################
Single Precision C Whetstone Benchmark
Calibrate
   0.15 Seconds     1 Passes (x 100)
   0.77 Seconds     5 Passes (x 100)
   3.86 Seconds    25 Passes (x 100)
Use 64 passes (x 100)
Single Precision C/C++ Whetstone Benchmark
Loop content                      Result    MFLOPS      MOPS   Seconds
N1 floating point   -1.12475013732910156    48.000               0.026
N2 floating point   -1.12274742126464844    33.340               0.258
N3 if then else      1.00000000000000000              94.628     0.070
N4 fixed point      12.00000000000000000             160.001     0.126
N5 sin,cos etc.      0.49909299612045288               0.924     5.764
N6 floating point    0.99999982118606567    31.994               1.079
N7 assignments       3.00000000000000000              41.210     0.287
N8 exp,sqrt etc.     0.75110614299774170               1.052     2.264
MWIPS                                       64.819               9.874
-fsingle-precision-constant
-O2
Slow(er) N1 , fast N5 , fast MWIPS
##########################################
Single Precision C Whetstone Benchmark
Calibrate
   0.10 Seconds     1 Passes (x 100)
   0.50 Seconds     5 Passes (x 100)
   2.51 Seconds    25 Passes (x 100)
Use 99 passes (x 100)
Single Precision C/C++ Whetstone Benchmark
Loop content                      Result    MFLOPS      MOPS   Seconds
N1 floating point   -1.12475013732910156    45.150               0.042
N2 floating point   -1.12274742126464844    33.347               0.399
N3 if then else      1.00000000000000000              95.761     0.107
N4 fixed point      12.00000000000000000             143.710     0.217
N5 sin,cos etc.      0.49909299612045288               2.268     3.631
N6 floating point    0.99999982118606567    33.862               1.577
N7 assignments       3.00000000000000000              41.113     0.445
N8 exp,sqrt etc.     0.75110614299774170               1.043     3.532
MWIPS                                       99.496               9.950
F411 built normally, as an M4
-O2
-fsingle-precision-constant
##########################################
Single Precision C Whetstone Benchmark
Calibrate
   0.10 Seconds     1 Passes (x 100)
   0.50 Seconds     5 Passes (x 100)
   2.48 Seconds    25 Passes (x 100)
Use 100 passes (x 100)
Single Precision C/C++ Whetstone Benchmark
Loop content                      Result    MFLOPS      MOPS   Seconds
N1 floating point   -1.12475013732910156    46.489               0.041
N2 floating point   -1.12274742126464844    33.350               0.403
N3 if then else      1.00000000000000000              95.833     0.108
N4 fixed point      12.00000000000000000             159.898     0.197
N5 sin,cos etc.      0.49909299612045288               2.261     3.680
N6 floating point    0.99999982118606567    35.984               1.499
N7 assignments       3.00000000000000000              41.158     0.449
N8 exp,sqrt etc.     0.75110614299774170               1.051     3.538
MWIPS                                      100.854               9.915
ST Core - Default
-O2
No -fsingle-precision-constant
##########################################
Single Precision C Whetstone Benchmark
Calibrate
   0.15 Seconds     1 Passes (x 100)
   0.77 Seconds     5 Passes (x 100)
   3.85 Seconds    25 Passes (x 100)
Use 64 passes (x 100)
Single Precision C/C++ Whetstone Benchmark
Loop content                      Result    MFLOPS      MOPS   Seconds
N1 floating point   -1.12475013732910156    46.545               0.026
N2 floating point   -1.12274742126464844    33.340               0.258
N3 if then else      1.00000000000000000              96.000     0.069
N4 fixed point      12.00000000000000000             180.001     0.112
N5 sin,cos etc.      0.49909299612045288               0.915     5.817
N6 floating point    0.99999982118606567    33.878               1.019
N7 assignments       3.00000000000000000              41.210     0.287
N8 exp,sqrt etc.     0.75110614299774170               1.051     2.265
MWIPS                                       64.952               9.853
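The MWIPS lines in these runs are consistent with a rating of passes x 100 divided by 10 x total seconds. Since calibration aims for a run of roughly 10 seconds, a slower build ends up with 64 passes and a faster one with 99 or 100. A minimal sketch of that arithmetic follows; the function name mwips_rating is made up for illustration and is not taken from the benchmark source.

```c
#include <assert.h>

/* Hypothetical helper: the rating implied by the printed figures.
   passes  - value from the "Use N passes (x 100)" line
   seconds - total time from the MWIPS line */
double mwips_rating(int passes, double seconds)
{
    assert(seconds > 0.0);
    return (passes * 100.0) / (10.0 * seconds);
}
```

For example, mwips_rating(64, 9.874) gives about 64.82, close to the 64.819 printed for the first run; the small gap comes from the seconds being rounded in the printout.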
The arm-gcc docs say:
https://gcc.gnu.org/onlinedocs/gcc-4.7. ... tions.html
-fsingle-precision-constant
Treat floating-point constants as single precision instead of implicitly converting them to double-precision constants.
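To see why this flag can matter on a part with a single-precision FPU: in C an unsuffixed constant such as 0.5 has type double, so mixing it with a float promotes the whole expression to double, which a single-precision-only FPU has to handle in software. A small sketch (function names are illustrative only), built without the flag, where sizeof reports the type the compiler picks for each expression:

```c
#include <stddef.h>

/* Unsuffixed constant: x is promoted and the multiply has type double. */
size_t width_with_double_constant(float x)
{
    return sizeof(x * 0.5);
}

/* 'f' suffix: the expression stays in single precision. */
size_t width_with_float_constant(float x)
{
    return sizeof(x * 0.5f);
}
```

Without the flag the first function returns sizeof(double) and the second sizeof(float). Adding the f suffix in the source should have the same effect for that one constant as the global flag has for all of them.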
Re: dhrystone and whetstone benchmarks
Posted: Sun Jan 26, 2020 8:37 pm
by Bingo600
Well this day went totally cortex + FPU
But at least I got BLACKPILL_F411CE implemented in the ST Core @96MHz + ART enabled

[Attachment: BPF411CE.png]
I ran my first program on my "Ali Black-F407"
I got my DISCO-F407 dusted off
And i learned a lot about the boards.txt & variants directories
I'm off ... ZZZzzzzzz
Goodbye & Thanx for "all the fish" (help/comments)
/Bingo
Re: dhrystone and whetstone benchmarks
Posted: Sun Jan 26, 2020 9:54 pm
by Pito
Bingo600 wrote: Sun Jan 26, 2020 7:28 pm
Damm ...
".a" files ==> Compiled libraries
No source ..
Well i'd prob not understand the "magic" anyway
/Bingo
https://github.com/ARM-software/CMSIS_5 ... DSP/Source
Re: dhrystone and whetstone benchmarks
Posted: Sat Jul 04, 2020 8:00 pm
by dannyf
My attempt at benchmarking a variety of chips, outside of the typical dhrystone / whetstone routines:
https://dannyelectronics.wordpress.com/ ... -exercise/
Interesting to see how the chips fall. The numbers are measured in cycle counts.
Re: dhrystone and whetstone benchmarks
Posted: Mon Nov 08, 2021 5:37 pm
by ag123
New benchmark: STM32H743VIT6 - 480 MHz
viewtopic.php?p=8886#p8886
Beginning Whetstone benchmark at 480 MHz ...
Loops:10000, Iterations:1, Duration:1203.82 millisec
C Converted Single Precision Whetstones:830.69 Mflops
Beginning Whetstone benchmark at 480 MHz ...
Loops:10000, Iterations:1, Duration:1203.48 millisec
C Converted Single Precision Whetstones:830.93 Mflops
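The figure printed here matches the classic Whetstone rating of 100 x loops x iterations per second (in thousands), divided by 1000, so it may be closer to MWIPS than to true Mflops. A sketch of that arithmetic, assuming this port keeps the classic formula; the function name is hypothetical:

```c
#include <assert.h>

/* Hypothetical helper: classic Whetstone rating in millions per second.
   The classic C version computes KIPS = 100 * loops * iterations / seconds
   and prints KIPS / 1000. */
double whetstone_rating(long loops, long iterations, double millisec)
{
    assert(millisec > 0.0);
    double seconds = millisec / 1000.0;
    double kips = (100.0 * loops * iterations) / seconds;
    return kips / 1000.0;
}
```

With loops = 10000, iterations = 1 and 1203.82 ms this gives about 830.69, the same number as the first run.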
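-O2 optimised, cache turned on
That 'doubled' FPU speed is likely real. The FPU in all the series from F4 to F7 and H7 comes from ARM's VFP line, i.e. "vector floating point".
Despite the name, the Cortex-M FPUs (FPv4-SP on the M4, FPv5 on the M7) are scalar, so a single instruction processes one value rather than 2 lanes of data; the gain more likely comes from the 480 MHz clock, the caches and the M7's dual-issue pipeline.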
Re: dhrystone and whetstone benchmarks
Posted: Mon Nov 08, 2021 6:13 pm
by spiceagent11
The optimization using VFP libraries may also have used things like FMA (fused multiply-add) instructions for the whetstone benchmarks. FMA does both the multiply and the add in a single instruction, which would make matrix-vector calcs run as if they do 2 flops per clock.
So after all we do have pretty fast single-precision floating point.
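As a rough illustration of the multiply-add pattern: each iteration of a dot product is one multiply feeding one add, which a compiler targeting an FPU with fused multiply-add may emit as a single instruction, so it gets counted as 2 flops. A minimal sketch in plain C; the function name is made up, and whether an FMA instruction is actually generated depends on the compiler and flags.

```c
#include <stddef.h>

/* Dot product of two float arrays of length n. */
float dot_product(const float *a, const float *b, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        acc += a[i] * b[i];   /* multiply + add: candidate for one FMA */
    }
    return acc;
}
```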
Re: dhrystone and whetstone benchmarks
Posted: Mon Nov 08, 2021 6:35 pm
by ag123
Yup, I recently ventured into reading some stuff about VFP. This is a different 'generation' of FPU, and things like fused multiply-add are common, hence the whetstone benchmark could have been boosted by that alone. That makes for the 'apparent' high speeds, and the speeds are likely real. The 2-lane vectorized execution, and the IEEE 754 non-compliance that goes with it (flush to zero, saturating arithmetic), belong to ARM's SIMD units though; the Cortex-M FPU is scalar and treats flush-to-zero as an optional mode.
Re: dhrystone and whetstone benchmarks
Posted: Tue Dec 07, 2021 8:55 pm
by richieadam
When looking at the historical whetstone result data, the big CPUs get higher MWIPS but much lower MFLOPS at the same clock.
Also, I do not understand why you got 100 MWIPS @96MHz (F411) while @168MHz (F407) we get 98.
There must be a subtle bug in the code somewhere.
The historical results show a nice linear dependency of MWIPS on clock.
I assume we should see something like 150 MWIPS @168MHz (F407).
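For reference, a strictly linear estimate is simple proportion. A sketch, assuming both parts deliver the same MWIPS per MHz (an assumption, since flash wait states and caches differ between boards); the function name is hypothetical:

```c
/* Scale a measured rating from one clock to another, assuming
   performance per MHz stays the same. */
double scale_by_clock(double measured, double measured_mhz, double target_mhz)
{
    return measured * target_mhz / measured_mhz;
}
```

scale_by_clock(100.0, 96.0, 168.0) gives 175, so under that assumption the 98 measured on the F407 is low even against the more cautious 150.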
I would suggest small changes in the code, wait..
In the original code it was double
double theseSecs = 0.0;
double startSecs = 0.0;
double secs = 0.0;
Also you may use micros() to get better resolution
// Store the current time, in seconds, in the global theseSecs.
// micros() gives microsecond resolution.
void getSecs()
{
    theseSecs = micros() / 1000000.0;
}
I doubt it helps, however, you may try..
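One way to keep the resolution that micros() offers is to subtract the raw counts as unsigned integers first and only then convert to seconds, which also survives the counter wrapping around once. A sketch, assuming a 32-bit microsecond counter like the Arduino micros(); the helper name is hypothetical, and the counter values are passed in so the code also compiles off-target.

```c
#include <stdint.h>

/* Hypothetical helper: elapsed seconds between two raw microsecond counts.
   Unsigned subtraction gives the right difference even if the 32-bit
   counter wrapped once between start and end. */
double elapsed_secs(uint32_t start_us, uint32_t end_us)
{
    uint32_t delta_us = end_us - start_us;
    return delta_us / 1000000.0;
}
```

On target this would be called as elapsed_secs(start, micros()), with start captured just before the timed section.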
PS: is the ART enabled in the STM core ????