Re: dhrystone and whetstone benchmarks
Posted: Sun Jan 26, 2020 6:23 pm
by Pito
A rather long thread on how we messed with the STM32F407 FPU in 2017:
https://web.archive.org/web/20170715084 ... c23d399868
There are some gcc settings recommended, like
Code: Select all
-mfloat-abi=hard -mfpu=fpv4-sp-d16 -fsingle-precision-constant
CMSIS FFT with pictures (@240MHz):
https://web.archive.org/web/20190316193 ... =80#p26827
Btw, the last post there is from a user, ag123.

Re: dhrystone and whetstone benchmarks
Posted: Sun Jan 26, 2020 6:25 pm
by Bingo600
So the main core difference between an M4 and an M7 is the pipeline: M4 = 3-stage pipeline, M7 = 6-stage pipeline.
Both the M4 and the M7 can have an optional SP FPU, and the M7 can optionally have a DP FPU as well.
/Bingo
Re: dhrystone and whetstone benchmarks
Posted: Sun Jan 26, 2020 6:32 pm
by fpiSTM
Maybe the cmsis library helps on this...
Re: dhrystone and whetstone benchmarks
Posted: Sun Jan 26, 2020 6:33 pm
by Pito
Another argument for -Os is when you start experiments with those hw switches. With higher optimisation you may lose track.
Re: dhrystone and whetstone benchmarks
Posted: Sun Jan 26, 2020 6:33 pm
by Bingo600
But according to the ARM or arm-gcc doc it is NOT wrong to specify
Code: Select all
-march=armv7e-m+fp. -mfloat-abi=hard -mfpu=fpv4-sp-d16 -fsingle-precision-constant
We might have to add
to the above, in order to confuse arm-gcc (into thinking it's a cortex-m7)
But unless someone deliberately codes against the pipeline,
it seems like an M4 and an M7 should be instruction compatible.
But for now I do agree we should leave the ST core untouched and use the default
Code: Select all
-mfloat-abi=hard -mfpu=fpv4-sp-d16 -fsingle-precision-constant
/Bingo
Re: dhrystone and whetstone benchmarks
Posted: Sun Jan 26, 2020 6:35 pm
by Bingo600
fpiSTM wrote: Sun Jan 26, 2020 6:32 pm
Maybe the cmsis library helps on this...
You mean this one ??
GenF4.build.cmsis_lib_gcc=arm_cortexM4lf_math
Seems like the F7 uses this one
cmsis_lib_gcc=arm_cortexM7lfsp_math
Is the source available for those math libs ??
/Bingo
Re: dhrystone and whetstone benchmarks
Posted: Sun Jan 26, 2020 6:44 pm
by Pito
Look at my link above: I did the CMSIS FFT on the 407. You can download the library sources for DSP and other FPU math here:
https://www.st.com/content/st_com/en/pr ... t-software
https://developer.arm.com/tools-and-sof ... dded/cmsis
Re: dhrystone and whetstone benchmarks
Posted: Sun Jan 26, 2020 7:05 pm
by ag123
i think i saw this table somewhere prior
https://wiki.gentoo.org/wiki/User:Daemon/Sandbox/ARM
Code: Select all
* Architecture options usage *
This toolchain is built and optimized for Cortex-A/R/M bare metal development.
the following table shows how to invoke GCC/G++ with correct command line
options for variants of Cortex-A/R and Cortex-M architectures.
--------------------------------------------------------------------------
| Arm core   | Command Line Options                       | multilib     |
|------------|--------------------------------------------|--------------|
| Cortex-M0+ | -mthumb -mcpu=cortex-m0plus                | thumb        |
| Cortex-M0  | -mthumb -mcpu=cortex-m0                    | /v6-m        |
| Cortex-M1  | -mthumb -mcpu=cortex-m1                    | /nofp        |
|            |--------------------------------------------|              |
|            | -mthumb -march=armv6-m                     |              |
|------------|--------------------------------------------|--------------|
| Cortex-M3  | -mthumb -mcpu=cortex-m3                    | thumb        |
|            |--------------------------------------------| /v7-m        |
|            | -mthumb -march=armv7-m                     | /nofp        |
|------------|--------------------------------------------|--------------|
| Cortex-M4  | -mthumb -mcpu=cortex-m4                    | thumb        |
| (No FP)    |--------------------------------------------| /v7e-m       |
|            | -mthumb -march=armv7e-m                    | /nofp        |
|------------|--------------------------------------------|--------------|
| Cortex-M4  | -mthumb -mcpu=cortex-m4 -mfloat-abi=softfp | thumb        |
| (Soft FP)  | -mfpu=fpv4-sp-d16                          | /v7e-m+fp    |
|            |--------------------------------------------| /softfp      |
|            | -mthumb -march=armv7e-m -mfloat-abi=softfp |              |
|            | -mfpu=fpv4-sp-d16                          |              |
|------------|--------------------------------------------|--------------|
| Cortex-M4  | -mthumb -mcpu=cortex-m4 -mfloat-abi=hard   | thumb        |
| (Hard FP)  | -mfpu=fpv4-sp-d16                          | /v7e-m+fp    |
|            |--------------------------------------------| /hard        |
|            | -mthumb -march=armv7e-m -mfloat-abi=hard   |              |
|            | -mfpu=fpv4-sp-d16                          |              |
|------------|--------------------------------------------|--------------|
| Cortex-M7  | -mthumb -mcpu=cortex-m7                    | thumb        |
| (No FP)    |--------------------------------------------| /v7e-m       |
|            | -mthumb -march=armv7e-m                    | /nofp        |
|------------|--------------------------------------------|--------------|
| Cortex-M7  | -mthumb -mcpu=cortex-m7 -mfloat-abi=softfp | thumb        |
| (Soft FP)  | -mfpu=fpv5-sp-d16                          | /v7e-m+fp    |
|            |--------------------------------------------| /softfp      |
|            | -mthumb -march=armv7e-m -mfloat-abi=softfp |              |
|            | -mfpu=fpv5-sp-d16                          |              |
|            |--------------------------------------------|--------------|
|            | -mthumb -mcpu=cortex-m7 -mfloat-abi=softfp | thumb        |
|            | -mfpu=fpv5-d16                             | /v7e-m+dp    |
|            |--------------------------------------------| /softfp      |
|            | -mthumb -march=armv7e-m -mfloat-abi=softfp |              |
|            | -mfpu=fpv5-d16                             |              |
|------------|--------------------------------------------|--------------|
| Cortex-M7  | -mthumb -mcpu=cortex-m7 -mfloat-abi=hard   | thumb        |
| (Hard FP)  | -mfpu=fpv5-sp-d16                          | /v7e-m+fp    |
|            |--------------------------------------------| /hard        |
|            | -mthumb -march=armv7e-m -mfloat-abi=hard   |              |
|            | -mfpu=fpv5-sp-d16                          |              |
|            |--------------------------------------------|--------------|
|            | -mthumb -mcpu=cortex-m7 -mfloat-abi=hard   | thumb        |
|            | -mfpu=fpv5-d16                             | /v7e-m+dp    |
|            |--------------------------------------------| /hard        |
|            | -mthumb -march=armv7e-m -mfloat-abi=hard   |              |
|            | -mfpu=fpv5-d16                             |              |
|------------|--------------------------------------------|--------------|
...
Well, found it: it is the readme.txt file in the share/doc directory of the gcc-arm-none-eabi installation.
Re: dhrystone and whetstone benchmarks
Posted: Sun Jan 26, 2020 7:08 pm
by Bingo600
I had to ....
Built F411 but as a cortex-m7
boards.txt
Code: Select all
GenF4.name=Generic STM32F4 series
GenF4.build.vid=0x0483
GenF4.build.core=arduino
GenF4.build.board=GenF4
GenF4.build.extra_flags=-D{build.product_line} {build.enable_usb} {build.xSerial} {build.bootloader_flags}
#GenF4.build.mcu=cortex-m4
GenF4.build.mcu=cortex-m7
GenF4.build.flags.fp=-mfpu=fpv4-sp-d16 -mfloat-abi=hard
#GenF4.build.flags.fp=-march=armv7e-m+fp+fpv5+fp.dp -mfloat-abi=hard -mfpu=fpv5-d16 -fsingle-precision-constant
GenF4.build.series=STM32F4xx
#GenF4.build.cmsis_lib_gcc=arm_cortexM4lf_math
GenF4.build.cmsis_lib_gcc=arm_cortexM7lfsp_math
-O2
Code: Select all
##########################################
Single Precision C Whetstone Benchmark
Calibrate
0.15 Seconds 1 Passes (x 100)
0.77 Seconds 5 Passes (x 100)
3.86 Seconds 25 Passes (x 100)
Use 64 passes (x 100)
Single Precision C/C++ Whetstone Benchmark
Loop content Result MFLOPS MOPS Seconds
N1 floating point -1.12475013732910156 48.000 0.026
N2 floating point -1.12274742126464844 33.340 0.258
N3 if then else 1.00000000000000000 94.628 0.070
N4 fixed point 12.00000000000000000 160.001 0.126
N5 sin,cos etc. 0.49909299612045288 0.924 5.764
N6 floating point 0.99999982118606567 31.994 1.079
N7 assignments 3.00000000000000000 41.210 0.287
N8 exp,sqrt etc. 0.75110614299774170 1.052 2.264
MWIPS 64.819 9.874
And specifying an M7 FPv5 FPU (-mfpu=fpv5-sp-d16):
Code: Select all
GenF4.name=Generic STM32F4 series
GenF4.build.vid=0x0483
GenF4.build.core=arduino
GenF4.build.board=GenF4
GenF4.build.extra_flags=-D{build.product_line} {build.enable_usb} {build.xSerial} {build.bootloader_flags}
#GenF4.build.mcu=cortex-m4
GenF4.build.mcu=cortex-m7
#GenF4.build.flags.fp=-mfpu=fpv4-sp-d16 -mfloat-abi=hard
GenF4.build.flags.fp=-mfpu=fpv5-sp-d16 -mfloat-abi=hard
#GenF4.build.flags.fp=-march=armv7e-m+fp+fpv5+fp.dp -mfloat-abi=hard -mfpu=fpv5-d16 -fsingle-precision-constant
GenF4.build.series=STM32F4xx
#GenF4.build.cmsis_lib_gcc=arm_cortexM4lf_math
GenF4.build.cmsis_lib_gcc=arm_cortexM7lfsp_math
-O2
Code: Select all
##########################################
Single Precision C Whetstone Benchmark
Calibrate
0.15 Seconds 1 Passes (x 100)
0.77 Seconds 5 Passes (x 100)
3.86 Seconds 25 Passes (x 100)
Use 64 passes (x 100)
Single Precision C/C++ Whetstone Benchmark
Loop content Result MFLOPS MOPS Seconds
N1 floating point -1.12475013732910156 48.000 0.026
N2 floating point -1.12274742126464844 33.340 0.258
N3 if then else 1.00000000000000000 96.000 0.069
N4 fixed point 12.00000000000000000 160.001 0.126
N5 sin,cos etc. 0.49909299612045288 0.924 5.764
N6 floating point 0.99999982118606567 31.994 1.079
N7 assignments 3.00000000000000000 41.210 0.287
N8 exp,sqrt etc. 0.75110614299774170 1.051 2.265
MWIPS 64.819 9.874
Seems like the result is the same with the M4 FPU setting (fpv4-sp-d16) as with the M7 one (fpv5-sp-d16); only N3 and N8 differ, and only marginally. The M4 runs both builds.
Re: dhrystone and whetstone benchmarks
Posted: Sun Jan 26, 2020 7:14 pm
by fpiSTM
Bingo600 wrote: Sun Jan 26, 2020 6:35 pm
You mean this one ??
GenF4.build.cmsis_lib_gcc=arm_cortexM4lf_math
Seems like the F7 uses this one
cmsis_lib_gcc=arm_cortexM7lfsp_math
Is the source available for those math libs ??
/Bingo
Yes, this is the one supporting float and for CMSIS version 5.5.1.
This is provided by ARM:
https://github.com/ARM-software/CMSIS_5 ... SP/Lib/GCC