Bluepill F4 board, anyone still working on it?
Re: Bluepill F4 board, anyone still working on it?
those 2 vector lanes are something i inferred in my previous post a few comments above:
viewtopic.php?p=725#p725
it seems about right, given that our Mflops figure just exceeds the cpu MHz
Re: Bluepill F4 board, anyone still working on it?
But you can't apply ARM11 info to a CM4
/Bingo
Re: Bluepill F4 board, anyone still working on it?
Just did a 96 MHz run (output below).
I changed rccF4.c (attached) so that it will "autodetect" 8 or 25 MHz for the 84/96 MHz clock settings.
All you need to set now is CYCLES_PER_MICROSECOND in blackpill_f401.h:
//#define CYCLES_PER_MICROSECOND 84
#define CYCLES_PER_MICROSECOND 96
Beginning Whetstone benchmark at 96 MHz ...
0 0 0 1.00 -1.00 -1.00 -1.00
120000 140000 120000 -0.00 0.00 -0.00 0.00
140000 120000 120000 -0.00 0.00 0.00 0.00
3450000 1 1 1.00 -1.00 -1.00 -1.00
2100000 1 2 6.00 6.00 0.00 0.00
320000 1 2 0.00 0.00 0.00 0.00
8990000 1 2 1.00 1.00 1.00 1.00
6160000 1 2 3.00 2.00 3.00 0.00
0 2 3 1.00 -1.00 -1.00 -1.00
930000 2 3 1.00 1.00 1.00 1.00
Loops:10000, Iterations:1, Duration:6984.97 millisec
C Converted Single Precision Whetstones:143.16 Mflops
/Bingo
Attachment: rccF4.c.zip (3.57 KiB)
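For readers without the attachment, here is a rough sketch of the PLL arithmetic behind the 84/96 MHz settings on either an 8 or a 25 MHz input. It is not the code from rccF4.c. The `pll_cfg` struct and `pll_for` helper are hypothetical names, the detection step itself is not shown, and the divider choice (1 MHz PLL input, divide-by-4 output) is only one assumed option that keeps the USB clock at 48 MHz.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the four main PLL factors (assumed names). */
typedef struct {
    uint32_t pllm;  /* input divider: brings the PLL input down to 1 MHz */
    uint32_t plln;  /* VCO multiplier */
    uint32_t pllp;  /* VCO divider that produces SYSCLK */
    uint32_t pllq;  /* VCO divider that produces the 48 MHz USB clock */
} pll_cfg;

/* Assumed helper: pick PLL factors for an 8 or 25 MHz input and an
   84 or 96 MHz system clock. */
static pll_cfg pll_for(uint32_t hse_mhz, uint32_t sysclk_mhz)
{
    assert(hse_mhz == 8 || hse_mhz == 25);
    assert(sysclk_mhz == 84 || sysclk_mhz == 96);

    pll_cfg c;
    c.pllm = hse_mhz;              /* 8/8 or 25/25 gives a 1 MHz PLL input */
    c.pllp = 4;                    /* SYSCLK = VCO / 4 */
    c.plln = sysclk_mhz * c.pllp;  /* VCO runs at 336 or 384 MHz */
    c.pllq = c.plln / 48;          /* 7 or 8, so VCO / PLLQ = 48 MHz */
    return c;
}
```

With these assumptions only PLLM depends on the input frequency, which is why CYCLES_PER_MICROSECOND is the one value left to set by hand.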
Re: Bluepill F4 board, anyone still working on it?
yay, it is now closer to 150 Mflops and certainly beats that old P4
but well, i think you are right about it being a cm4, not an arm11
i read a little more and it seems there are 2 co-processors: fpu 1 is for single precision, fpu 2 is for double precision
the reason why we get 150 Mflops at just below 100 MHz is probably one of ARM's and ST's trade secrets
but i'm just happy if, after all, we get 150 Mflops on our little f4 pill boards
btw if this is true, we could just bundle 8 f4 pill boards and that is 1.2 Gflops - single precision though
if we overclock, we'd just get more flops for free
i wonder when the bitcoin miners will assemble a whole warehouse of f4 pills to try to beat everyone else's hashrates
maybe we should invent a new cryptocurrency that requires an adc, then all of a sudden f4 pill mining becomes the rage

Re: Bluepill F4 board, anyone still working on it?
@ag123
From the old forum:
https://mcu.selfip.com/viewtopic.php?f= ... t=20#p1678
When soldering the PSRAM, did you populate the 100nF cap?
What about the "suggested" 10K pullup on /CS?
I have 10 x LY68L6400SLIT lying around.

/Bingo
Re: Bluepill F4 board, anyone still working on it?
Guys, I wouldn't want to spoil your happiness with the 401/411 boards, but:
1. there is only 1 FPU on the chip, and it is single precision
2. the Mflops we get are pretty far off from real life, imho.
The Whetstone benchmark has to be reviewed and brought in sync with results you may find on some official sites.
It seems the result has to be scaled down somehow.
You cannot get 150 Mflops with a Cortex-M4F clocked at 96 MHz.
The FPU does a single-precision math calculation in 1-14 clock cycles. Loading or storing a floating-point number takes at least 2 clocks.
There is
1. a benchmark code overhead,
2. a C overhead,
3. and you cannot do 2 asm instructions per clock on the 401/411.
I would suggest having a closer look at the Whetstone benchmark and doing some research on the results with similar architectures.
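The cycle counts above give a rough ceiling that is easy to put into numbers. A minimal sketch follows. `mflops_bound` is a hypothetical helper, and the assumption that every floating-point operation pays a fixed total cycle cost is a simplification, not a model of the real pipeline.

```c
#include <assert.h>

/* Upper bound on MFLOPS when each floating-point operation costs
   `cycles_per_flop` core clocks in total (arithmetic plus any
   load/store attributed to it). Assumes one instruction per clock at best. */
static double mflops_bound(double clock_mhz, double cycles_per_flop)
{
    assert(cycles_per_flop > 0.0);
    return clock_mhz / cycles_per_flop;
}
```

Under these assumptions `mflops_bound(96.0, 1.0)` is 96, the best case with a 1-cycle operation and no memory traffic at all, and `mflops_bound(96.0, 3.0)` is 32, a 1-cycle operation plus one 2-clock load or store. Either way the result stays well below 150.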

Pukao Hats Cleaning Services Ltd.
Re: Bluepill F4 board, anyone still working on it?
For example, here we ran Whetstone on an 80 MHz pic32, double precision, no FPU, running from RAM:
http://retrobsd.org/viewtopic.php?f=6&t=4065
1.2 Mflops DP.
With single precision and an FPU it would not be more than 25 Mflops. With an M4F @ 96 MHz -> 30 Mflops max.
Here
http://www.roylongbottom.org.uk/Raspber ... hmarks.htm
a Raspberry Pi @ 700 MHz gets ~100 Mflops single precision.
I think our Whetstone version is somehow different from those above, i.e. it has a different result table, and we do not do the calibration at the beginning. Mflops are valid only for 3 of the measurements; the MWIPS is then calculated off those Mflops.
Also, it seems to me we are confusing MWIPS with Mflops.
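One way to check the MWIPS suspicion is to recompute the posted 96 MHz figure from the loop count and duration printed with it. The sketch below assumes our sketch uses the rating formula from the classic C Whetstone, where each loop-count unit stands for 100 thousand Whetstone instructions. `whetstone_mwips` is a hypothetical name and none of this is taken from the sketch source.

```c
#include <assert.h>

/* Classic Whetstone rating (assumed):
   KIPS = 100 * loops * iterations / seconds, and MWIPS = KIPS / 1000. */
static double whetstone_mwips(long loops, int iterations, double seconds)
{
    assert(seconds > 0.0);
    double kips = 100.0 * (double)loops * (double)iterations / seconds;
    return kips / 1000.0;
}
```

With the posted values (Loops:10000, Iterations:1, Duration:6984.97 millisec), `whetstone_mwips(10000, 1, 6.98497)` comes out at about 143.16, the same number that was printed as "Mflops". That fits the figure being a Whetstone instruction rating rather than a floating-point operation count.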
Re: Bluepill F4 board, anyone still working on it?
And here is the Whetstone.c used above - it has to be ported to stm32duino.
Attachment: whets.zip (5.73 KiB)
Re: Bluepill F4 board, anyone still working on it?
Bingo600 wrote: Sun Jan 19, 2020 9:04 pm
Edit: I'm inclined to agree with pito ... Our numbers are/could be - fishy ... If we beat a 2GHz P4
/Bingo

I never said I believed the results.
/Bingo

This is my laptop:
./whetstoneIL
##########################################
Single Precision C Whetstone Benchmark vfpv4 32 Bit, Fri Jan 24 23:59:17 2020
Calibrate
0.01 Seconds 1 Passes (x 100)
0.02 Seconds 5 Passes (x 100)
0.06 Seconds 25 Passes (x 100)
0.30 Seconds 125 Passes (x 100)
1.46 Seconds 625 Passes (x 100)
7.39 Seconds 3125 Passes (x 100)
Use 4227 passes (x 100)
From File /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz
stepping : 9
microcode : 0x21
cpu MHz : 1197.221
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips : 5188.06
clflush size : 64
cache_alignme
From File /proc/version
Linux version 5.3.0-26-generic (buildd@lgw01-amd64-039) (gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)) #28~18.04.1-Ubuntu SMP Wed Dec 18 16:40:14 UTC 2019
Single Precision C/C++ Whetstone Benchmark
Loop content Result MFLOPS MOPS Seconds
N1 floating point -1.12475013732910156 1091.859 0.074
N2 floating point -1.12274742126464844 1089.598 0.521
N3 if then else 1.00000000000000000 9512.771 0.046
N4 fixed point 12.00000000000000000 4826.982 0.276
N5 sin,cos etc. 0.49911010265350342 106.989 3.287
N6 floating point 0.99999982118606567 804.075 2.836
N7 assignments 3.00000000000000000 3216.551 0.243
N8 exp,sqrt etc. 0.75110864639282227 58.687 2.679
MWIPS 4242.902 9.963
A new results file, whets.txt, will have been created in the same
directory as the .EXE files, if one did not already exist.
Type additional information to include in whets.txt - Press Enter
T430