PlatformIO and Arduino IDE compilation results are different


Re: PlatformIO and Arduino IDE compilation results are different

Post by stevestrong »

Obviously, you are measuring the time it takes to send data to Serial, not the time it takes to set bits.
In this case you have to consider the properties of USB serial, which is serviced by periodic interrupts that occur at most once per millisecond.
That is what your results reflect, if the USB serial does not buffer the data.

Post by GonzoG »

leonardo wrote (Thu Mar 11, 2021 8:47 am): "I want to test the execution time of the function "digitalReadFast(digitalPinToPinName(pin))""
Then you need to execute it hundreds, if not thousands, of times and divide the measured time by the number of executions.
Also avoid wrapping the call in a loop (for, while, etc.), as the loop overhead itself takes a significant amount of time.

On F103 (blue pill) I've timed it at about 0.055us (18M times / s).
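
A minimal sketch of the averaging approach described above, written so that it also builds with a desktop compiler. The names `fake_idr`, `sink`, `REPEAT1000`, `per_call_ns` and `average_ns_per_read` are made up for illustration and are not part of any STM32 core. `fake_idr` is a stand-in for a GPIO input register, and `std::chrono` is a stand-in for micros() or a cycle counter. The macros repeat the statement in straight-line code, so no loop counter or branch ends up in the measured time.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Repeat a statement without a loop (hypothetical helper macros).
#define REPEAT10(x) x x x x x x x x x x
#define REPEAT100(x) REPEAT10(REPEAT10(x))
#define REPEAT1000(x) REPEAT10(REPEAT100(x))

// Assumption: stand-in for a GPIO input register, bit 3 set.
volatile uint32_t fake_idr = 0x8u;
// volatile, so the compiler cannot discard the repeated reads.
volatile uint32_t sink = 0;

// Average time per call, given the total time and the number of calls.
double per_call_ns(double total_ns, uint32_t calls) {
    assert(calls > 0);
    return total_ns / calls;
}

// Time 1000 unrolled reads and return the average per read in nanoseconds.
double average_ns_per_read() {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    REPEAT1000(sink = fake_idr & 0x8u;)
    const auto stop = clock::now();
    const double total_ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return per_call_ns(total_ns, 1000);
}
```

On the board, the statement inside `REPEAT1000(...)` would be the digitalReadFast() call under test, and the two timestamps would come from whichever timer you trust. The result still includes the cost of storing to `sink`, so it is an upper bound rather than the exact cost of the read.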

As to interrupt reaction time, I got about 2.3us on F103. (2.4us delay between input signal and output changed by ISR)
ISR:

Code:

void ISR()
{
  // Copy the state of PA3 (bit 3 of the port A input register) to PA6
  digitalWriteFast(digitalPinToPinName(PA6), GPIOA->IDR & B1000);
}

Post by ozcar »

GonzoG wrote (Thu Mar 11, 2021 7:37 pm): "On F103 (blue pill) I've timed it at about 0.055us (18M times / s)."
I tried using DWT->CYCCNT instead of micros(). That tells me that on the 72MHz F303, digitalReadFast(digitalPinToPinName(PB10)) takes around 215ns, while digitalReadFast(PB_10) was only 35ns (compiled with -O3). To misquote somebody famous, "35ns ought to be fast enough for anybody".

Edit: I've got a feeling I must have messed something up. Sounds too good (fast) to be true.

Post by ozcar »

ozcar wrote (Fri Mar 12, 2021 1:02 am): "Edit: I've got a feeling I must have messed something up. Sounds too good (fast) to be true."
I had a chance to look at this again. I don't know exactly how, but I think the compiler was somehow thwarting my attempt to measure the timing. So, that was indeed too good to be true. Now I measured the time a different way - similar to what was done in another thread here. Still using the 72MHz F303.

With this in setup():

Code:

attachInterrupt(PB10, receiver_ch1, CHANGE);             //Connect changing PB10 to routine receiver_ch1

pinMode(PA0,OUTPUT);                                     // set up two output lines for measuring time
digitalWrite(PA0, LOW);
pinMode(PA1,OUTPUT);
digitalWrite(PA1, LOW);

// use timer pwm to generate test signal - pwm output on PB11 tied to PB10 input to generate the interrupt.
// basically copied from pwm example. 
TIM_TypeDef *Instance = (TIM_TypeDef *)pinmap_peripheral(PB_11, PinMap_PWM);
uint32_t channel = STM_PIN_CHANNEL(pinmap_function(PB_11, PinMap_PWM));

HardwareTimer *MyTim = new HardwareTimer(Instance);

MyTim->setPWM(channel, pinNametoDigitalPin(PB_11), 5, 10); // 5 Hz, 10% duty cycle

I put this in loop():

Code:

 while (1) 
 {
   GPIOA->BSRR = GPIO_BSRR_BS_1;
   GPIOA->BSRR = GPIO_BSRR_BR_1;
 }

That generates pulses around 97ns apart on PA1. By observing those, I can see how long the mainline code goes out-to-lunch when the interrupt occurs (absence of pulses during that time). With a completely empty interrupt routine, the pulses disappear for around 5.7μs.

I then added code to the interrupt routine to flip another GPIO:

Code:

void receiver_ch1()
{
  GPIOA->BSRR = GPIO_BSRR_BS_0;                                // Indicate start of interrupt processing
  // interrupt processing to occur here
  GPIOA->BSRR = GPIO_BSRR_BR_0;                                // Indicate end of interrupt processing
}

That produced a pulse of less than 20ns in the middle of the out-to-lunch time (which was still close to 5.7μs).

Then, with Leonardo's logic in the interrupt routine, but using normal digitalRead():

Code:

void receiver_ch1()
{
  GPIOA->BSRR = GPIO_BSRR_BS_0;                                      // Indicate start of interrupt processing
  uint32_t now = micros();                                           //Store the current micros() value
  if ( digitalRead(PB10) ) receiver_input1_previous = now;    //If input PB10 is high start measuring the time
  else receiver_input1 = now - receiver_input1_previous;             //If input PB10 is low calculate the total pulse time
  GPIOA->BSRR = GPIO_BSRR_BR_0;                                      // Indicate end of interrupt processing
}

Result: PA0 pulse 2.08μs (time spent in receiver_ch1()), out-to-lunch 7.4μs (total interrupt time).

Then using:

Code:

if ( digitalReadFast(digitalPinToPinName(PB10)) ) receiver_input1_previous = now;

Result: PA0 pulse 1.92μs, out-to-lunch 7.3μs.

Then using:

Code:

if ( digitalReadFast(PB_10) ) receiver_input1_previous = now;

Result: PA0 pulse 1.68μs, out-to-lunch 7.1μs.

Then going full circle, and effectively doing what Leonardo was doing in the first place:

Code:

if ( GPIOB->IDR & GPIO_IDR_10 ) receiver_input1_previous = now;

Result: PA0 pulse 1.36μs, out-to-lunch 6.7μs.

Finally, using DWT->CYCCNT instead of micros():

Code:

void receiver_ch1()
{
  GPIOA->BSRR = GPIO_BSRR_BS_0;                                      // Indicate start of interrupt processing
  uint32_t now = DWT->CYCCNT;                                         // use cycle count instead of micros()
  if ( GPIOB->IDR & GPIO_IDR_10 ) receiver_input1_previous = now;    //If input PB10 is high start measuring the time
  else receiver_input1 = now - receiver_input1_previous;             //If input PB10 is low calculate the total pulse time
  GPIOA->BSRR = GPIO_BSRR_BR_0;                                    // Indicate end of interrupt processing
}

Of course, the cycle count would have to be translated to something useful, but that does not have to be done in the interrupt routine.
Result: PA0 pulse 240ns, out-to-lunch 5.9μs.
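
As a sketch of that translation step, the helpers below convert a raw cycle count into nanoseconds outside the interrupt routine. The function names `elapsed_cycles` and `cycles_to_ns` are hypothetical, and the 72 MHz core clock is taken from the F303 setup described in this post; pass the actual core clock of your board.

```cpp
#include <cassert>
#include <cstdint>

// Cycles between two readings of a free-running 32-bit counter.
// Unsigned subtraction is modulo 2^32, so one counter wraparound
// between the two readings still gives the right answer.
inline uint32_t elapsed_cycles(uint32_t start, uint32_t end) {
    return end - start;
}

// Convert a cycle count to nanoseconds for the given core clock in Hz.
// The multiplication is done in 64 bits to avoid overflow.
inline uint64_t cycles_to_ns(uint32_t cycles, uint32_t cpu_hz) {
    assert(cpu_hz != 0);
    return (static_cast<uint64_t>(cycles) * 1000000000ULL) / cpu_hz;
}
```

For example, `cycles_to_ns(receiver_input1, 72000000u)` could be called from loop() whenever the pulse width is needed. At 72 MHz the 32-bit counter wraps roughly every 59.6 seconds, so this only works for intervals shorter than that.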

This is what that looks like (top trace is PB10/PB11 causing the interrupt, middle trace is PA0 showing the time in receiver_ch1(), and the bottom trace is PA1 showing the "out-to-lunch" gap in activity):

[Attached image: f303time.jpg]

Given that only around 4% of the interrupt time is now spent in receiver_ch1(), any further improvement would probably require attacking or avoiding the STM32duino and/or HAL code.
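
The 4% figure can be checked from the numbers reported in this post. The sketch below tabulates the five measurements and computes, for each, the share of the total interrupt time spent inside receiver_ch1() and the remainder spent outside it. The struct and function names are illustrative only; the values are the ones quoted above.

```cpp
#include <cassert>
#include <cstdio>

// One row per variant measured above (times in microseconds).
struct Measurement {
    const char* variant;
    double handler_us;  // PA0 pulse: time inside receiver_ch1()
    double total_us;    // "out-to-lunch" gap on PA1: total interrupt time
};

const Measurement kMeasurements[] = {
    {"digitalRead(PB10)",                         2.08, 7.4},
    {"digitalReadFast(digitalPinToPinName(PB10))", 1.92, 7.3},
    {"digitalReadFast(PB_10)",                    1.68, 7.1},
    {"GPIOB->IDR with micros()",                  1.36, 6.7},
    {"GPIOB->IDR with DWT->CYCCNT",               0.24, 5.9},
};

// Fraction of the total interrupt time spent in the handler itself.
double handler_share(const Measurement& m) {
    assert(m.total_us > 0.0);
    return m.handler_us / m.total_us;
}

// Time spent outside the handler (interrupt entry, dispatch, exit).
double overhead_us(const Measurement& m) {
    return m.total_us - m.handler_us;
}

void print_table() {
    for (const Measurement& m : kMeasurements) {
        std::printf("%-45s share %5.1f%%  overhead %.2f us\n",
                    m.variant, 100.0 * handler_share(m), overhead_us(m));
    }
}
```

The remainder comes out between roughly 5.3 and 5.7μs in all five cases, close to the 5.7μs measured with an empty interrupt routine, which is consistent with the conclusion that the remaining cost sits outside receiver_ch1().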

Post by ag123 »

One way is to explore DMA driven by a timer; that works.
Timer-driven DMA is used in the logic analyzer project here:
viewtopic.php?f=10&t=116
For faster speeds, SPI may be as fast as it gets, but it is limited to two single-bit channels (MISO, MOSI).