Painfully slow SPI on STM32F405

tyguy2
Posts: 5
Joined: Thu Jul 30, 2020 7:11 pm

Painfully slow SPI on STM32F405

Post by tyguy2 »

Hello all!

I've been working on a project based on the Adafruit STM32F405 Feather Express. My main goal is to control an 8-channel DAC, with each channel independently putting out a sawtooth wave of up to 4 kHz (e.g. channel 1 outputs a 255 Hz sawtooth, channel 2 outputs a 4 kHz sawtooth, and so on). On paper this should work, as the DAC I've selected, the LTC2636, can handle SPI bus speeds up to 50 MHz. Even if the SPI bus runs at a slower speed, I should have ample headroom to do this. The STM32F405 runs at 168 MHz, and even if the SPI bus ran at a fraction of that speed, it would still be well within 50 MHz.
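The headroom estimate above can be checked with a little arithmetic. The sketch below is only a rough budget: it assumes one 24-bit SPI write per output step, and the number of steps per sawtooth period is an assumption (1024 would be a full 10-bit ramp). The function names are made up for illustration, and CS toggling and software overhead are ignored.

```cpp
#include <cassert>
#include <cstdint>

// Bits per second needed to feed every channel, assuming one SPI write per
// output step. stepsPerPeriod is an assumption, not a measured figure.
inline std::int64_t requiredBitsPerSecond(int channels, int waveHz,
                                          int stepsPerPeriod, int bitsPerWrite) {
  assert(channels > 0 && waveHz > 0 && stepsPerPeriod > 0 && bitsPerWrite > 0);
  return static_cast<std::int64_t>(channels) * waveHz * stepsPerPeriod * bitsPerWrite;
}

// Largest whole number of steps per period that a given raw bit rate can
// sustain when all channels run at waveHz.
inline std::int64_t affordableStepsPerPeriod(std::int64_t busBitsPerSecond,
                                             int channels, int waveHz,
                                             int bitsPerWrite) {
  assert(channels > 0 && waveHz > 0 && bitsPerWrite > 0);
  return busBitsPerSecond /
         (static_cast<std::int64_t>(channels) * waveHz * bitsPerWrite);
}
```

With 8 channels at 4 kHz and 24 bits per write, `requiredBitsPerSecond(8, 4000, 1024, 24)` comes to 786,432,000 bit/s, and `affordableStepsPerPeriod(42000000, 8, 4000, 24)` comes to 54 steps, so the worst case only fits if each ramp uses far fewer than 1024 steps.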

Based on this line of reasoning, I built up a PCB based on the Adafruit STM32F405 schematics (seen here), with the only alteration being that the SPI bus going to the SPI flash instead goes to the LTC2636. After a few edits to the Feather's variant.h files for Arduino (changing the default SPI bus the SPI library uses to SPI1 instead of SPI2), I had the SPI bus communicating with the DAC. To my surprise, however, the communication speed was ***painfully*** slow. Running the code I have below, I was only able to muster 1.69 kHz on 1 channel. At first I thought the digitalWrite function was slowing me down, but even with direct port manipulation (see the other code below; I replaced digitalWrite with the sections labelled HIGH and LOW), I was still bottlenecked, with no improvement to my transfer speed. Based on readings from my (unfortunately analog) oscilloscope, as well as some tinkering with the clock speed in the code, I *think* the bus is being limited to 12 MHz, but it could be much less; it's hard to tell with the instruments I currently have. Based on my reading of the datasheet for the STM32F405, SPI1 is limited to 42 Mbit/s, which would mean a clock speed well over 12 MHz.

I don't know why this is happening, does anyone know how I can increase the transfer speed of the SPI bus? I would seriously appreciate it!

Code: Select all

#include <SPI.h>

#define CS 7

int i = 0;

void setup() {
  pinMode(CS, OUTPUT);
  digitalWrite(CS, HIGH); 
  SPI.begin();
}

void loop() {
  i++;

  if (i > 1023){
    i = 0;
  }
  
  dac8Write(1,i);

}

void dac8Write(byte channel, int input) {
  //Warning: Channel must be 7 or less!
  
  //0011 Command - Write to and Update DAC
  //0xxx Address - Binary address of DAC's 1-8
  //xxxxxxxxxx   - 10 bit analog value
  
  channel = 0x30 | channel;
  input = input << 6;
  
  // Send as command, address, then 10 bit value

  SPI.beginTransaction(SPISettings(50000000, MSBFIRST, SPI_MODE0));
  digitalWrite(CS, LOW);
  
  SPI.transfer(channel);
  SPI.transfer(highByte(input));
  SPI.transfer(lowByte(input));
  digitalWrite(CS, HIGH);
    
  SPI.endTransaction();
}

Code: Select all

void setup() {
  // put your setup code here, to run once:
  pinMode(PA15, OUTPUT);
  //PA15 -> Arduino Pin 7
}

void loop() {
  //(HIGH)
  GPIOA->BSRR = 0b1000000000000000;
  //(LOW)
  GPIOA->BSRR = 0b1000000000000000 << 16;

}
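The three bytes that dac8Write sends can be built and checked on a PC before they ever reach the bus. This is a minimal sketch of the framing described in the code comments above (0011 command, 0xxx address, 10-bit value shifted left by 6). `DacFrame` and `packDacFrame` are hypothetical names, and numbering the channels 0 to 7 is an assumption based on the 3-bit address field.

```cpp
#include <cassert>
#include <cstdint>

// The 24-bit frame as three bytes, in the order they go out on the bus.
struct DacFrame {
  std::uint8_t bytes[3];
};

// Pack command 0011 ("write to and update"), a 3-bit address and a 10-bit
// value that is left-justified in the 16 data bits.
inline DacFrame packDacFrame(std::uint8_t channel, std::uint16_t value) {
  assert(channel <= 7);   // assumed numbering: 0..7
  assert(value <= 1023);  // 10-bit DAC code
  const std::uint16_t shifted = static_cast<std::uint16_t>(value << 6);
  DacFrame frame;
  frame.bytes[0] = static_cast<std::uint8_t>(0x30 | channel);
  frame.bytes[1] = static_cast<std::uint8_t>(shifted >> 8);
  frame.bytes[2] = static_cast<std::uint8_t>(shifted & 0xFF);
  return frame;
}
```

For example, channel 1 at full scale (1023) packs to 0x31, 0xFF, 0xC0.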
Bakisha
Posts: 140
Joined: Fri Dec 20, 2019 6:50 pm
Answers: 5

Re: Painfully slow SPI on STM32F405

Post by Bakisha »

I once measured the speed of the SPI.transfer() function, and it has around 3 µs of overhead between bytes (at whatever bus speed).

I ended up using a macro that I found somewhere on the internet, in case it is helpful for you:

Code: Select all

void  SPI_TRANSFER (uint8_t x)  { // 0.3uS overhead between bytes when optimized // 3.25 uS unoptimized
 //SPI.transfer(x);  // send byte
  SPI1->DR = (x) ;
  while (!(SPI1->SR & SPI_SR_TXE) );  // wait until send buffer is empty
}
tyguy2
Posts: 5
Joined: Thu Jul 30, 2020 7:11 pm

Re: Painfully slow SPI on STM32F405

Post by tyguy2 »

So would this act as a direct replacement for SPI.transfer? I'm guessing I would still need to drive the Chip Select pin using either port manipulation or digitalWrite, correct? What about the SPISettings I used previously? Would my code look like this now:

Code: Select all

#include <SPI.h>

#define CS 7

int i = 0;

void setup() {
  pinMode(CS, OUTPUT);
  digitalWrite(CS, HIGH); 
  SPI.begin();
}

void loop() {
  i++;

  if (i > 1023){
    i = 0;
  }
  
  dac8Write(1,i);

}


void dac8Write(byte channel, int input) {
  //Warning: Channel must be 7 or less!
  
  //0011 Command - Write to and Update DAC
  //0xxx Address - Binary address of DAC's 1-8
  //xxxxxxxxxx   - 10 bit analog value
  
  channel = 0x30 | channel;
  input = input << 6;
  
  // Send as command, address, then 10 bit value

  SPI.beginTransaction(SPISettings(50000000, MSBFIRST, SPI_MODE0));
  digitalWrite(CS, LOW);
  
  SPI_TRANSFER(channel);
  SPI_TRANSFER(highByte(input));
  SPI_TRANSFER(lowByte(input));
  digitalWrite(CS, HIGH);
    
  SPI.endTransaction();
}

void  SPI_TRANSFER (uint8_t x)  { // 0.3uS overhead between bytes when optimized // 3.25 uS unoptimized
  //SPI.transfer(x);  // send byte
  SPI1->DR = (x) ;
  while (!(SPI1->SR & SPI_SR_TXE) );  // wait until send buffer is empty
}
EDIT: Tried the above code, didn't work. What am I doing wrong?
Bakisha
Posts: 140
Joined: Fri Dec 20, 2019 6:50 pm
Answers: 5

Re: Painfully slow SPI on STM32F405

Post by Bakisha »

tyguy2 wrote: Fri Jul 31, 2020 5:36 am
EDIT: Tried the above code, didn't work. What am I doing wrong?
Well, I can't say what is wrong; that code worked for me. The only difference is that I called SPI.beginTransaction after SPI.begin in setup and never used SPI.endTransaction. I use it to drive some 595 shift registers with an STM32F103C8. It was the only device on the SPI bus, and yes, I was driving the CE signal manually.
Maybe try a lower SPI speed?
Does your DAC return any data? I think there is similar code for reading SPI data on the Internet...
tyguy2
Posts: 5
Joined: Thu Jul 30, 2020 7:11 pm

Re: Painfully slow SPI on STM32F405

Post by tyguy2 »

I'll do some more tinkering to see if that gets it working, but I think there's a bigger problem to tackle, and that's the bus speed. I did more probing and found that the CS line on the chip is low for 5.27 microseconds. Since 24 bits are sent over the bus in those 5.27 microseconds, each bit takes (5.27*10^-6 seconds)/(24 bits) = 2.196e-7 seconds/bit. Taking the inverse, the bus is operating at around 4.5 MHz. I noticed in the SPI library C files that the default bus speed is 4 MHz, so it sounds like it's not using my specified speed. Any ideas?
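The estimate above can be written out as a small calculation. The figure obtained this way is an effective rate over the whole CS-low window, so it is only a lower bound on the real SCK frequency if there are idle gaps between the bytes. The function names and the gap length used in the example call are hypothetical, not measurements from this thread.

```cpp
#include <cassert>

// Effective bit rate over the whole CS-low window, idle gaps included.
inline double effectiveBitsPerSecond(double csLowSeconds, int bits) {
  assert(csLowSeconds > 0.0 && bits > 0);
  return bits / csLowSeconds;
}

// SCK estimate after subtracting idle gaps between bytes. gapSeconds is a
// guess unless the gaps have been measured on a scope or logic analyzer.
inline double clockHzExcludingGaps(double csLowSeconds, int bits,
                                   int gaps, double gapSeconds) {
  const double shiftingSeconds = csLowSeconds - gaps * gapSeconds;
  assert(shiftingSeconds > 0.0 && bits > 0);
  return bits / shiftingSeconds;
}
```

`effectiveBitsPerSecond(5.27e-6, 24)` gives roughly 4.55 MHz. If, purely as an illustration, two 1 µs gaps were hiding inside the window, `clockHzExcludingGaps(5.27e-6, 24, 2, 1.0e-6)` would put the clock above 7 MHz, which is why comparing a 3-byte and a 6-byte transfer helps separate overhead from clock speed.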
Bakisha
Posts: 140
Joined: Fri Dec 20, 2019 6:50 pm
Answers: 5

Re: Painfully slow SPI on STM32F405

Post by Bakisha »

Try older commands to set settings:

Code: Select all

SPI.begin();
SPI.setBitOrder(MSBFIRST); SPI.setDataMode(SPI_MODE0); SPI.setClockDivider(SPI_CLOCK_DIV2);
Without the SPISettings/beginTransaction stuff...

Maybe you measured the time to transfer 3 bytes plus the overhead time... See if the time doubles when you send 6 bytes.
dannyf
Posts: 447
Joined: Sat Jul 04, 2020 7:46 pm

Re: Painfully slow SPI on STM32F405

Post by dannyf »

once you have the module set up, you only need to load it with the data to be transferred.

two things to speed up the transfer:
1) use interrupt / dma;
2) test the status first to ensure that the module isn't transferring now. this avoids collision.
tyguy2
Posts: 5
Joined: Thu Jul 30, 2020 7:11 pm

Re: Painfully slow SPI on STM32F405

Post by tyguy2 »

Bakisha wrote: Fri Jul 31, 2020 6:50 am
tyguy2 wrote: Fri Jul 31, 2020 5:36 am
EDIT: Tried the above code, didn't work. What am I doing wrong?
Well, I can't say what is wrong; that code worked for me. The only difference is that I called SPI.beginTransaction after SPI.begin in setup and never used SPI.endTransaction. I use it to drive some 595 shift registers with an STM32F103C8. It was the only device on the SPI bus, and yes, I was driving the CE signal manually.
Maybe try a lower SPI speed?
Does your DAC return any data? I think there is similar code for reading SPI data on the Internet...
Looked on the scope, it looks like it gets stuck in the while (!(SPI1->SR & SPI_SR_TXE) ); section of the code, as the CS line never goes high. What could be causing that? I'm guessing the SPI buffer never returns that it's empty?
dannyf wrote: Fri Jul 31, 2020 11:57 am once you have the module set up, you only need to load it with the data to be transferred.

two things to speed up the transfer:
1) use interrupt / dma;
2) test the status first to ensure that the module isn't transferring now. this avoids collision.
How would I use DMA? I'm not the most experienced in lower level coding, unfortunately, and need all the help I can get.
Bakisha
Posts: 140
Joined: Fri Dec 20, 2019 6:50 pm
Answers: 5

Re: Painfully slow SPI on STM32F405

Post by Bakisha »

I finally found the solution. After diving into the CMSIS definitions of the core and testing on a logic analyzer with an STM32F401CC (84 MHz), I found that the error was on my part: in my own code CE did not go high right after the transfer (I had some more code executing first), whereas in this case CE goes high while the hardware is still sending the last byte. So, here is a macro for sending a byte:

Code: Select all

void  SPI_TRANSFER (uint8_t x)  {
  // SPI.transfer(x);  // send byte
  SPI1->DR = (x) ;
  while (!(SPI1->SR & SPI_SR_TXE) || (SPI1->SR & SPI_SR_BSY)) {}  // wait while the transmit buffer is not empty or the busy flag is set
}
But I was correct that the gaps between bytes with the normal SPI.transfer were around 2 µs.
Also, I found that SPI.beginTransaction and SPI.endTransaction alone take 17 µs.
Also, digitalWrite takes around 0.5 µs.
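Those timings can be added up into a rough per-write budget. The sketch below is only an estimate: it assumes the gap applies once per byte, that CS is toggled twice per write, and that the figures measured on the 84 MHz STM32F401 carry over to other boards. `WriteCost` and the function names are hypothetical.

```cpp
#include <cassert>

// Rough cost model for one DAC write; every field is an estimate.
struct WriteCost {
  double transactionUs;  // beginTransaction + endTransaction, 0 if hoisted into setup()
  double csToggleUs;     // one CS edge (digitalWrite or a register write)
  double perByteGapUs;   // software overhead per byte
  double sckHz;          // SPI clock frequency
};

// Estimated time for one write of `bytes` bytes, in microseconds.
inline double writeTimeUs(const WriteCost& cost, int bytes) {
  assert(cost.sckHz > 0.0 && bytes > 0);
  const double shiftUs = bytes * 8 / cost.sckHz * 1e6;
  return cost.transactionUs + 2 * cost.csToggleUs +
         bytes * cost.perByteGapUs + shiftUs;
}

// Estimated number of writes that fit into one second.
inline double writesPerSecond(const WriteCost& cost, int bytes) {
  return 1e6 / writeTimeUs(cost, bytes);
}
```

With 17 µs of transaction overhead, 0.5 µs per CS edge, 2 µs per byte and a 42 MHz clock, a 3-byte write comes out at about 24.6 µs. Dropping the per-write transaction calls and using the 0.3 µs per-byte figure brings that down to about 2.5 µs, so the library overhead matters far more than the bus clock.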

Here is the sketch I used to get maximum speed:

Code: Select all

#include <SPI.h>
#define CS PA4
int i = 0;
void setup() {
  pinMode(CS, OUTPUT);
  digitalWrite(CS, HIGH);
  SPI.begin();
  SPI.beginTransaction(SPISettings(50000000, MSBFIRST, SPI_MODE0));
}
void loop() {
  i++;

  if (i > 1023) {
    i = 0;
  }

  dac8Write(1, i);
}

void dac8Write(byte channel, int input) {
  //Warning: Channel must be 7 or less!
  //0011 Command - Write to and Update DAC
  //0xxx Address - Binary address of DAC's 1-8
  //xxxxxxxxxx   - 10 bit analog value
  channel = 0x30 | channel;
  input = input << 6;
  // Send as command, address, then 10 bit value

  digitalWrite(CS, LOW);
  SPI_TRANSFER(channel);
  SPI_TRANSFER(highByte(input));
  SPI_TRANSFER(lowByte(input));
  digitalWrite(CS, HIGH);

}
void  SPI_TRANSFER (uint8_t x)  { // 0.3uS overhead between bytes when optimized // 3.25 uS unoptimized
  // SPI.transfer(x);  // send byte
  SPI1->DR = (x) ;
  while (!(SPI1->SR & SPI_SR_TXE) || (SPI1->SR & SPI_SR_BSY)) {}  // wait while the transmit buffer is not empty or the busy flag is set
}
And here is a screenshot from the logic analyzer for that sketch (ignore CLK and MOSI; a 24 MHz sample rate is too slow for a 42 MHz SPI clock, but it is enough for the CE timing):
[Image: Clipboard010.jpg, logic analyzer capture]
I hope that is the only device on the SPI1 bus, or things might get complicated.
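The wait loop in SPI_TRANSFER depends on two status flags, and the condition is easy to get subtly wrong. Here is a minimal host-side sketch of the "transfer finished" test. The two constants mirror the TXE (bit 1) and BSY (bit 7) positions of SPI_SR on STM32F4 parts; on the target they would come from the CMSIS header as SPI_SR_TXE and SPI_SR_BSY. `transferFinished` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the CMSIS masks SPI_SR_TXE and SPI_SR_BSY.
constexpr std::uint32_t kTxe = 1u << 1;  // transmit buffer empty
constexpr std::uint32_t kBsy = 1u << 7;  // peripheral still shifting bits

// True only when the transmit buffer is empty AND the shifter is idle,
// which is the point where it is safe to raise CS.
inline bool transferFinished(std::uint32_t sr) {
  return (sr & kTxe) != 0 && (sr & kBsy) == 0;
}
```

Note that a "flag is clear" test has to negate the masked result, as in `!(sr & kTxe)`. Applying logical NOT to the mask itself, `sr & (!kTxe)`, always evaluates to 0 because `!kTxe` is 0, so such a check silently does nothing.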
dannyf
Posts: 447
Joined: Sat Jul 04, 2020 7:46 pm

Re: Painfully slow SPI on STM32F405

Post by dannyf »

some suggestions - based on my quick scan of your code.

1) I would rewrite the loop like this:

Code: Select all

  dac8Write(1, ++i & 0x03ff);
2) check the spi's status first and then load the data. this way, your execution can continue while the spi module continues to transfer data.

I don't have an spi example, but you can see how this approach is implemented here https://github.com/dannyf00/My-MCU-Libr ... F0/uart1.c, in uart1_put:

Code: Select all

//uart1 send a char
void uart1_put(char dat) {
    //while (!(UARTx->SR & USART_SR_TXE));    	//wait for the transmission buffer to be empty
    while (uart1_busy()) continue;    			//wait for the transmission buffer to be empty
    UARTx->TDR = dat;                        	//load the data buffer
    //while (!(UARTx->SR & (1<<6)));    		//wait for the transmission to complete
}
the concept should apply here. for speed, you can convert it to a macro as well. this approach can achieve very fast transmission speed, without resorting to interrupts, on chips with a big transmission buffer (TM4C for example).
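Carried over to SPI, the same check-then-load idea might look like the sketch below. It is an illustration, not code from the linked library: `spiPut`, `spiFlush` and `FakeSpi` are hypothetical names, the flag constants stand in for the CMSIS SPI_SR_TXE and SPI_SR_BSY masks, and the template parameter exists only so the logic can be exercised on a PC with a fake register block.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kTxe = 1u << 1;  // stand-in for SPI_SR_TXE
constexpr std::uint32_t kBsy = 1u << 7;  // stand-in for SPI_SR_BSY

// Fake register block for host-side testing; real hardware would update SR.
struct FakeSpi {
  volatile std::uint32_t SR;
  volatile std::uint32_t DR;
};

// Wait for room in the transmit buffer first, then load the byte and return
// at once, so the CPU can prepare the next byte while this one shifts out.
template <typename Spi>
inline void spiPut(Spi* spi, std::uint8_t x) {
  while (!(spi->SR & kTxe)) {}
  spi->DR = x;
}

// Call once before raising CS: blocks until the last byte has left the wire.
template <typename Spi>
inline void spiFlush(Spi* spi) {
  while (!(spi->SR & kTxe) || (spi->SR & kBsy)) {}
}
```

On the target, the intent is to pass the CMSIS `SPI1` pointer in place of the fake block: three `spiPut` calls followed by one `spiFlush` before CS goes high. The trade-off is that CS handling becomes the caller's job, since `spiPut` returns before the byte has finished shifting out.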