DMA for SPI

Related to the forum.
ag123
Posts: 1881
Joined: Thu Dec 19, 2019 5:30 am
Answers: 30

DMA for SPI

Post by ag123 »

I'm thinking about gradually introducing DMA into SPI, but for now I too have little bandwidth to work on that.

I did some experiments using STM32CubeIDE to generate SPI code for various SoCs. I tried the following
(code generation only, not yet tested on actual devices):
- stm32f103cb (c8 should be quite similar I'd guess)
- stm32f401cc
- stm32g431cb
- stm32h562rg
- stm32h723vg
and even
- stm32g030f6p6

It turns out that SPI with DMA is possible (at least the code is generated) for all of the above SoCs.

However, I ran into big stumbling blocks while examining the code and the various reference manuals.
To do DMA, the following tasks are necessary:

- configure DMA for SPI at initialization
(this is the biggest stumbling block)
the DMA peripheral is different on every series (possibly within a series as well?)
and what is worse:
the DMA channels and/or streams (and possibly other details) bound to each SPI differ, even within a series
e.g. spi1, spi2, spi3 etc. are bound to different channels / streams, DMA resources are limited (i.e. limited channels, streams etc.),
and the chance of a channel/stream conflict is high.
In these terms, every chip (every series and every SPI) would need a different configuration
(i.e. different even on the same chip for different SPIs).
Due to the high risk of DMA usage conflicts, SPI with DMA may not be possible 'everywhere'.

- configure SPI (the usual SPI settings: baud rate, polarity, phase, etc.), but there are at least some differences between the series
some of the SPI peripherals can be driven by clocks other than PCLK, e.g. on the H7 series (this is set at a 'lower level' than SPI itself, in RCC),
which may cause trouble with baud rate calculations.
In addition, DMA has to be enabled in the SPI peripheral's register settings.

- configure interrupts for the DMA half-complete and complete events
Apparently the call
HAL_SPI_TransmitReceive_DMA(&hspi1, TX_Buffer, RX_Buffer, SPI_DMA_BUF_SIZE);
is similar or the same across the series and SoCs, but:
- it uses the DMA complete interrupt (it registers a callback) to handle DMA completion
- this means that it is actually *async*, i.e. it returns before the DMA transfer is complete.
- because it relies on the DMA complete interrupt, the interrupt handlers need to be configured for the *correct DMA channel/stream* and enabled.
these differ between series and between SPI peripherals (i.e. different even on the same chip for different SPIs)
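
To make the *async* point concrete, here is a minimal host-side sketch, not HAL code. `FakeSpiDma` and `transferBlocking` are hypothetical names invented for illustration: `start()` plays the role of `HAL_SPI_TransmitReceive_DMA()` (it returns before any data has moved) and `completeIrq()` plays the role of the DMA transfer-complete interrupt, which in the real HAL ends up in a callback such as `HAL_SPI_TxRxCpltCallback()`. The fake port is wired as a loopback so the effect is visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical host-side stand-in for a DMA-driven SPI port in loopback.
class FakeSpiDma {
public:
    // Queues the transfer and returns at once, like an async DMA start call.
    bool start(const uint8_t *tx, uint8_t *rx, size_t count) {
        if (busy_) {
            return false;              // a transfer is already in flight
        }
        tx_ = tx;
        rx_ = rx;
        count_ = count;
        busy_ = true;
        return true;                   // no data has moved yet
    }

    // Simulated transfer-complete interrupt: data lands, then the flag clears.
    void completeIrq() {
        assert(busy_);
        std::memcpy(rx_, tx_, count_); // loopback: RX mirrors TX
        busy_ = false;
    }

    bool busy() const { return busy_; }

private:
    const uint8_t *tx_ = nullptr;
    uint8_t *rx_ = nullptr;
    size_t count_ = 0;
    volatile bool busy_ = false;       // written from "interrupt" context
};

// Blocking wrapper: start the transfer, then wait for the completion flag.
// On hardware the wait loop would simply spin (or sleep) until the real
// interrupt clears the flag; here `pump` stands in for that interrupt.
template <typename PumpFn>
bool transferBlocking(FakeSpiDma &spi, const uint8_t *tx, uint8_t *rx,
                      size_t count, PumpFn pump) {
    if (!spi.start(tx, rx, count)) {
        return false;
    }
    while (spi.busy()) {
        pump();
    }
    return true;
}
```

The consequence for an Arduino-style `transfer(tx, rx, count)` is that either the wrapper waits on a flag like this, or the caller must not touch the buffers until completion has been signalled.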

I'm thinking that SPI with DMA would be better implemented in its own separate base class,
because the initialization differs so much between series (even within a series) and between SPI peripherals (e.g. spi1, spi2, etc. are different), and the behaviour differs as well (e.g. async).
One way to do that would require the SPI.transfer() functions to be declared virtual in the original base class, in the .h file.

It is also unlikely that DMA can be provided by default (given that 'everything is different', with possible conflicts as above), but it could be implemented as 'examples' using the DMA base class as a template. Individuals could then take that forward by overriding the methods in the DMA base class and doing their own DMA, SPI and interrupt (NVIC) initialization for their own SoC.
ag123
Posts: 1881
Joined: Thu Dec 19, 2019 5:30 am
Answers: 30

Re: DMA for SPI

Post by ag123 »

I tried using GitHub Copilot to propose some code; it proposed something like this.
Note that this is copied verbatim; no (human) checks have been done on whether it is actually correct.

Code: Select all

#include "stm32f4xx_hal.h"

class SPI_DMA {
public:
    SPI_DMA(SPI_HandleTypeDef* spiHandle, DMA_HandleTypeDef* dmaTxHandle, DMA_HandleTypeDef* dmaRxHandle)
        : hspi(spiHandle), hdma_tx(dmaTxHandle), hdma_rx(dmaRxHandle) {}

    void init() {
        // Initialize the SPI peripheral
        initSPI();

        // Initialize the DMA for TX
        initDMA();

        // Initialize the NVIC for DMA
        initNVIC();
    }

    void transmit(uint8_t* data, uint16_t size) {
        HAL_SPI_Transmit_DMA(hspi, data, size);
    }

    void receive(uint8_t* data, uint16_t size) {
        HAL_SPI_Receive_DMA(hspi, data, size);
    }

    void transmitReceive(uint8_t* txData, uint8_t* rxData, uint16_t size) {
        HAL_SPI_TransmitReceive_DMA(hspi, txData, rxData, size);
    }

private:
    SPI_HandleTypeDef* hspi;
    DMA_HandleTypeDef* hdma_tx;
    DMA_HandleTypeDef* hdma_rx;

    void initSPI() {
        hspi->Instance = SPI1;
        hspi->Init.Mode = SPI_MODE_MASTER;
        hspi->Init.Direction = SPI_DIRECTION_2LINES;
        hspi->Init.DataSize = SPI_DATASIZE_8BIT;
        hspi->Init.CLKPolarity = SPI_POLARITY_LOW;
        hspi->Init.CLKPhase = SPI_PHASE_1EDGE;
        hspi->Init.NSS = SPI_NSS_SOFT;
        hspi->Init.BaudRatePrescaler = SPI_BAUDRATEPRESCALER_16;
        hspi->Init.FirstBit = SPI_FIRSTBIT_MSB;
        hspi->Init.TIMode = SPI_TIMODE_DISABLE;
        hspi->Init.CRCCalculation = SPI_CRCCALCULATION_DISABLE;
        hspi->Init.CRCPolynomial = 10;

        if (HAL_SPI_Init(hspi) != HAL_OK) {
            // Initialization Error
            Error_Handler();
        }
    }

    void initDMA() {
        // Initialize DMA for TX
        hdma_tx->Instance = DMA2_Stream3;
        hdma_tx->Init.Channel = DMA_CHANNEL_3;
        hdma_tx->Init.Direction = DMA_MEMORY_TO_PERIPH;
        hdma_tx->Init.PeriphInc = DMA_PINC_DISABLE;
        hdma_tx->Init.MemInc = DMA_MINC_ENABLE;
        hdma_tx->Init.PeriphDataAlignment = DMA_PDATAALIGN_BYTE;
        hdma_tx->Init.MemDataAlignment = DMA_MDATAALIGN_BYTE;
        hdma_tx->Init.Mode = DMA_NORMAL;
        hdma_tx->Init.Priority = DMA_PRIORITY_LOW;
        hdma_tx->Init.FIFOMode = DMA_FIFOMODE_DISABLE;

        if (HAL_DMA_Init(hdma_tx) != HAL_OK) {
            // Initialization Error
            Error_Handler();
        }

        __HAL_LINKDMA(hspi, hdmatx, *hdma_tx);

        // Initialize DMA for RX
        hdma_rx->Instance = DMA2_Stream0;
        hdma_rx->Init.Channel = DMA_CHANNEL_3;
        hdma_rx->Init.Direction = DMA_PERIPH_TO_MEMORY;
        hdma_rx->Init.PeriphInc = DMA_PINC_DISABLE;
        hdma_rx->Init.MemInc = DMA_MINC_ENABLE;
        hdma_rx->Init.PeriphDataAlignment = DMA_PDATAALIGN_BYTE;
        hdma_rx->Init.MemDataAlignment = DMA_MDATAALIGN_BYTE;
        hdma_rx->Init.Mode = DMA_NORMAL;
        hdma_rx->Init.Priority = DMA_PRIORITY_HIGH;
        hdma_rx->Init.FIFOMode = DMA_FIFOMODE_DISABLE;

        if (HAL_DMA_Init(hdma_rx) != HAL_OK) {
            // Initialization Error
            Error_Handler();
        }

        __HAL_LINKDMA(hspi, hdmarx, *hdma_rx);
    }

    void initNVIC() {
        // Configure NVIC for DMA
        HAL_NVIC_SetPriority(DMA2_Stream3_IRQn, 0, 0);
        HAL_NVIC_EnableIRQ(DMA2_Stream3_IRQn);

        HAL_NVIC_SetPriority(DMA2_Stream0_IRQn, 0, 0);
        HAL_NVIC_EnableIRQ(DMA2_Stream0_IRQn);
    }

    void Error_Handler() {
        // User can add their own implementation to report the HAL error return state
        while(1) {
        }
    }
};

// Example usage
SPI_HandleTypeDef hspi1;
DMA_HandleTypeDef hdma_spi1_tx;
DMA_HandleTypeDef hdma_spi1_rx;

SPI_DMA spi1(&hspi1, &hdma_spi1_tx, &hdma_spi1_rx);

void SystemClock_Config();

int main() {
    // Initialize the HAL Library
    HAL_Init();

    // Configure the system clock
    SystemClock_Config();

    // Initialize SPI, DMA, and NVIC
    spi1.init();

    // Transmit and receive data
    uint8_t txData[] = {0x01, 0x02, 0x03};
    uint8_t rxData[3];
    spi1.transmitReceive(txData, rxData, sizeof(txData));

    while (1) {
        // Main loop
    }
}
fpiSTM
Posts: 1919
Joined: Wed Dec 11, 2019 7:11 pm
Answers: 107
Location: Le Mans

Re: DMA for SPI

Post by fpiSTM »

Hi @ag123

I've already thought about DMA support. The first thing I'd like to provide is a way to retrieve DMA information per series/MCU thanks to the CubeMX database,
which is already used to generate several files.
https://github.com/STMicroelectronics/S ... 54C1-L55C1

The DMA-STM32F103G_dma_v1_0_Modes.xml file, for example, is not available in this repo but is available in the CubeMX database installed with STM32CubeMX.

There you can find information such as:

Code: Select all

<Mode Name="DMA1_Channel5">
    <ModeLogicOperator Name="XOR">
        <Mode Name="MEMTOMEM"/>
        <Mode Name="SPI2_TX">
            <Condition Diagnostic="" Expression="SPI2_DmaTransmit|I2S2_TX"/>
        </Mode>
        <Mode Name="USART1_RX">
            <Condition Diagnostic="" Expression="S_USART1_TX_RX|S_USART1_RX"/>
        </Mode>
        <Mode Name="I2C2_RX">
            <Condition Diagnostic="" Expression="I2C2_Dma"/>
        </Mode>
        <Mode Name="TIM1_UP"/>
        <Mode Name="TIM2_CH1">
            <Condition Diagnostic="" Expression="Semaphore_CaptureCompare_1_DMA_EnableTIM2"/>
        </Mode>
        <Mode Name="TIM4_CH3">
            <Condition Diagnostic="" Expression="Semaphore_CaptureCompare_3_DMA_EnableTIM4"/>
        </Mode>
    </ModeLogicOperator>
</Mode>
And here is a first attempt made several years ago:
https://github.com/stm32duino/Arduino_C ... 2/pull/825
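
As a rough illustration of what such per-MCU DMA data could be used for, the sketch below turns the XML excerpt into a small lookup table and checks whether two requests compete for the same channel. The struct and function names are hypothetical. The `DMA1_Channel5` rows come from the excerpt; the `DMA1_Channel4` row for `SPI2_RX` is an assumption that should be checked against the F103 reference manual or the same XML file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// One row per DMA request: which peripheral request is wired to which channel.
struct DmaRequest {
    const char *request;  // e.g. "SPI2_TX"
    const char *channel;  // e.g. "DMA1_Channel5"
};

// DMA1_Channel5 rows are taken from the XML excerpt.
// The DMA1_Channel4 row is an assumption, to be verified.
constexpr DmaRequest kF103DmaMap[] = {
    {"SPI2_TX",   "DMA1_Channel5"},
    {"USART1_RX", "DMA1_Channel5"},
    {"I2C2_RX",   "DMA1_Channel5"},
    {"TIM1_UP",   "DMA1_Channel5"},
    {"TIM2_CH1",  "DMA1_Channel5"},
    {"TIM4_CH3",  "DMA1_Channel5"},
    {"SPI2_RX",   "DMA1_Channel4"},
};

// Returns the channel name for a request, or nullptr if it is not in the table.
const char *channelFor(const char *request) {
    for (const DmaRequest &row : kF103DmaMap) {
        if (std::strcmp(row.request, request) == 0) {
            return row.channel;
        }
    }
    return nullptr;
}

// Two requests conflict when both are known and map to the same channel.
bool conflicts(const char *a, const char *b) {
    const char *ca = channelFor(a);
    const char *cb = channelFor(b);
    return ca != nullptr && cb != nullptr && std::strcmp(ca, cb) == 0;
}
```

With a table like this generated per MCU, `conflicts("SPI2_TX", "USART1_RX")` would flag that SPI2 transmit and USART1 receive cannot both use DMA at the same time on this part.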
ag123
Posts: 1881
Joined: Thu Dec 19, 2019 5:30 am
Answers: 30

Re: DMA for SPI

Post by ag123 »

hi @fpiSTM, thanks !

I have some initial / pre-alpha, scratch, untested code:
https://github.com/ag88/stm32duino_spi_dma/tree/main

An intro to some of my thoughts:
In order to create derived classes from SPI.h and allow them to be referenced as SPIClass, many of the methods need to be declared virtual.
https://github.com/ag88/stm32duino_spi_ ... /SPI/SPI.h

This is so that code like this would work.

Code: Select all

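// The object itself must be of the derived type. Binding it to a SPIClass
// reference keeps virtual dispatch working; copying it into a SPIClass
// object by value would slice off the SPI_DMA part.
SPI_DMA spi_dma(init_parameters);
SPIClass &spi = spi_dma;

#define BUFSIZE 256
uint8_t rx_buf[BUFSIZE];
uint8_t tx_buf[BUFSIZE];

void setup() {
	spi.begin();
	memset(tx_buf, 0, BUFSIZE);
}

void loop() {
	spi.transfer(tx_buf, rx_buf, BUFSIZE);
	// copy the received data back to the tx buffer
	memcpy(tx_buf, rx_buf, BUFSIZE);
}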
The key is that spi is not a plain SPIClass but a class derived from SPIClass. Only if the methods in the original base/parent class are declared virtual can they be overridden by the derived class SPI_DMA and called as if it were SPIClass.

Using derived classes this way would consume flash and memory, but it would make the code very modular, specific to each permutation of series, SoC, peripheral, pins, or even board, rather than having a lot of ifdefs or if-else.
'Saving memory and flash' could be looked into later for cramped SoCs with little RAM/flash that still want to play in the big league.
(I think for those we may resort to templates or a simpler class just for them)
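
One C++ detail matters for this approach: virtual dispatch only happens through a pointer or reference to the base class. Copying a derived object into a base-class variable by value "slices" it back to the base. The sketch below shows both cases with hypothetical, stripped-down stand-ins (`SpiBase`, `SpiDma`) rather than the real SPIClass.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Which implementation handled the call.
enum class Path { Polled, Dma };

// Hypothetical, stripped-down stand-in for SPIClass.
class SpiBase {
public:
    virtual ~SpiBase() = default;
    virtual Path transfer(const void *, void *, size_t) { return Path::Polled; }
};

// Hypothetical stand-in for a DMA-backed subclass.
class SpiDma : public SpiBase {
public:
    Path transfer(const void *, void *, size_t) override { return Path::Dma; }
};

// Library code written against the base class only.
Path libraryTransfer(SpiBase &spi) {
    uint8_t tx[4] = {0};
    uint8_t rx[4] = {0};
    return spi.transfer(tx, rx, sizeof(tx));
}
```

Passing a `SpiDma` object (or a `SpiBase &` bound to one) to `libraryTransfer()` reaches the DMA override, whereas `SpiBase copy = SpiDma();` produces a plain `SpiBase` and the override is never called.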
---
I've to a large extent refactored all the SPIClass code so that it practically lives in the SPI_DMA class.
https://github.com/ag88/stm32duino_spi_ ... I/SPIDMA.h
https://github.com/ag88/stm32duino_spi_ ... SPIDMA.cpp

I introduced a new method, init(), which is intended to be called by begin().
init() in turn calls
virtual void initSPI();
virtual void initDMA();
virtual void initNVIC();
virtual void initPINS();

In fact, the SPI_DMA class is intended to hold the most common code, possibly with all functions/methods virtual, so that the implementing derived classes can override them for every series, specific SoC (or SoC group), every peripheral (e.g. spi1, spi2, spi3, etc.) or even board.
The SPI_DMA class may possibly be empty, i.e. all virtual functions, unless there are common things which make sense to put in the base SPI_DMA class so that derived classes can call them.

The idea is that initSPI(), initDMA(), initNVIC() and initPINS() can each be different for every difference in DMA hardware (streams, channels etc.), SPI hardware, pins etc., and derived classes implement them.
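
The begin() -> init() -> hooks structure described above can be sketched on a host machine as follows. All class names are hypothetical and the hooks only record what they would do; which hooks are pure virtual, and the call order, are assumptions taken from the list above and may need to change on real hardware.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the SPI_DMA base class: begin() drives a fixed
// sequence of virtual hooks, and each hook records what it did.
class SpiDmaBase {
public:
    virtual ~SpiDmaBase() = default;

    void begin() { init(); }
    const std::vector<std::string> &steps() const { return steps_; }

protected:
    // Order follows the list in the post; real hardware may need another order.
    void init() {
        initSPI();
        initDMA();
        initNVIC();
        initPINS();
    }

    virtual void initSPI() { steps_.push_back("SPI: common"); }
    virtual void initDMA() = 0;   // differs per series and per port
    virtual void initNVIC() = 0;  // IRQ numbers follow the DMA choice
    virtual void initPINS() { steps_.push_back("PINS: common"); }

    std::vector<std::string> steps_;
};

// Per-series class supplying the series-specific hooks.
class SpiDmaF4 : public SpiDmaBase {
protected:
    void initDMA() override { steps_.push_back("DMA: F4 default"); }
    void initNVIC() override { steps_.push_back("NVIC: F4 default"); }
};

// A user class that replaces only the DMA setup, e.g. to dodge a conflict.
class MySpiDma : public SpiDmaF4 {
protected:
    void initDMA() override { steps_.push_back("DMA: user override"); }
};
```

Calling `begin()` on a `MySpiDma` runs the common SPI and pin setup, the F4 NVIC setup, and the user's own DMA setup, without touching the other classes.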
---
even
virtual uint32_t getClkFreq(spi_t *obj);
needs to be a virtual function so that derived classes can override it and provide their own implementation.
This is because on stm32h7xx the SPI clock can be driven from sources other than PCLK (peripheral clock). I ran into this 'headache' while testing an H7 board, where I 'discovered' that the complicated clock configuration creates new scenarios that cannot be pre-planned, i.e. on the H7 the system clock can run independently of the peripherals' clocks!
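
To show why the clock source matters for the baud rate, here is a minimal sketch of a frequency -> divider calculation, assuming the SPI prescaler offers power-of-two dividers from 2 to 256. The class names and the clock values (84 MHz and 200 MHz) are hypothetical and chosen only for illustration; the virtual `getClkFreq()` here takes no argument, unlike the real declaration.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power-of-two divider (2..256) for which clockHz / divider does not
// exceed requestedHz. Returns 256 if even the slowest setting is too fast.
uint32_t pickSpiDivider(uint32_t clockHz, uint32_t requestedHz) {
    assert(requestedHz > 0);
    uint32_t divider = 2;
    while (divider < 256 && clockHz / divider > requestedHz) {
        divider *= 2;
    }
    return divider;
}

// Hypothetical port interface: each series reports the clock that really
// feeds its SPI peripheral.
class SpiPort {
public:
    virtual ~SpiPort() = default;
    virtual uint32_t getClkFreq() const = 0;
    uint32_t dividerFor(uint32_t requestedHz) const {
        return pickSpiDivider(getClkFreq(), requestedHz);
    }
};

// SPI fed from the APB peripheral clock (value is an example only).
class PclkSpi : public SpiPort {
public:
    uint32_t getClkFreq() const override { return 84000000u; }
};

// SPI fed from a separate kernel clock, as is possible on the H7
// (value is an example only).
class KernelClockSpi : public SpiPort {
public:
    uint32_t getClkFreq() const override { return 200000000u; }
};
```

For the same 10 MHz request, the two ports pick different dividers (16 versus 32 with these example clocks), which is why a single hard-coded PCLK assumption in the base class gives the wrong baud rate on some parts.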

---
There are many things that I have no idea how to implement, e.g. the differences in DMA hardware between series (channels and streams), the different channels/streams for each SPI, possible conflicts etc. The differences multiply further with hardware differences in SPI, the SPI port (e.g. SPI1, SPI2 etc. have different symbols in HAL) and different pin maps for each SPI. Hence the thinking about using derived classes to separate them. This also partly alleviates the ifdefs and if-else.
fpiSTM
Posts: 1919
Joined: Wed Dec 11, 2019 7:11 pm
Answers: 107
Location: Le Mans

Re: DMA for SPI

Post by fpiSTM »

ag123 wrote: Fri Jan 10, 2025 12:09 pm hi @fpiSTM, thanks !
welcome
ag123 wrote: Fri Jan 10, 2025 12:09 pm The key is that spi is not a plain SPIClass but a class derived from SPIClass. Only if the methods in the original base/parent class are declared virtual can they be overridden by the derived class SPI_DMA and called as if it were SPIClass.

Using derived classes this way would consume flash and memory, but it would make the code very modular, specific to each permutation of series, SoC, peripheral, pins, or even board, rather than having a lot of ifdefs or if-else.
'Saving memory and flash' could be looked into later for cramped SoCs with little RAM/flash that still want to play in the big league.
(I think for those we may resort to templates or a simpler class just for them)
Well, I guess the move to ArduinoCore-API will resolve this issue:
https://github.com/arduino/ArduinoCore- ... dwareSPI.h

I've already started (as a background task) to integrate it, but it requires some more work and also some modifications that I hope to be able to integrate into the official repo.
ag123
Posts: 1881
Joined: Thu Dec 19, 2019 5:30 am
Answers: 30

Re: DMA for SPI

Post by ag123 »

@fpiSTM thanks, do take a look at my implementation, which is currently incomplete as well.

One of the things I thought about is to separate out the initialization of the specific pieces (SPI, DMA, NVIC, pins)
and the SPI frequency -> divider calculation, and make them virtual as well so that those implementations can be overridden.
https://github.com/ag88/stm32duino_spi_ ... I/SPIDMA.h
https://github.com/ag88/stm32duino_spi_ ... SPIDMA.cpp
e.g. begin() calls -> init() calls -> initSPI(), initDMA(), initNVIC(), initPINS()

You can use my code, no issues. I'll not do any PR for now as I'm also 'playing' with my code,
and separate edits may instead make more work for you, as we could be editing the same code segments.

that ArduinoCore-API
https://github.com/arduino/ArduinoCore- ... dwareSPI.h
is still missing
https://github.com/stm32duino/Arduino_C ... C1-L143C67

Code: Select all

    /* Expand SPI API
     * https://github.com/arduino/ArduinoCore-API/discussions/189
     */
    void transfer(const void *tx_buf, void *rx_buf, size_t count);
they need to add it
https://github.com/arduino/ArduinoCore-API/issues/243
ag123
Posts: 1881
Joined: Thu Dec 19, 2019 5:30 am
Answers: 30

Re: DMA for SPI

Post by ag123 »

I'd like to stress that what I'm doing are really experiments; this is all untested, unproven code.

I did a round of refactoring of the SPI_DMA class, so now it looks like
https://github.com/ag88/stm32duino_spi_ ... I/SPIDMA.h
https://github.com/ag88/stm32duino_spi_ ... SPIDMA.cpp
https://github.com/ag88/stm32duino_spi_ ... DMA_F4XX.h
https://github.com/ag88/stm32duino_spi_ ... A_F4XX.cpp

SPIClass (this is the original SPIClass, but many methods need to be virtual and variables moved to protected so that derived classes can access them)
^
SPI_DMA (I try to implement the common SPI code here, but there is a catch: some of the functions are pure virtual functions, with no implementation)
^
SPI_DMAF4 (this is intended to be the SPI implementation for F4xx, and in the default implementation it is SPI1)
^
SPI_DMAF4_SPI2, SPI_DMAF4_SPI3 (these are further derived from SPI_DMAF4, with the configuration (e.g. the DMA channel/stream etc.) stored in the class code for SPI2 and SPI3)

While trying to code this, I noted a problem: the DMA peripherals are all different across the different series.
The generated code from CubeMX/CubeIDE and HAL uses similar-looking structures, but there are different fields/symbols, and some of the same symbols have a different *type*.
This is a 'headache' of sorts: placing the DMA initialization code 'higher' up the inheritance chain, e.g. in SPIClass or SPI_DMA, is bound to hit symbol-not-found or wrong-type errors during the build. What this practically means is that the DMA initialization code needs to live in a C++ class for each *series*, so that the #include or #ifdef, if needed, can go with the series. Otherwise there would be plenty of ifdefs higher up.

In practical terms this means that the calling code would look like

Code: Select all

#include "SPIDMA_F4XX.h" /* !!!! */

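// Keep the objects as their derived types; the SPIClass references avoid
// slicing, so the DMA overrides are the ones that get called.
SPI_DMAF4_SPI1 spi1_dma( ... parameters);
SPI_DMAF4_SPI2 spi2_dma( ... parameters);
SPIClass &spi1 = spi1_dma;
SPIClass &spi2 = spi2_dma;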

void setup() {
	spi1.begin();
	spi2.begin();
}

void loop() {
	uint8_t a = 1, b = 2;
	spi1.transfer(a);
	spi2.transfer(b);
}
This is a 'catch' of sorts: it means the sketch/app needs to define the SPI object for every different series, SoC and peripheral (i.e. spi1, spi2, spi3 etc.)!
Maybe that would need to be macro'ed (#defined) to SPI, in a variant etc.
stevestrong
Posts: 505
Joined: Fri Dec 27, 2019 4:53 pm
Answers: 9
Location: Munich, Germany

Re: DMA for SPI

Post by stevestrong »

Can't you get inspiration from the libmaple implementation?
Composite
Posts: 8
Joined: Fri Feb 16, 2024 11:09 pm

Re: DMA for SPI

Post by Composite »

I have DMA running with SPI, ADC and DAC on F4 / G4 / F7 / H7. CubeMX-generated code works "as is" under the Arduino IDE (official ST 1.8.9).
I only had to write code myself for double buffering with DMA; there were no examples a few years ago.
ag123
Posts: 1881
Joined: Thu Dec 19, 2019 5:30 am
Answers: 30

Re: DMA for SPI

Post by ag123 »

stevestrong wrote: Sun Jan 12, 2025 6:41 pm Can't you get inspiration from the libmaple implementation?
it is pretty similar
https://github.com/ag88/stm32duino_spi_ ... I/SPIDMA.h
https://github.com/ag88/stm32duino_spi_ ... SPIDMA.cpp

It is just that for now the calls go through one more layer (HAL) instead of hitting the registers directly for the single-byte transfers,
and the DMA transfers that deal with buffers are based on HAL.

https://github.com/ag88/stm32duino_spi_ ... A.cpp#L216

Code: Select all

inline void SPI_DMA::transfer_async(const void *tx_buf, void *rx_buf, size_t count) {
	HAL_SPI_DMAResume(&_spi.handle);
	HAL_SPI_TransmitReceive_DMA(&_spi.handle, (uint8_t*) tx_buf, (uint8_t*) rx_buf, count);
}
A reason for doing this is that STM32 has a huge portfolio: many series and many SoCs with different features.
It turns out that some of the 'high level' calls like HAL_SPI_TransmitReceive_DMA are defined the same across the series.
The trouble is that for initialization, it is still necessary to deal directly with hardware.
The structs for DMA initialization are different across the series because the hardware and peripheral bindings are different.
https://github.com/ag88/stm32duino_spi_ ... XX.cpp#L31
https://github.com/ag88/stm32duino_spi_ ... XX.cpp#L46

This is for stm32f4(01/11); I've not yet checked other SoCs in the F4 series.

Beyond that, stm32f103, stm32f4xx, stm32g4xx, stm32h7{23,43}, stm32g030 and stm32h5{62} all have different DMA hardware.
The structs are similar, but some of the fields are different (e.g. some have a FIFO enable field) and some have different types,
e.g. STM32 F4 has both channels and streams while STM32 F1 only has channels, and the channel and instance fields have different types.
This means that without ifdefs, compiling for one or the other will hit either undefined-symbol errors or type-mismatch errors.

The result is that DMA is implemented in the derived per-series class
https://github.com/ag88/stm32duino_spi_ ... DMA_F4XX.h

I think ifdefs can't be avoided, but they can live in <SPI.h>, so that the #include can be selected like

Code: Select all

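#if defined(STM32F4xx)  /* series macro as spelled by the STM32 core */

#include "SPIDMA_F4XX.h"
/* Keep the derived object; expose it through a SPIClass reference to avoid slicing. */
SPI_DMAF4 spi_dma(parameters);
SPIClass &SPI = spi_dma;

#elif defined(STM32F1xx)
...
#endif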
SPIClass would be the common SPI interface, by virtue of C++ inheritance.
The key is that many libs target the Arduino API SPIClass, so they'd likely 'just work'.

The catch is that every series that needs to be supported would need its own "SPIDMA_Fxxyy.h" implementation.

The use of C++ inheritance does reduce the amount of ifdefs and if-else within the code; this leads to more compact code that likely runs faster, e.g. instead of

Code: Select all

if ( spi == SPI1 ) {
// do this
} else if ( spi == SPI2 ) {
// do that
} else ...
SPI1, SPI2 and SPI3 each have their own class; you initialize by choosing the correct class,
e.g. SPI_DMAF4_SPI1
https://github.com/ag88/stm32duino_spi_ ... 3C7-L13C16
https://github.com/ag88/stm32duino_spi_ ... 4C1-L94C40

Code: Select all

typedef class SPI_DMAF4 SPI_DMAF4_SPI1;

// Bind through a reference so the derived class is not sliced.
SPI_DMAF4_SPI1 myspi_dma( ... parameters);
SPIClass &myspi = myspi_dma;
SPI_DMAF4_SPI2
https://github.com/ag88/stm32duino_spi_ ... F4XX.h#L97
SPI_DMAF4_SPI3
https://github.com/ag88/stm32duino_spi_ ... 4XX.h#L120

Initializations specific to SPI1, SPI2 and SPI3 are encoded within the class source code, e.g. which DMA channel and stream and which APB bus they bind to.
In this way the if-else above is avoided, i.e. the code doesn't have to decide at run time whether it is SPI1, 2 or 3 etc.
It would likely cost more memory and flash, though, when multiple SPI peripherals are used concurrently.
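
A related way to fix the per-port binding at compile time, in line with the earlier remark about resorting to templates, is a traits table indexed by SPI number. This is only a sketch of that alternative, not what the repository does, and every name in it is hypothetical. The SPI1 values mirror the DMA2 Stream3 / Stream0, channel 3 setup in the Copilot listing earlier in the thread; the SPI2 values are an assumption and must be checked against the reference manual.

```cpp
#include <cassert>
#include <cstdint>

struct DmaBinding {
    uint8_t controller;  // 1 = DMA1, 2 = DMA2
    uint8_t stream;
    uint8_t channel;
};

// The primary template is declared but not defined, so asking for a port
// without a specialization fails at compile time instead of at run time.
template <int SpiIndex>
struct SpiDmaTraits;

// SPI1: values as used in the Copilot listing earlier in the thread.
template <>
struct SpiDmaTraits<1> {
    static constexpr DmaBinding tx() { return DmaBinding{2, 3, 3}; }
    static constexpr DmaBinding rx() { return DmaBinding{2, 0, 3}; }
};

// SPI2: assumed values, to be verified against the reference manual.
template <>
struct SpiDmaTraits<2> {
    static constexpr DmaBinding tx() { return DmaBinding{1, 4, 0}; }
    static constexpr DmaBinding rx() { return DmaBinding{1, 3, 0}; }
};

// A port class picks up its binding from the traits; no if-else on the port.
template <int SpiIndex>
class SpiDmaPort {
public:
    static constexpr DmaBinding txBinding() { return SpiDmaTraits<SpiIndex>::tx(); }
    static constexpr DmaBinding rxBinding() { return SpiDmaTraits<SpiIndex>::rx(); }
};
```

`SpiDmaPort<1>::txBinding()` resolves entirely at compile time, and `SpiDmaPort<3>` would not compile until someone supplies (and verifies) a `SpiDmaTraits<3>` specialization.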

Using C++ inheritance has an added advantage: you can override the implementation with your own code.
E.g. if initDMA() conflicts with some of the DMA channels / streams that you are using and you want a different config,
you can override initDMA(), copy the code and fix it for your particular use case. The rest of the code, if it works, can be left as is.

---
blurb:
I'd think that if one makes do with ifdefs in <SPI.h>, one may even be able to make a 'unified' <SPI.h> that works across both the 'official' core and the 'libmaple' core, since all the methods are virtual and can be overridden.
https://github.com/ag88/stm32duino_spi_ ... SPIClass.h

for now these are all still experiments