STM32 timeline and peripherals changes
We concentrate on the "general purpose" STM32, omitting 'WB/'WL (wireless).
We don't treat 'L4+ as a separate sub-family here, as it shares most characteristics with 'L4.
The original idea was to fill out the table with various modules' versions.
However, this is a very hard task, and honestly, it should be ST doing it (the link points to a now-lost "Idea" post in the STM32 forum, where several users requested this; like most "Ideas", it has been ignored by ST).
Below the table there are links to gotcha articles which deal with variants of some of the peripherals. Also, a significant UART/I2C/SPI change happened with the introduction of 'F0/'F3.
FLASH variants across families are discussed here. RTC variants across families are discussed here.
GPIO on AHB is faster, but GPIO on APB in 'F1 allows bit-banding.
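As an illustration of what bit-banding buys on 'F1, the following is a minimal sketch of the Cortex-M3 peripheral bit-band address calculation. The macro and function names are made up for this example (they are not CMSIS names). The GPIOA_ODR address 0x4001080C used in the usage note is the 'F1 one and should be checked against the reference manual of the part in use.

```c
#include <assert.h>
#include <stdint.h>

/* Cortex-M3 peripheral bit-band region and its alias region.
   Names are illustrative only, chosen not to clash with CMSIS. */
#define BB_PERIPH_REGION  0x40000000u  /* 1 MB bit-band region */
#define BB_PERIPH_ALIAS   0x42000000u  /* 32 MB alias region   */

/* Returns the address of the alias word that maps to one bit of a
   peripheral register. Valid only for registers inside the 1 MB
   bit-band region, which on 'F1 includes the GPIO on APB2. */
static inline uint32_t bitband_alias(uint32_t reg_addr, unsigned bit)
{
    assert(reg_addr >= BB_PERIPH_REGION &&
           reg_addr <  BB_PERIPH_REGION + 0x100000u);
    assert(bit < 32u);
    /* Each bit of the region occupies one 32-bit word in the alias. */
    return BB_PERIPH_ALIAS
         + (reg_addr - BB_PERIPH_REGION) * 32u
         + bit * 4u;
}
```

On the target, a single store such as `*(volatile uint32_t *)bitband_alias(0x4001080Cu, 5) = 1;` would then set PA5 in GPIOA_ODR without a read-modify-write sequence in software. GPIO on AHB, sitting outside this 1 MB region, cannot be reached this way.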
IOPORT is a dedicated port on the Cortex-M0+ processor, allowing faster access from the processor to GPIO at the cost of preventing DMA from accessing it. Note that although 'U0 is based on Cortex-M0+, its GPIO are on AHB rather than on IOPORT.
In 'H7, GPIO is on a secondary bus matrix, separated from the processor and the DMAs (except BDMA) by two bus matrices and their respective bridges, making "manual toggling" of GPIO surprisingly slow.
The single-port DMA is described in AN2548, dual-port DMA in AN4031.
The DMA column also indicates how requests from peripherals are steered to DMA: the simplest approach is that requests from several peripherals to an individual DMA channel are simply ORed together;
in more advanced models there is a request-selecting multiplexer controlled by bits in DMA registers; and in the newest models there is a standalone DMAMUX unit,
effectively implementing a full matrix between requests from various peripherals and the set of available DMA channels (plus some additional features).
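The three request-steering schemes described above can be contrasted with a small toy model. This is only a sketch for illustration, not register-accurate for any STM32: the type and function names are invented here, request lines are modelled as bits of a 32-bit word, and real DMAMUX extras (such as synchronization and request generators) are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: bit n set means peripheral request line n is asserted. */
typedef uint32_t req_lines_t;

/* Scheme 1: fixed wiring. All lines wired to the channel are ORed,
   so the channel cannot tell which peripheral asked; software must
   enable the DMA request in only one of them at a time. */
static bool channel_req_ored(req_lines_t lines, req_lines_t wired_mask)
{
    return (lines & wired_mask) != 0u;
}

/* Scheme 2: selector bits in the DMA registers pick one entry from
   a small table that is fixed in hardware for this channel. */
static bool channel_req_selected(req_lines_t lines,
                                 const uint8_t *options,
                                 unsigned n_options,
                                 unsigned select)
{
    assert(select < n_options);
    assert(options[select] < 32u);
    return ((lines >> options[select]) & 1u) != 0u;
}

/* Scheme 3: DMAMUX-like full matrix. Any channel may be given any
   request line by its ID. */
static bool channel_req_dmamux(req_lines_t lines, unsigned request_id)
{
    assert(request_id < 32u);
    return ((lines >> request_id) & 1u) != 0u;
}
```

The practical difference shows up when two peripherals of interest happen to share a channel: in the first two schemes the choice is limited by what the hardware wired to that channel, whereas in the third any free channel can be used.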
In 'H7, there's one advanced MDMA on the AXIM bus matrix, two dual-port DMAs, and one single-port DMA (the BDMA).
In 'U5 and 'H5, the DMAs are very different from those in the rest of the STM32. They are an update to the original single/dual-port DMAs with substantially extended features: GPDMA is dual-port, LPDMA is single-port.
As far as requests go, GPDMA and LPDMA both contain an embedded unit similar to DMAMUX.
The following "10th anniversary" infographic from 2017 (i.e. 'G0/'G4/'L5/'U5/'C0/'H5/'U0 are missing) is linked from community.st.com: