For the sake of simplicity, this text assumes that the erased state of FLASH is all-ones, i.e. all bytes are 0xFF. This is not the case for 'L0 and 'L1, as described in Gotcha 98. As 'L0 does have ECC, the issue itself still pertains to this family, too.
Newer STM32 families are built using technologies with extremely small feature sizes, and benefit from the correspondingly increased circuit density. However, smaller transistors in FLASH also mean a smaller floating gate, in which the charge representing the programmed value of a given bit is trapped, and hence an increased probability that noise during readout causes a bit to be read incorrectly.
To mitigate this problem, FLASH in these families is equipped with error detection and correction circuitry (ECC). FLASH is divided into granules (the smallest portions which can be written: 32, 64 or 128 bits, depending on the particular STM32 model). Each granule is augmented by additional FLASH bits, invisible to the user, into which the error-check code is written whenever the granule itself is written.
When the FLASH is read, regardless of whether it is code to be run or data, the FLASH controller always reads the whole granule and performs the ECC check; if there is a single-bit error, it corrects it, and then it returns the requested portion of the granule to the system.
If errors occur in two or more bits, the error is uncorrectable. What happens in this case depends on the particular STM32 family/model. In most families, an uncorrectable error causes an NMI (non-maskable interrupt), and it is then up to the user how the error is handled. In most families, a single-bit error may also raise a dedicated interrupt, if enabled.
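Since FLASH is written in whole granules, any region to be programmed has to be widened to granule boundaries. A minimal sketch of that arithmetic follows; the function names are hypothetical (not part of any ST library), and the granule size is assumed to be given in bytes (4, 8 or 16 for 32-, 64- or 128-bit granules).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: start address of the granule containing addr.
   granule_bytes must be a power of two (4, 8 or 16). */
static uint32_t granule_floor(uint32_t addr, uint32_t granule_bytes)
{
    assert(granule_bytes != 0 && (granule_bytes & (granule_bytes - 1u)) == 0);
    return addr & ~(granule_bytes - 1u);
}

/* Hypothetical helper: number of granules touched by len bytes at addr. */
static uint32_t granule_count(uint32_t addr, uint32_t len, uint32_t granule_bytes)
{
    if (len == 0)
        return 0;
    uint32_t first = granule_floor(addr, granule_bytes);
    uint32_t last  = granule_floor(addr + len - 1u, granule_bytes);
    return (last - first) / granule_bytes + 1u;
}
```

For example, with 64-bit granules a 4-byte write starting at offset 6 straddles two granules, so both would have to be programmed (and both must be in the erased state beforehand).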
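The read path described above (correct one bit, flag two) can be illustrated with a toy model. ST does not publish its ECC algorithm, so the sketch below swaps in a textbook extended Hamming (SECDED) code on an 8-bit "granule"; all names are made up for illustration, and the real hardware works on 32- to 128-bit granules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum ecc_status { ECC_OK, ECC_CORRECTED, ECC_UNCORRECTABLE };

/* XOR of the positions (1..12) of all set codeword bits. */
static unsigned syndrome(uint16_t cw)
{
    unsigned s = 0;
    for (unsigned pos = 1; pos <= 12; pos++)
        if (cw & (1u << pos))
            s ^= pos;
    return s;
}

/* 1 if the number of set bits is odd, 0 if it is even. */
static unsigned odd_parity(uint16_t cw)
{
    unsigned p = 0;
    for (; cw != 0; cw >>= 1)
        p ^= cw & 1u;
    return p;
}

/* Data bits go to positions 3,5,6,7,9,10,11,12; check bits to 1,2,4,8;
   bit 0 holds the overall parity. */
static uint16_t secded_encode(uint8_t data)
{
    uint16_t cw = 0;
    unsigned bit = 0;
    for (unsigned pos = 1; pos <= 12; pos++) {
        if ((pos & (pos - 1u)) == 0)
            continue;                       /* power of two: check bit */
        if (data & (1u << bit))
            cw |= (uint16_t)(1u << pos);
        bit++;
    }
    unsigned s = syndrome(cw);
    for (unsigned p = 1; p <= 8; p <<= 1)   /* force the syndrome to zero */
        if (s & p)
            cw |= (uint16_t)(1u << p);
    cw |= (uint16_t)odd_parity(cw);         /* make the total parity even */
    return cw;
}

static enum ecc_status secded_decode(uint16_t cw, uint8_t *data)
{
    assert(data != NULL);
    unsigned s = syndrome(cw);
    enum ecc_status st = ECC_OK;
    if (odd_parity(cw)) {                   /* odd number of flipped bits */
        if (s > 12)
            return ECC_UNCORRECTABLE;       /* not a valid bit position */
        cw ^= (uint16_t)(1u << s);          /* s == 0: parity bit itself */
        st = ECC_CORRECTED;
    } else if (s != 0) {
        return ECC_UNCORRECTABLE;           /* even parity, bad syndrome */
    }
    uint8_t out = 0;
    unsigned bit = 0;
    for (unsigned pos = 1; pos <= 12; pos++) {
        if ((pos & (pos - 1u)) == 0)
            continue;
        if (cw & (1u << pos))
            out |= (uint8_t)(1u << bit);
        bit++;
    }
    *data = out;
    return st;
}
```

In this model, `ECC_CORRECTED` corresponds to the case where the controller silently fixes the data (and optionally raises the single-bit interrupt), and `ECC_UNCORRECTABLE` to the case that ends in an NMI on most families.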
There are several consequences of ECC on FLASH which may surprise unaware users:
- In the erased state, both the visible and the ECC portion of a granule are all-ones. While the particular ECC algorithm ST uses is not published, it is known that all-ones is not a valid ECC for all-ones data. In other words, reading from an erased, unwritten FLASH granule results in an ECC error. For most families the datasheet states that this particular case results in a single-bit error, i.e. it does not throw an NMI, but if the single-bit error interrupt is enabled, it will be invoked.
- Granules cannot be rewritten (or partially written).
Some STM32 models refuse to rewrite an already written granule during FLASH programming and throw a FLASH-write error. Other models allow rewriting a granule, but as the hidden ECC portion is no longer all-ones, the resulting ECC is in fact the bitwise AND of the previous ECC and the ECC belonging to the newly written data, so it is likely to have zeros where it should have ones. In other words, a subsequent read of such a rewritten granule very likely results in an ECC failure and an NMI.
This precludes schemes where only a few bytes/bits of a granule are first written from 1 to 0 and the remaining 1s are written to 0 later, e.g. to implement a "usage counter", with the aim of decreasing FLASH wear and reducing the number of erase cycles needed, thus increasing chip lifetime. This is not possible anymore.
- A granule written with all-ones (0xFF) is not an erased granule, and cannot be written again. This is especially surprising if the user allocates an array in FLASH, initialized with 0xFF, intending to write to it later from the program. The array gets written by the programmer, and to the user it is indistinguishable from an erased portion of FLASH; however, it cannot be written without erasing it first.
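The consequences listed above can be reproduced in a small host-side model. This is only a sketch under loud assumptions: the granule is 64 bits with 8 hidden check bits, and `toy_ecc` is a stand-in check code (XOR of the data bytes) that can only detect a mismatch, not correct it. It is not ST's unpublished algorithm; it merely shares the property that all-ones data does not map to all-ones check bits. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy granule: 64 visible data bits plus 8 hidden check bits. */
struct granule {
    uint64_t data;
    uint8_t  ecc;
};

/* Stand-in check code: XOR of the 8 data bytes (NOT ST's algorithm). */
static uint8_t toy_ecc(uint64_t data)
{
    uint8_t e = 0;
    for (int i = 0; i < 8; i++)
        e ^= (uint8_t)(data >> (8 * i));
    return e;
}

/* Erasing sets every bit, visible and hidden, to 1. */
static void granule_erase(struct granule *g)
{
    assert(g != NULL);
    g->data = UINT64_MAX;
    g->ecc  = 0xFF;
}

/* Programming can only pull bits from 1 to 0, in data and ECC alike. */
static void granule_program(struct granule *g, uint64_t value)
{
    assert(g != NULL);
    g->data &= value;
    g->ecc  &= toy_ecc(value);
}

/* True if the stored check bits match the stored data. */
static bool granule_read_ok(const struct granule *g)
{
    assert(g != NULL);
    return g->ecc == toy_ecc(g->data);
}
```

In this model a freshly erased granule fails the check, and a granule programmed once passes it. Programming 0xFFFFFFFFFFFFFFFE and then 0xFFFFFFFFFFFFFFFC (the "usage counter" pattern) leaves check bits 0x01 where 0x03 would be needed, so the read fails. Likewise, a granule programmed with all-ones reads back fine but has its check bits already pulled to 0x00, so a later write of real data produces a mismatch.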
STM32 families can be roughly divided into the following groups, as far as their FLASH is concerned:
- 'F1, 'F0, 'F3 - no ECC
- 'F2, 'F4, 'F7 - no ECC, has OTP
- 'L4, 'G0, 'G4, 'L5 - has ECC, has OTP
- 'H7 - has ECC
- 'L1, 'L0 - erased state is 0x00, has EEPROM; 'L0 has ECC, 'L1 RM mentions ECC only with EEPROM(?)