BM25 (Saturation and Length)

Theoretical Foundations
BM25 (Saturation and Length) is an advanced ranking algorithm accounting for term saturation and length penalties in document scoring.

BM25 accounts for two effects. Term saturation: successive repetitions of a word have a diminishing effect on the score (like salt in soup, 2 pinches are good, 20 are inedible). Document length penalty: longer text isn't automatically better; what matters is the density of rare, specialized words. BM25 is the standard in lexical retrieval (it is the default similarity in Lucene and Elasticsearch) and is reportedly used by Google.
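To make the two effects concrete, here is a minimal sketch of the term-frequency part of the BM25 formula. The function name `bm25_tf` is illustrative, and the defaults `k1 = 1.2` and `b = 0.75` are common conventions assumed here, not values taken from the source. The IDF factor is left out so that only saturation and length normalization are visible.

```python
def bm25_tf(tf: float, doc_len: float, avg_doc_len: float,
            k1: float = 1.2, b: float = 0.75) -> float:
    """Term-frequency component of BM25 (IDF factor omitted).

    k1 controls how quickly repetitions saturate;
    b controls how strongly document length is penalized.
    """
    # Length normalization: 1.0 for an average-length document,
    # above 1.0 for longer documents, below 1.0 for shorter ones.
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return tf * (k1 + 1.0) / (tf + k1 * length_norm)


# Saturation: same document length, growing number of repetitions.
# Each extra repetition adds less, and the value never reaches k1 + 1.
saturation = [bm25_tf(n, doc_len=100, avg_doc_len=100) for n in (1, 2, 5, 20)]

# Length penalty: same term count, but one document is four times longer.
short_doc = bm25_tf(2, doc_len=100, avg_doc_len=100)
long_doc = bm25_tf(2, doc_len=400, avg_doc_len=100)

print(saturation)
print(short_doc, long_doc)
```

With these assumed defaults the component is capped at `k1 + 1`, so going from 5 to 20 repetitions adds far less than going from 1 to 2, and the same term count is worth less in the longer document.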

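BM25 also appears as a content extraction method in Crawl4AI, where text blocks are scored against the page's H1 using BM25. BM25 explains why keyword stuffing doesn't work: once a term reaches the saturation threshold, additional repetitions add almost nothing, and they can hurt, because every repetition lengthens the document and so increases the length penalty applied to all other terms.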

In practice, instead of repeating the same phrase 20 times, use it 3-5 times and fill the rest of the text with specialized terms that have a high IDF (inverse document frequency, i.e. terms that are rare across the collection). This gives a stronger signal than brute-force repetition.
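The advice above can be illustrated with a small, self-contained BM25 scorer. This is a sketch under stated assumptions: the toy corpus, the query, and the function name `bm25_scores` are invented for illustration, the IDF uses the Lucene-style smoothed variant, and `k1 = 1.2`, `b = 0.75` are assumed defaults. A real search engine works on a far larger collection, so only the direction of the comparison is meaningful, not the numbers.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every tokenized document in `docs` against `query` with BM25."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        length_norm = 1.0 - b + b * len(d) / avg_len
        score = 0.0
        for term in set(query):
            if tf[term] == 0:
                continue
            # Smoothed IDF: rare terms get a larger weight.
            idf = math.log(1.0 + (n - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1.0) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


# Hypothetical toy corpus (already tokenized).
stuffed = ["espresso"] * 20 + ["machine", "guide", "for", "home", "use"]
focused = ["espresso"] * 3 + ["machine", "portafilter", "preinfusion",
                              "tamper", "grinder", "pressure", "extraction"]
other_1 = "coffee brewing guide for home use with a french press".split()
other_2 = "tea kettle guide for home use".split()
docs = [stuffed, focused, other_1, other_2]

# A topical query that includes specialized vocabulary.
topical = bm25_scores(["espresso", "portafilter", "preinfusion"], docs)
# A query for the repeated keyword alone.
keyword_only = bm25_scores(["espresso"], docs)

print("topical query:", topical[0], topical[1])
print("keyword only: ", keyword_only[0], keyword_only[1])
```

In this toy setup the focused document wins the topical query because its rare terms carry high IDF, while on the keyword-only query 20 repetitions buy the stuffed document only a small edge over 3 repetitions.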

Source: AI Semantic SEO Expert, Robert Niechciał (sensai.io)