TF-IDF (Term Frequency-Inverse Document Frequency)

TF-IDF (Term Frequency-Inverse Document Frequency) is a numerical statistic that reflects how important a word is to a document within a collection of documents. The more frequent the word is in that document (TF) and the rarer it is across the entire collection (IDF), the higher its score.
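
The source does not state a formula, so the following is one common textbook formulation, given as an assumption rather than the author's definition. Here f_{t,d} is the number of times term t occurs in document d, N is the number of documents in the collection, and n_t is the number of documents that contain t. Other variants (smoothed IDF, log-scaled TF) are also in wide use.

```latex
\mathrm{tf}(t, d) = \frac{f_{t,d}}{\sum_{t'} f_{t',d}}, \qquad
\mathrm{idf}(t) = \log \frac{N}{n_t}, \qquad
\text{tf-idf}(t, d) = \mathrm{tf}(t, d) \cdot \mathrm{idf}(t)
```

Under this formulation, a term that appears in every document has idf = log(1) = 0, so it contributes nothing to the score regardless of how often it occurs.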

For example, the word 'polysemy' is rare across a typical collection, so it has a high IDF and gives a stronger signal than a common word such as 'is'. In AI Search, new co-occurrences of high-IDF terms (like 'SEO' + 'citation probability') provide better scores in reranking.
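
The contrast between a rare word and a common one can be checked with a minimal sketch in Python. It assumes whitespace tokenization, tf = count / document length, and idf = ln(N / document frequency). The function name tf_idf and the three-document corpus are hypothetical, made up for illustration, and production libraries usually apply smoothing that this sketch omits.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    Assumed formulation: tf = count / document length, idf = ln(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical corpus: 'is' occurs in every document, 'polysemy' in only one.
docs = [
    "polysemy is hard",
    "search is fast",
    "ranking is useful",
]
scores = tf_idf(docs)
print(scores[0])
```

In the first document, 'is' gets a score of zero because it appears in all three documents, while 'polysemy' gets a positive score because it appears in only one.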

TF-IDF is an older algorithm than BM25, but still gives useful results in content analysis.

Source: AI Semantic SEO Expert, Robert Niechciał (sensai.io)