Cosine Similarity (0–1)
Cosine similarity is the primary metric for comparing embeddings in SEO. It measures the cosine of the angle between two vectors. Mathematically it ranges from -1 to 1, but for text embeddings the values typically fall between 0 and 1. As rules of thumb, a value of 1.0 means the two vectors point in the same direction (identical meaning, a potential duplicate), 0.9–0.99 signals cannibalization, and 0.75–0.9 indicates a strong topical connection, ideal for internal linking.
It's preferred over Euclidean distance because it measures vector direction (meaning), not length: two articles can differ in length yet cover the same topic. Practical SEO applications include comparing page titles, detecting duplicates (similarity 1.0), identifying cannibalization (0.9–0.99), and finding nearest neighbors for internal linking. In Supabase you compute it with the pgvector SQL operator <=>, which returns cosine distance, so the similarity is 1 minus that result. In Python you can use the cosine_similarity function from scikit-learn.
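As a minimal sketch of how these thresholds might be applied, the following pure-Python example computes cosine similarity by hand and maps the result to the bands described in this entry. The function names, the toy three-dimensional vectors, and the 0.9999 tolerance for "1.0" are assumptions made for illustration; real embeddings have hundreds of dimensions, and scikit-learn's cosine_similarity performs the equivalent calculation on arrays.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def classify_pair(similarity):
    """Map a similarity score to the rule-of-thumb bands from this entry."""
    # Assumption: scores within 0.0001 of 1.0 count as "1.0", because
    # floating-point rounding rarely yields exactly 1.0.
    if similarity >= 0.9999:
        return "duplicate"
    if similarity >= 0.9:
        return "cannibalization"
    if similarity >= 0.75:
        return "internal-link candidate"
    return "unrelated"


# Hypothetical toy embeddings; page_b is page_a scaled by 2,
# so both point in exactly the same direction.
page_a = [1.0, 2.0, 3.0]
page_b = [2.0, 4.0, 6.0]

label = classify_pair(cosine_similarity(page_a, page_b))
print(label)
```

The cutoffs are editorial rules of thumb, not properties of the metric, so they usually need tuning for the embedding model in use.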
By analogy, cosine similarity is like comparing the driving direction of two cars — even if one drives faster (longer vector), both are heading in the same direction (same topic).
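The car analogy can be checked numerically. In this small sketch the two vectors are hypothetical "cars" heading the same way at different speeds: cosine similarity treats them as identical, while Euclidean distance reports them as far apart.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (direction only)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def euclidean_distance(a, b):
    """Straight-line distance between a and b (sensitive to length)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical vectors: same direction, the second is ten times longer.
slow_car = [3.0, 4.0]
fast_car = [30.0, 40.0]

same_direction = cosine_similarity(slow_car, fast_car)
gap = euclidean_distance(slow_car, fast_car)

print(same_direction)  # 1.0: same heading, so same "topic"
print(gap)             # 45.0: the length difference dominates
```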