SEMANTIC_SIMILARITY
SEMANTIC_SIMILARITY is an embedding task type that optimizes vectors for measuring semantic similarity between pairs of texts. In SEO, it has two key applications: duplicate detection (similarity near 1.0, e.g., paginated pages with identical content) and cannibalization detection (similarity 0.9-0.99, e.g., 'What is SEO' vs 'SEO Basics').
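As a rough illustration of how those two ranges might be applied, here is a minimal sketch that scores a pair of embedding vectors with cosine similarity and buckets the result. The function names, the "distinct" label, and the exact 0.99 cutoff for "near 1.0" are assumptions made for this example; the vectors would come from whatever embedding model you call with the SEMANTIC_SIMILARITY task type.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_pair(score, duplicate_above=0.99, cannibalization_min=0.9):
    """Bucket a similarity score using the ranges from the text.

    Assumption: "near 1.0" is read as strictly above 0.99; tune per site.
    """
    if score > duplicate_above:
        return "duplicate"
    if score >= cannibalization_min:
        return "cannibalization"
    return "distinct"
```

Treat the cutoffs as starting points: scores depend on the embedding model and on what text you embed, so it is worth spot-checking a sample of flagged pairs before acting on them.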
It's also useful for building internal links from nearest neighbors: for each page, you look for other pages whose cosine similarity is above roughly 0.75-0.8. Unlike CLUSTERING, this task type is meant for comparing text pairs rather than grouping texts into clusters. With SEMANTIC_SIMILARITY embeddings in a site audit, you can scan 10,000 URLs in minutes and surface likely duplicates and cannibalization candidates, work that would take weeks to do manually.
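The nearest-neighbor idea for internal linking could be sketched as follows. Everything here is illustrative: `link_candidates`, its parameters, the URLs, and the three-dimensional toy vectors are hypothetical stand-ins for real page embeddings, and 0.75 is the lower end of the range mentioned above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_candidates(embeddings, min_score=0.75, top_k=3):
    """For each URL, return up to top_k other URLs with similarity >= min_score.

    embeddings: dict mapping URL -> embedding vector.
    Returns a dict mapping URL -> list of (other_url, score), best match first.
    """
    result = {}
    for url, vec in embeddings.items():
        scored = [
            (other, cosine_similarity(vec, other_vec))
            for other, other_vec in embeddings.items()
            if other != url
        ]
        scored = [pair for pair in scored if pair[1] >= min_score]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        result[url] = scored[:top_k]
    return result


# Toy vectors, invented for illustration only.
pages = {
    "/what-is-seo": [0.8, 0.2, 0.1],
    "/seo-basics": [0.9, 0.1, 0.0],
    "/chocolate-cake": [0.0, 0.1, 0.9],
}
candidates = link_candidates(pages)
```

This all-pairs loop is quadratic in the number of pages, which is fine for a small example. For an audit of thousands of URLs, a vectorized matrix product or an approximate nearest-neighbor index would likely be a better fit.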
In practice, start by comparing page titles (title tags): they're short and capture the topic well.