Content Pruning (outliers)

Content Pruning (outliers): identifying pages topically distant from a site's centroid (outliers) as candidates for removal or relocation.

Content pruning is a strategy for identifying and removing content that is topically distant from a site's centroid, the so-called outliers. You measure each page's distance from the site's embedding centroid (the mean of all page embeddings), and the most distant pages become pruning candidates. These might be sponsored articles, off-topic posts, or outdated content that dilutes the domain's topical specialization. Removing outliers improves the Site Focus Score and reduces the Site Radius, which is intended to strengthen topical authority in Google's eyes.
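The distance measurement described above can be sketched in a few lines of Python. This is a minimal illustration, not the source's implementation: it assumes page embeddings are already available as numeric vectors, uses cosine distance (one common choice; the source does not name a metric), and the function names, URLs, and toy vectors are all hypothetical.

```python
import numpy as np

def centroid_distances(embeddings: np.ndarray) -> np.ndarray:
    """Cosine distance of each page embedding from the site centroid."""
    # Normalize rows so that dot products equal cosine similarity.
    # Assumes no embedding is the zero vector.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    centroid = unit.mean(axis=0)
    centroid /= np.linalg.norm(centroid)
    return 1.0 - unit @ centroid

def pruning_candidates(urls, embeddings, top_k=3):
    """Return the top_k pages farthest from the centroid, farthest first."""
    dist = centroid_distances(np.asarray(embeddings, dtype=float))
    order = np.argsort(dist)[::-1][:top_k]
    return [(urls[i], float(dist[i])) for i in order]

# Toy data (hypothetical): three on-topic legal pages and one off-topic post.
urls = ["/divorce-law", "/custody-guide", "/alimony-faq", "/best-casino-bonus"]
embeddings = [
    [0.90, 0.10, 0.00],
    [0.80, 0.20, 0.10],
    [0.85, 0.15, 0.05],
    [0.00, 0.10, 0.95],
]

candidates = pruning_candidates(urls, embeddings, top_k=1)
print(candidates)
```

In a real audit the vectors would come from whatever embedding model you already use for the site, and the ranked list would be a starting point for manual review rather than an automatic delete list.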

For example, a law firm client has a blog with 200 articles, a Focus of 0.72, and a Radius of 0.35. After removing 30 peripheral articles, Focus rises to 0.85. Importantly, this is data-driven pruning based on embeddings rather than intuition: you have concrete numbers to justify content removal to clients. The Senuto case study confirms this relationship. In practice, instead of deleting outliers outright, consider moving them to a separate domain or subdomain.
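A before/after comparison like the one in this example can be sketched as follows. The source does not give formulas for Focus or Radius, so the sketch uses stand-in proxies that are purely assumptions: "focus" is the mean cosine similarity of pages to the centroid, and "radius" is the largest cosine distance from it. The data is synthetic and will not reproduce the 0.72, 0.35, or 0.85 figures.

```python
import numpy as np

def site_metrics(embeddings: np.ndarray):
    """Return (focus, radius, per-page distances) under the assumed proxies."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    centroid = unit.mean(axis=0)
    centroid /= np.linalg.norm(centroid)
    sims = unit @ centroid
    dist = 1.0 - sims
    # Assumed proxies, NOT the source's definitions:
    # focus = mean similarity to centroid, radius = max distance from it.
    return float(sims.mean()), float(dist.max()), dist

# Synthetic site: 20 on-topic pages near one direction, 3 off-topic pages.
rng = np.random.default_rng(0)
dim = 8
on_topic = np.eye(dim)[0] + 0.05 * rng.normal(size=(20, dim))
off_topic = np.eye(dim)[1] + 0.05 * rng.normal(size=(3, dim))
pages = np.vstack([on_topic, off_topic])

focus_before, radius_before, dist = site_metrics(pages)

# Prune the 3 pages farthest from the centroid, then recompute the metrics
# against the new centroid of the remaining pages.
n_prune = 3
keep = np.argsort(dist)[:-n_prune]
focus_after, radius_after, _ = site_metrics(pages[keep])

print(f"focus:  {focus_before:.2f} -> {focus_after:.2f}")
print(f"radius: {radius_before:.2f} -> {radius_after:.2f}")
```

Recomputing the centroid after pruning matters: the removed pages were pulling it away from the core topic, so the remaining pages usually sit closer to the new centroid than they did to the old one.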

Source: AI Semantic SEO Expert, Robert Niechciał (sensai.io)