Keyword Clustering (Embeddings + K-means)


Keyword clustering is the second stage of the clustering pipeline. It groups keywords into clusters using embeddings (Gemini with task_type=CLUSTERING) and the K-means algorithm. The number of clusters k is selected automatically with the Silhouette Score: the clusterer tests each k from 2 to 20 and keeps the value with the highest score, which removes the guesswork of 'how many clusters should I have'. The principle 'LLM for reasoning, Python for computation' is key here: clustering 500 keywords with embeddings + K-means in Python is hundreds of times cheaper than asking an LLM to do the grouping, and it produces repeatable results as long as the K-means random seed is fixed. The output is a JSON file with each keyword assigned to a cluster, ready for the next step, cluster naming.
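The selection of k described above can be sketched roughly as follows. This is a minimal illustration, not the source's implementation: it assumes the embeddings have already been computed (one vector per keyword), it uses scikit-learn for K-means and the Silhouette Score, and the names `cluster_keywords` and `to_json` as well as the JSON layout are hypothetical.

```python
import json

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def cluster_keywords(keywords, embeddings, k_min=2, k_max=20, seed=42):
    """Try each k in [k_min, k_max] and keep the one with the highest silhouette."""
    X = np.asarray(embeddings, dtype=float)
    # The silhouette score is only defined for 2 <= k <= n_samples - 1.
    k_max = min(k_max, len(X) - 1)
    if k_max < k_min:
        raise ValueError("not enough keywords for the requested range of k")

    best = None
    for k in range(k_min, k_max + 1):
        # A fixed random_state makes repeated runs return the same clusters.
        model = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels = model.fit_predict(X)
        score = float(silhouette_score(X, labels))
        if best is None or score > best["silhouette"]:
            best = {"k": k, "silhouette": score, "labels": labels}

    # Hypothetical output layout: one record per keyword.
    assignments = [
        {"keyword": kw, "cluster": int(label)}
        for kw, label in zip(keywords, best["labels"])
    ]
    return {"k": best["k"], "silhouette": best["silhouette"], "assignments": assignments}


def to_json(result):
    """Serialize the result so the cluster-naming step can read it."""
    return json.dumps(result, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    # Synthetic stand-in for real embeddings: three tight groups in 8 dimensions.
    rng = np.random.default_rng(0)
    centers = np.eye(3, 8) * 10.0
    vectors = np.repeat(centers, 10, axis=0) + rng.normal(scale=0.1, size=(30, 8))
    names = [f"keyword {i}" for i in range(30)]
    print(to_json(cluster_keywords(names, vectors)))
```

In a real run the synthetic vectors would be replaced by the embeddings returned by the embedding model, and the JSON string would be written to a file for the naming step.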

In practice, if the best Silhouette Score is low (below 0.3), the clusters are poorly separated: try a different embedding model, or check whether your keyword pool is too narrow thematically to split into distinct groups.
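One possible way to act on this check is a small report that flags a low overall score and shows which clusters pull it down. This is a sketch under assumptions: the 0.3 threshold comes from the text, while the function name `silhouette_report`, the per-cluster breakdown, and the use of scikit-learn's `silhouette_samples` are illustrative choices rather than part of the described pipeline.

```python
import numpy as np
from sklearn.metrics import silhouette_samples

LOW_SILHOUETTE = 0.3  # threshold quoted in the text


def silhouette_report(embeddings, labels, threshold=LOW_SILHOUETTE):
    """Summarize cluster separation overall and per cluster."""
    X = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    # One silhouette value per keyword; their mean is the overall score.
    per_point = silhouette_samples(X, labels)
    per_cluster = {
        int(c): float(per_point[labels == c].mean()) for c in np.unique(labels)
    }
    overall = float(per_point.mean())
    return {
        "overall": overall,
        "low_quality": overall < threshold,
        "per_cluster": per_cluster,
        # Clusters whose own mean falls below the threshold.
        "weak_clusters": sorted(c for c, s in per_cluster.items() if s < threshold),
    }
```

If `low_quality` is true for every k that was tried, that points toward the embedding model or the keyword pool rather than toward the choice of k.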

Source: AI Semantic SEO Expert, Robert Niechciał (sensai.io)