Internal Linking (nearest neighbors)
Internal linking based on nearest neighbors is a strategy for building internal links from embeddings. For each URL, we search for its 10 nearest neighbors in vector space, keeping only those with cosine similarity > 0.8, which yields a list of URL pairs to link. This is a fundamental change from lexical linking: we look for similarity of meaning, not matching words. Once the pairs are identified, the next step is to analyze anchor texts and linking context. For greater precision you can add reranking: retrieve the top 100 candidates, then select the 10 most relevant.
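The neighbor search described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function names (cosine, link_candidates), the toy URLs, and the 3-dimensional vectors are all hypothetical, and a real site would take its vectors from an embedding model and most likely use a vector index instead of the brute-force loop shown here.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def link_candidates(embeddings, k=10, threshold=0.8):
    """Map each URL to its k most similar URLs whose similarity exceeds threshold.

    embeddings: dict of URL -> vector (hypothetical input format).
    Returns: dict of URL -> list of (neighbor_url, similarity), best first.
    """
    result = {}
    for url, vec in embeddings.items():
        scored = [
            (other, cosine(vec, other_vec))
            for other, other_vec in embeddings.items()
            if other != url  # a page should not link to itself
        ]
        scored = [(other, sim) for other, sim in scored if sim > threshold]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        result[url] = scored[:k]
    return result


# Toy data: made-up vectors standing in for real page embeddings.
toy = {
    "/shoes/running": [1.0, 0.1, 0.0],
    "/shoes/trail": [0.9, 0.2, 0.0],
    "/recipes/pasta": [0.0, 0.1, 1.0],
}
pairs = link_candidates(toy)
```

With this toy data, the two shoe pages end up as each other's only neighbor, and the pasta page gets no link candidates because nothing clears the 0.8 threshold. The brute-force loop compares every URL with every other URL, so it only suits small sites; the thresholding and top-k logic stay the same if the search is swapped for an approximate nearest-neighbor index.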
This method can be extended with knowledge graphs, where a SHARES_ATTRIBUTE relationship captures not only WHAT to link, but also WHY and with what strength. For example, in an e-commerce catalog of 10,000 products, embeddings identify 100 potentially related products, and the reranker selects the 5-10 truly worth linking from that hundred.
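To make the "WHAT, WHY, and strength" idea concrete, here is one hypothetical way to represent a SHARES_ATTRIBUTE edge and use it to narrow a candidate list. Everything beyond the relationship name is an assumption for illustration: the attribute sets, the product IDs, the select_links helper, and in particular the use of Jaccard overlap as the strength score. The source does not specify how strength is computed, and a real reranker would typically be a separate relevance model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharesAttribute:
    """Hypothetical edge: WHAT to link (target), WHY (shared), how strongly (strength)."""
    source: str
    target: str
    shared: frozenset
    strength: float


def shares_attribute(source, target, attributes):
    """Build an edge between two products from their attribute sets.

    Strength is the Jaccard overlap of the two sets (an assumption, not from the source).
    """
    a, b = attributes[source], attributes[target]
    shared = a & b
    union = a | b
    strength = len(shared) / len(union) if union else 0.0
    return SharesAttribute(source, target, frozenset(shared), strength)


def select_links(source, candidates, attributes, top_n=10):
    """Keep the top_n candidates that share at least one attribute, strongest first."""
    edges = [shares_attribute(source, c, attributes) for c in candidates if c != source]
    edges = [e for e in edges if e.shared]  # no shared attribute means no stated reason to link
    edges.sort(key=lambda e: e.strength, reverse=True)
    return edges[:top_n]


# Toy catalog: made-up product IDs and attributes.
catalog = {
    "tent-2p": {"category:camping", "brand:acme", "season:summer"},
    "tent-4p": {"category:camping", "brand:acme", "season:summer", "size:family"},
    "stove": {"category:camping", "brand:other"},
    "laptop": {"category:electronics"},
}
# In practice the candidate list would come from the embedding search.
links = select_links("tent-2p", ["tent-4p", "stove", "laptop"], catalog, top_n=2)
```

Each returned edge carries its own justification: the shared set can inform the anchor text or the surrounding context of the link, while strength gives a way to rank or cap the number of links per page.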