MTEB Leaderboard
The MTEB Leaderboard (Massive Text Embedding Benchmark) ranks embedding models on standardized benchmarks covering retrieval, clustering, similarity, and classification tasks. It helps you choose the best model for a specific SEO use case, since no model excels at every task. Always check the results for Polish specifically: many models handle English very well but perform poorly on Polish.
Recommended models include Jina (strong multilingual support), Gemini text-embedding-004 (768 dimensions, good Polish support), and OpenAI (1536 or 3072 dimensions, depending on the model). MTEB is hosted on Hugging Face and is regularly updated with new models. In practice, before indexing thousands of pages, test 2-3 models on a sample of 50 Polish titles and compare clustering quality; the differences between models can be significant.
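One way to make "compare clustering quality" concrete is to hand-label each sample title with a topic, embed the titles with each candidate model, and check how well same-topic titles group together. The sketch below is only an illustration under stated assumptions: the score (mean same-topic cosine similarity minus mean cross-topic cosine similarity) is one simple proxy among many, the names `separation_score`, `model_a`, and `model_b` are hypothetical, and the two-dimensional vectors are toy stand-ins for real embeddings you would fetch from each provider.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def separation_score(vectors, labels):
    """Mean same-topic similarity minus mean cross-topic similarity.

    Higher is better: titles on the same topic sit close together
    and titles on different topics sit further apart.
    """
    same, cross = [], []
    for i, j in combinations(range(len(vectors)), 2):
        sim = cosine(vectors[i], vectors[j])
        (same if labels[i] == labels[j] else cross).append(sim)
    if not same or not cross:
        raise ValueError("need at least one same-topic and one cross-topic pair")
    return sum(same) / len(same) - sum(cross) / len(cross)


# Hand-assigned topic for each sample title (hypothetical labels).
labels = ["shoes", "shoes", "hosting", "hosting"]

# Toy stand-ins for real embeddings; replace with vectors returned by
# each model you are evaluating (one vector per title, same order as labels).
candidates = {
    "model_a": [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]],
    "model_b": [[1.0, 0.0], [0.6, 0.4], [0.5, 0.5], [0.4, 0.6]],
}

scores = {name: separation_score(vecs, labels) for name, vecs in candidates.items()}
best = max(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
print("best separation:", best)
```

With real data, the 50 Polish titles would replace the four toy rows, and each entry in `candidates` would hold the vectors from one model. A score like this only ranks models relative to each other on your own sample, so it complements the leaderboard rather than replacing it.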