RETRIEVAL_DOCUMENT
RETRIEVAL_DOCUMENT is an embedding task type that optimizes vectors for representing documents in RAG systems, that is, long, informational content. It is used at indexing time: each article or chunk is vectorized with this task type so that the resulting vector better captures its informational content. It works in tandem with RETRIEVAL_QUERY, which is applied on the query side.
The key principle is that the same embedding model must be used for both indexing and searching: if you indexed content with Gemini using task type RETRIEVAL_DOCUMENT, you must vectorize queries with Gemini using RETRIEVAL_QUERY. Mixing models (e.g., indexing with Jina, searching with OpenAI) produces meaningless similarity scores, because vectors from different models live in different vector spaces and cannot be compared. In practice, when indexing a large site (e.g., 1000 pages), generate embeddings in batches of 50-100 and save the results to a CSV file or database after each batch, so that an API error costs at most the current batch rather than the whole run.
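The batching advice above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `index_in_batches` and `embed_fn` are hypothetical names, and `embed_fn` is assumed to be a thin wrapper around whichever embedding API is in use that accepts a list of texts plus a task type and returns one vector per text. Results are appended to a CSV after every batch, and a rerun skips IDs that are already saved.

```python
import csv
from pathlib import Path
from typing import Callable, Sequence

# Hypothetical signature (an assumption, not a real SDK type):
# takes texts and a task type, returns one vector per input text.
EmbedFn = Callable[[Sequence[str], str], list[list[float]]]


def index_in_batches(
    docs: dict[str, str],
    embed_fn: EmbedFn,
    out_path: str,
    batch_size: int = 50,
) -> int:
    """Embed docs batch by batch, appending rows to a CSV after each batch.

    Returns the number of documents embedded during this call.
    """
    path = Path(out_path)

    # Collect IDs saved by a previous (possibly interrupted) run.
    done: set[str] = set()
    if path.exists():
        with path.open(newline="", encoding="utf-8") as f:
            done = {row[0] for row in csv.reader(f) if row}

    pending = [(doc_id, text) for doc_id, text in docs.items() if doc_id not in done]
    written = 0
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        # Documents are always embedded with the document-side task type.
        vectors = embed_fn([text for _, text in batch], "RETRIEVAL_DOCUMENT")
        # Persist right away: a failure in a later batch loses nothing saved here.
        with path.open("a", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            for (doc_id, _), vec in zip(batch, vectors):
                writer.writerow([doc_id, *vec])
        written += len(batch)
    return written
```

If `embed_fn` raises partway through, calling the function again with the same arguments resumes from the first unsaved document. The query side would use the same model through the same wrapper, with `"RETRIEVAL_QUERY"` as the task type.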