Semantic search & vector search for documents
Semantic search finds content by meaning, not exact keywords. In practice, text is converted to vectors (embeddings) with a neural model; queries and documents live in the same vector space, and nearest-neighbor search returns the most similar chunks. This is the retrieval layer behind most modern RAG and AI knowledge bases.
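The pipeline above can be sketched in a few lines. This is a toy illustration: the `embed` function here is a hashed bag-of-words stand-in, not a real neural embedding model, and `DIM` is arbitrarily small. It only shows the shape of the idea: queries and documents are mapped into the same vector space and ranked by similarity.

```python
import math
import zlib

DIM = 64  # toy dimensionality; real embedding models use hundreds to thousands of dims

def embed(text: str) -> list[float]:
    # Toy stand-in for a neural embedding model: hashed bag-of-words.
    # A real system would call a sentence-embedding model here.
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[zlib.crc32(token.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]  # unit-normalize so dot product == cosine

def search(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Embed the query with the same model, score every chunk, return top-k.
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, embed(c))), c) for c in chunks]
    scored.sort(reverse=True)
    return [c for _, c in scored[:k]]

chunks = [
    "How to cancel your subscription",
    "Shipping times for international orders",
    "Updating your billing address",
]
print(search("cancel subscription", chunks, k=1))
```

In production the chunk vectors are computed once at indexing time and stored in a vector database, rather than re-embedded per query as in this sketch.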
Semantic search vs keyword search
Keyword search (BM25 over an inverted index) matches tokens and synonym lists. It is fast and interpretable but struggles with paraphrases (“cancel subscription” vs “stop billing”). Semantic search captures intent when users don’t use the same words as your docs. Many production systems combine both: hybrid search plus reranking.
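One common way to combine keyword and semantic results is reciprocal rank fusion (RRF), which merges two ranked lists without needing their scores to be comparable. A minimal sketch, with made-up document IDs standing in for real BM25 and vector results:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    # Each ranking is a list of doc IDs, best first. RRF rewards documents
    # that rank highly in any list; the constant k dampens the top-rank bonus.
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["d3", "d1", "d7"]   # hypothetical BM25 order
semantic_hits = ["d1", "d5", "d3"]  # hypothetical vector-search order
print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
```

Here `d1` wins because it ranks well in both lists, even though neither ranker put it unambiguously first.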
How vector search works
Each text chunk is embedded into a high-dimensional vector. The query is embedded with the same model. Similarity is usually cosine similarity or dot product. Results are the top-k vectors closest to the query. Quality depends on embedding model choice, chunking (chunking guide), and domain fit.
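The similarity math itself is simple. A sketch of cosine similarity and exhaustive top-k selection over a tiny hand-made index (real systems replace the linear scan with an approximate nearest-neighbor index such as HNSW or IVF once the corpus grows):

```python
import heapq
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: dot product of the vectors divided by their norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec: list[float], index: list[tuple[str, list[float]]], k: int = 2):
    # Exhaustive scan over (chunk_id, vector) pairs; fine for small corpora.
    return heapq.nlargest(k, ((cosine(query_vec, v), cid) for cid, v in index))

index = [("a", [1.0, 0.0]), ("b", [0.9, 0.1]), ("c", [0.0, 1.0])]
print(top_k([1.0, 0.05], index, k=2))
```

Note that if all vectors are unit-normalized, cosine similarity and dot product rank results identically, which is why many vector stores let you pick either.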
Reranking
First-stage retrieval casts a wide net (high recall). A second stage—cross-encoder reranking or an LLM—reorders candidates for precision before generation. This reduces noise in the LLM context window.
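The two-stage pattern can be sketched as follows. The `overlap_score` function here is a hypothetical stand-in: a real reranker would call a cross-encoder model (or an LLM) that scores the query and each candidate together, which is slower per pair but more precise than first-stage retrieval.

```python
def rerank(query: str, candidates: list[str], score_fn, k: int = 3) -> list[str]:
    # Second stage: re-score each first-stage candidate against the query
    # and keep only the top-k for the LLM context window.
    return sorted(candidates, key=lambda c: score_fn(query, c), reverse=True)[:k]

def overlap_score(query: str, text: str) -> float:
    # Toy scorer (token overlap) standing in for a cross-encoder model call.
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / (len(q) or 1)

first_stage = [  # wide-net candidates from first-stage retrieval
    "stop billing for my account",
    "shipping rates overview",
    "how to cancel a subscription",
]
print(rerank("cancel my subscription", first_stage, overlap_score, k=2))
```

A typical configuration retrieves a few dozen candidates in the first stage, then reranks down to the handful that actually enter the prompt.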
WeKnora
WeKnora includes semantic retrieval, reranking options, and integration with LLMs so you can ship document Q&A without building vector search from scratch.