Key topics
- Chunking by structure and semantics, not only by token size.
- Hybrid retrieval (lexical + vector) and reranking for precision.
- Permission filtering before any passage is exposed to the model.
- Freshness controls, cache invalidation, and live fetch patterns.
- Citation coverage and groundedness metrics.
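To make the hybrid retrieval and permission filtering topics above concrete, here is a minimal Python sketch. It fuses a lexical ranking and a vector ranking with reciprocal rank fusion (RRF), then removes passages the caller's roles do not permit before anything is placed in a prompt. RRF is one common fusion choice, not something this page prescribes. The `Passage` type, its `allowed_roles` field, and the constant `k=60` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    # Hypothetical passage record; field names are assumptions for this sketch.
    doc_id: str
    text: str
    allowed_roles: frozenset


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list, best first.

    Each list contributes 1 / (k + rank) for every doc ID it contains.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def filter_by_role(passages, user_roles):
    """Keep only passages that share at least one role with the caller."""
    roles = frozenset(user_roles)
    return [p for p in passages if p.allowed_roles & roles]


if __name__ == "__main__":
    lexical = ["a", "b", "c"]   # e.g. from a keyword index
    vector = ["b", "d", "a"]    # e.g. from an embedding index
    store = {
        "a": Passage("a", "Pricing overview", frozenset({"sales"})),
        "b": Passage("b", "Internal margins", frozenset({"finance"})),
        "c": Passage("c", "Public FAQ", frozenset({"sales", "support"})),
        "d": Passage("d", "Support runbook", frozenset({"support"})),
    }
    fused = reciprocal_rank_fusion([lexical, vector])
    visible = filter_by_role([store[i] for i in fused], {"sales"})
    print([p.doc_id for p in visible])
```

Filtering happens after fusion but before prompt assembly, so a passage the caller cannot see never reaches the model. In a real system the same filter would usually also be pushed down into the index query.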
Common pitfalls
- Over-chunking or under-chunking, so that retrieved passages lack the context needed to answer.
- Ignoring metadata filters (tenant, role, recency), which returns noisy results.
- Returning too many passages and overwhelming the model.
- No evidence mapping from outputs to source sections.
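One way to detect the missing evidence mapping described in the last pitfall is to check each sentence of an answer for a citation that points at a passage that was actually retrieved. The sketch below assumes the model is instructed to cite inline with markers such as `[S1]`. That marker format, the function name, and the naive sentence splitter are all assumptions for illustration.

```python
import re

# Assumption: sources are cited inline as [S1], [S2], ...
CITATION_PATTERN = re.compile(r"\[(S\d+)\]")
# Naive splitter: breaks on whitespace that follows ., ! or ?
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def citation_coverage(answer, retrieved_ids):
    """Return (coverage, unsupported_sentences) for an answer.

    A sentence counts as covered only if it cites at least one source
    and every source it cites is among the retrieved IDs.
    """
    known = set(retrieved_ids)
    sentences = [s.strip() for s in SENTENCE_BOUNDARY.split(answer) if s.strip()]
    unsupported = []
    for sentence in sentences:
        cited = set(CITATION_PATTERN.findall(sentence))
        if not cited or not cited <= known:
            unsupported.append(sentence)
    covered = len(sentences) - len(unsupported)
    coverage = covered / len(sentences) if sentences else 0.0
    return coverage, unsupported


if __name__ == "__main__":
    answer = "Refunds take 5 days [S1]. Fees are waived. Limits apply [S9]."
    coverage, unsupported = citation_coverage(answer, {"S1", "S2"})
    print(coverage, unsupported)
```

This measures only whether citations are present and resolvable. It does not establish groundedness, that is, whether the cited passage actually supports the claim, which needs a separate check.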
Recommended practices
- Start with a curated golden set of queries and expected sources.
- Use rerankers and diversity selection to reduce redundancy.
- Require citations for key claims and extracted fields.
- Continuously monitor retrieval quality and drift.
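The golden-set practice above can be turned into a small regression check. The sketch below computes recall@k for each query against its expected sources and averages the result. The `retrieve` callable, the source ID format, and the example data are hypothetical stand-ins for whatever retriever and corpus a team actually uses.

```python
def recall_at_k(retrieved, expected, k):
    """Fraction of expected source IDs found in the top-k retrieved IDs."""
    if not expected:
        raise ValueError("each golden query needs at least one expected source")
    top_k = set(retrieved[:k])
    return len(top_k & set(expected)) / len(expected)


def evaluate_golden_set(golden_set, retrieve, k=5):
    """Return (mean_recall, per_query_recall) over a golden set.

    golden_set maps each query to its set of expected source IDs.
    retrieve is any callable that returns ranked source IDs for a query.
    """
    if not golden_set:
        raise ValueError("golden set is empty")
    per_query = {
        query: recall_at_k(retrieve(query), expected, k)
        for query, expected in golden_set.items()
    }
    mean_recall = sum(per_query.values()) / len(per_query)
    return mean_recall, per_query


if __name__ == "__main__":
    # Hypothetical golden set and canned retriever output.
    golden = {
        "refund policy": {"kb/refunds"},
        "sso setup": {"kb/sso", "kb/saml"},
    }
    canned = {
        "refund policy": ["kb/refunds", "kb/billing"],
        "sso setup": ["kb/sso", "kb/login", "kb/saml"],
    }
    mean_recall, per_query = evaluate_golden_set(golden, canned.__getitem__, k=2)
    print(mean_recall, per_query)
```

Running this on a schedule and tracking the mean over time gives a simple signal for retrieval drift. A drop on specific queries points at the content or index changes that caused it.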
This page is intended to be actionable for engineering teams. For platform-specific details, cross-reference /platform/agents, /platform/orchestration, and /platform/knowledge.