Agentic RAG over an Enterprise Documentation Corpus

Context

Engineers spend an enormous amount of time locating the right passage across a sprawling technical-documentation corpus. I built an agentic retrieval system to collapse that search from a manual hunt into a grounded answer with citations.

What I built

Hybrid retrieval — a Qdrant vector store over BAAI/bge embeddings combined with a homegrown knowledge-graph RAG layer (entity and co-occurrence graph with community detection) so the system reasons over how documents relate, not just which are individually similar.
CUDA-accelerated OCR ingestion to bring scanned and image-based documents into the corpus cleanly.
A least-privilege, read-only tool bridge over the Model Context Protocol (MCP) as the enabling control: the agent can retrieve and cite but never write, and access is scoped and fails closed by default.
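
To make "read-only, scoped, and fail-closed" concrete, here is a minimal sketch of an authorization gate in plain Python. It is an illustration under stated assumptions, not the actual bridge and not MCP SDK code: the tool names (search_passages, fetch_document), the ToolCall shape, and the scope strings are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical tool names, for illustration only. The allowlist holds
# read operations exclusively; no write tool can be registered here.
READ_ONLY_TOOLS = frozenset({"search_passages", "fetch_document"})


@dataclass(frozen=True)
class ToolCall:
    tool: str   # name of the tool the agent wants to invoke
    scope: str  # corpus partition the call targets (assumed concept)


class ToolDenied(Exception):
    """Raised when a call is not explicitly permitted."""


def authorize(call: ToolCall, granted_scopes: frozenset) -> ToolCall:
    """Return the call unchanged only if it is explicitly allowed.

    Fail-closed: a tool missing from the allowlist, or a scope missing
    from the grant, is rejected. There is no default-allow branch.
    """
    if call.tool not in READ_ONLY_TOOLS:
        raise ToolDenied(f"tool not on read-only allowlist: {call.tool}")
    if call.scope not in granted_scopes:
        raise ToolDenied(f"scope not granted: {call.scope}")
    return call


def is_allowed(call: ToolCall, granted_scopes: frozenset) -> bool:
    """Boolean convenience wrapper around authorize()."""
    try:
        authorize(call, granted_scopes)
        return True
    except ToolDenied:
        return False
```

In this sketch an empty grant denies every call, so a missing or misconfigured policy results in no access rather than full access.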

Measured result

Specification-research time dropped materially, a result I measured myself and validated in day-to-day work. The specific figure is available on request. The capability also moved beyond me: colleagues now use it.

The corpus itself and the systems it lives in are deliberately unnamed here. This entry is capability-focused by design.