<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>MCP on onseok</title><link>https://onseok.github.io/tags/mcp/</link><description>Recent content in MCP on onseok</description><generator>Hugo</generator><language>en-us</language><copyright>© 2026 onseok</copyright><lastBuildDate>Tue, 31 Mar 2026 00:00:00 +0900</lastBuildDate><atom:link href="https://onseok.github.io/tags/mcp/index.xml" rel="self" type="application/rss+xml"/><item><title>Building a Production RAG System: From Hybrid Search to Agentic Retrieval</title><link>https://onseok.github.io/posts/building-production-rag-system/</link><pubDate>Tue, 31 Mar 2026 00:00:00 +0900</pubDate><guid>https://onseok.github.io/posts/building-production-rag-system/</guid><description>&lt;p&gt;There&amp;rsquo;s no shortage of RAG tutorials online, but most stop at the &amp;ldquo;hello world&amp;rdquo; stage: embed some documents, throw them into a vector database, retrieve top-k, and feed them to an LLM. That works for demos. It falls apart in production.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ve been building a RAG system at work — one that searches across thousands of internal documents (engineering issues, SDK source code, design specs, technical docs) and serves results to LLM agents via &lt;a href="https://modelcontextprotocol.io/"&gt;MCP&lt;/a&gt;. I can&amp;rsquo;t go into the specifics of the product or infrastructure, but the technical challenges and lessons are universal enough to share.&lt;/p&gt;</description></item></channel></rss>