Agentic RAG - How we will build LLM applications in the future
I just watched a video of Jerry Liu, founder of LlamaIndex, explaining agentic RAG. Here’s my understanding.
The most common applications of LLMs are basically “knowledge assistants,” whether in search or in a conversational interface.
The next step many people are excited about isn’t just the knowledge-synthesis part but the agentic part: putting that knowledge into action for you.
RAG was just the beginning. Basic RAG is boring; it’s little more than enhanced search.
So the key question is: how do we use LLMs to build better knowledge assistants? Jerry outlines three steps on the horizon that any developer can implement today:
First, any LLM app is only as good as its data. You need good tools to ingest, index, and retrieve that data. This is a necessary component: garbage in, garbage out (GIGO).
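As a concrete starting point, here is a minimal ingestion-and-retrieval sketch using LlamaIndex (assuming a recent llama-index release; the `data/` directory and the query string are placeholders, and exact import paths vary between versions):

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Load and parse local files (text, PDFs, etc.) into documents.
documents = SimpleDirectoryReader("data").load_data()

# Chunk, embed, and index the documents for retrieval
# (uses the default embedding model and LLM, which require an API key).
index = VectorStoreIndex.from_documents(documents)

# Basic RAG: retrieve relevant chunks and synthesize an answer.
query_engine = index.as_query_engine()
print(query_engine.query("What does the onboarding policy say about laptops?"))
```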
Second, advanced agents don’t just do RAG: they add dynamic planning, tool use, and conversation memory. Instead of merely passing your question along, the agent uses LLMs and other APIs extensively throughout its reasoning process (= agentic RAG).
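A hedged sketch of what that can look like in LlamaIndex: wrap the query engine from the previous snippet as a tool and hand it to a ReAct-style agent that plans, calls tools, and keeps chat memory. The tool name and description are made up, and the agent classes may differ between llama-index versions:

```python
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool, ToolMetadata

# Expose the RAG pipeline as a tool the agent can decide to call.
docs_tool = QueryEngineTool(
    query_engine=query_engine,  # from the previous snippet
    metadata=ToolMetadata(
        name="company_docs",
        description="Answers questions about internal company documents.",
    ),
)

# The ReAct agent plans, picks tools, and keeps conversation memory
# across turns instead of blindly forwarding the raw question.
agent = ReActAgent.from_tools([docs_tool], verbose=True)
print(agent.chat("Summarize the onboarding policy, then list any open questions."))
```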
Third, those agents usually do better as specialists. Multi-agent systems can do substantially better than a single agent, and they can be parallelized: when you ask a question, some routing logic (or a top-level agent) decides which specialist agent is best suited to generate the knowledge or even act on it. This can happen in loops, in pieces, or in parallel.
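The routing idea itself is framework-agnostic. Here is a toy, framework-free sketch of it; the two specialist agents and the `classify` helper are hypothetical stand-ins for real LLM agents and an LLM-based router:

```python
import asyncio

# Hypothetical specialist agents; in practice each would be an LLM agent
# with its own tools, indexes, and prompts.
async def research_agent(question: str) -> str:
    return f"[research] findings for: {question}"

async def action_agent(question: str) -> str:
    return f"[action] executed task for: {question}"

SPECIALISTS = {"research": research_agent, "action": action_agent}

def classify(question: str) -> str:
    # Stand-in for an LLM-based router that picks the best specialist.
    return "action" if question.lower().startswith(("book", "send", "create")) else "research"

async def answer(question: str) -> str:
    # Route each question to the specialist best suited to handle it.
    return await SPECIALISTS[classify(question)](question)

async def main() -> None:
    questions = ["Summarize our Q3 sales notes", "Send the summary to the team"]
    # Independent questions can be handled by specialists in parallel.
    for result in await asyncio.gather(*(answer(q) for q in questions)):
        print(result)

asyncio.run(main())
```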
LlamaIndex provides a complete roadmap for tackling most of these challenges, which is great to know! But of course, there are many other projects out there, including a bunch of recently released multi-agent frameworks, such as CrewAI and LangGraph. Here’s a short list.
Watch the full video for a more complete take on things: