LlamaIndex

Definition

LlamaIndex focuses on connecting LLMs to your data: ingestion, indexing, and querying. It provides flexible RAG pipelines, multiple index types, and evaluation tools.

It complements LangChain: LlamaIndex emphasizes the data layer (documents, embeddings, vector stores, indexing strategies). Use it when your priority is robust RAG over your own docs, APIs, or databases, with control over chunking, retrieval, and synthesis; it also supports agents and query engines.

How it works

  • Load data from documents, APIs, or databases into a unified document format.
  • Build indices: a vector index (embeddings + vector store), a keyword index, or a hybrid. You choose the node parser (chunking), the embedding model, and the index type.
  • Query engines run retrieval (optionally with reranking) and then synthesis: the LLM answers from the retrieved nodes.
  • Retrievers, node parsers, and response synthesis strategies (e.g. tree summarization, simple concatenation) are all customizable.
  • Evaluation tools (e.g. faithfulness, relevance) help tune chunking and retrieval for production RAG.
  • Agents can use LlamaIndex query engines as tools, inside LangChain or native agent loops.
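The load → chunk → index → retrieve → synthesize flow above can be sketched in plain Python. This is a conceptual sketch, not LlamaIndex's actual API: the function names, the fixed-size word chunker, and the keyword-overlap retriever are illustrative stand-ins (a real pipeline would use embedding similarity and an LLM call for synthesis).

```python
# Conceptual RAG pipeline sketch: load -> chunk -> index -> retrieve -> synthesize.
# NOT LlamaIndex's API; names and the keyword-overlap scorer are illustrative.

def chunk(text: str, size: int = 40) -> list[str]:
    """Split a document into fixed-size word chunks (stand-in for a node parser)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(query: str, nodes: list[str], top_k: int = 2) -> list[str]:
    """Rank nodes by keyword overlap with the query (stand-in for vector retrieval)."""
    q = set(query.lower().split())
    ranked = sorted(nodes, key=lambda n: len(q & set(n.lower().split())), reverse=True)
    return ranked[:top_k]

def synthesize(query: str, nodes: list[str]) -> str:
    """Assemble retrieved nodes into a prompt; an LLM call would answer from it."""
    context = "\n".join(nodes)
    return f"Context:\n{context}\n\nQuestion: {query}"

# "Load" two toy documents, parse them into nodes, then query.
docs = [
    "The refund policy allows returns within 30 days of purchase.",
    "Shipping is free for orders over 50 dollars.",
]
nodes = [n for d in docs for n in chunk(d)]
question = "What is the refund policy?"
prompt = synthesize(question, retrieve(question, nodes))
```

Swapping the keyword scorer for embedding similarity, and the prompt assembly for a response synthesizer, gives you the shape of a real LlamaIndex query engine.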

Use cases

LlamaIndex fits when you need flexible RAG indexing, query engines, and evaluation over your own data and APIs.

  • RAG and document Q&A with flexible indexing and query engines
  • Connecting LLMs to internal data (docs, APIs, databases)
  • Evaluating and tuning retrieval and synthesis for production RAG
