| layout | default |
|---|---|
| title | Haystack Tutorial |
| nav_order | 23 |
| has_children | true |
| format_version | v2 |
Project: Haystack — An open-source framework for building production-ready LLM applications, RAG pipelines, and intelligent search systems.
Haystack is increasingly relevant for developers working with modern AI/ML infrastructure. This track helps you understand its architecture, key patterns, and production considerations.
This track focuses on:
- getting started with Haystack
- document stores
- retrievers and search
- generators and LLMs
Haystack is an open-source LLM framework by deepset for building composable AI pipelines. It provides a modular, component-based architecture that combines retrieval, generation, and evaluation into production-ready workflows. Haystack supports dozens of LLM providers, vector databases, and retrieval strategies out of the box.
| Feature | Description |
|---|---|
| Pipeline System | Directed graph of components with typed inputs/outputs and automatic validation |
| RAG | First-class retrieval-augmented generation with hybrid search (BM25 + embedding) |
| Multi-Provider | OpenAI, Anthropic, Cohere, Google, Hugging Face, Ollama, and more |
| Document Stores | In-memory, Elasticsearch, OpenSearch, Pinecone, Qdrant, Weaviate, Chroma, pgvector |
| Evaluation | Built-in metrics (MRR, MAP, NDCG) and LLM-based evaluation components |
| Custom Components | @component decorator for building reusable pipeline nodes with typed I/O |
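
To make the "typed inputs/outputs and automatic validation" row concrete, here is a minimal sketch in plain Python of how a connection between two components could be checked before a pipeline runs. It is not Haystack's implementation: `ComponentSpec` and `validate_connection` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ComponentSpec:
    """Hypothetical description of a component's typed input/output sockets."""
    name: str
    inputs: Dict[str, type] = field(default_factory=dict)
    outputs: Dict[str, type] = field(default_factory=dict)


def validate_connection(sender: ComponentSpec, out_name: str,
                        receiver: ComponentSpec, in_name: str) -> None:
    """Raise if the sender's output cannot feed the receiver's input."""
    if out_name not in sender.outputs:
        raise ValueError(f"{sender.name} has no output '{out_name}'")
    if in_name not in receiver.inputs:
        raise ValueError(f"{receiver.name} has no input '{in_name}'")
    out_type = sender.outputs[out_name]
    in_type = receiver.inputs[in_name]
    # The output type must be the input type or a subclass of it.
    if not issubclass(out_type, in_type):
        raise TypeError(
            f"{sender.name}.{out_name} ({out_type.__name__}) does not match "
            f"{receiver.name}.{in_name} ({in_type.__name__})"
        )


# Assumed example sockets: a retriever emits a list, a prompt builder accepts one.
retriever = ComponentSpec("retriever", outputs={"documents": list})
prompt_builder = ComponentSpec(
    "prompt_builder", inputs={"documents": list, "query": str}
)
validate_connection(retriever, "documents", prompt_builder, "documents")
```

Checking types at connection time, rather than at run time, means a mismatched pipeline fails before any documents are processed or any LLM call is made.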
- repository: deepset-ai/haystack
- stars: about 24.6k
- latest release: v2.26.1 (published 2026-03-20)
```mermaid
graph TB
    subgraph Ingestion["Ingestion Pipeline"]
        FILES[File Converters]
        SPLIT[Document Splitter]
        EMBED_D[Document Embedder]
        WRITER[Document Writer]
    end
    subgraph Store["Document Stores"]
        MEM[In-Memory]
        ES[Elasticsearch]
        PG[pgvector]
        VEC[Pinecone / Qdrant / Weaviate]
    end
    subgraph Query["Query Pipeline"]
        EMBED_Q[Query Embedder]
        BM25[BM25 Retriever]
        EMB_RET[Embedding Retriever]
        JOINER[Document Joiner]
        RANKER[Ranker]
        PROMPT[Prompt Builder]
        GEN[Generator / LLM]
    end
    FILES --> SPLIT --> EMBED_D --> WRITER
    WRITER --> Store
    Store --> BM25
    Store --> EMB_RET
    EMBED_Q --> EMB_RET
    BM25 --> JOINER
    EMB_RET --> JOINER
    JOINER --> RANKER --> PROMPT --> GEN
```
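
In the query pipeline above, the Document Joiner merges the BM25 and embedding result lists before ranking. One common way to merge ranked lists is reciprocal rank fusion (RRF). The sketch below shows that technique in plain Python as an illustration of the joining step; it is not taken from Haystack's source, and the function name and the constant `k = 60` are assumptions chosen for this example.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked highly by several retrievers rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword retriever and an embedding retriever.
bm25_hits = ["a", "b", "c"]
embedding_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([bm25_hits, embedding_hits])
```

RRF only uses rank positions, so it avoids having to normalize BM25 scores and cosine similarities onto a common scale.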
| Chapter | Topic | What You'll Learn |
|---|---|---|
| 1. Getting Started | Setup | Installation, first RAG pipeline, architecture overview |
| 2. Document Stores | Storage | Store backends, indexing, preprocessing, multi-store patterns |
| 3. Retrievers & Search | Retrieval | BM25, embedding, hybrid search, filtering, re-ranking |
| 4. Generators & LLMs | Generation | Multi-provider LLMs, prompt engineering, streaming, chat |
| 5. Pipelines & Workflows | Composition | Pipeline graph, branching, loops, serialization, async |
| 6. Evaluation & Optimization | Quality | Retrieval metrics, LLM evaluation, A/B testing, optimization |
| 7. Custom Components | Extensibility | @component decorator, typed I/O, testing, packaging |
| 8. Production Deployment | Operations | REST API, Docker, Kubernetes, monitoring, scaling |
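
Chapter 6 covers retrieval metrics such as MRR. As a preview, here is a minimal plain-Python sketch of mean reciprocal rank, written from the standard definition rather than from Haystack's evaluator components. The function name and the input layout are assumptions for illustration.

```python
from typing import List, Set


def mean_reciprocal_rank(results: List[List[str]], relevant: List[Set[str]]) -> float:
    """Average of 1 / rank of the first relevant document for each query.

    results[i] is the ranked list of document IDs returned for query i;
    relevant[i] is the set of IDs judged relevant for that query.
    A query with no relevant document in its results contributes 0.
    """
    if not results:
        return 0.0
    total = 0.0
    for ranked, relevant_ids in zip(results, relevant):
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id in relevant_ids:
                total += 1.0 / rank
                break
    return total / len(results)


# Three hypothetical queries: hit at rank 1, hit at rank 2, and no hit.
score = mean_reciprocal_rank(
    [["a", "b"], ["c", "d"], ["e", "f"]],
    [{"a"}, {"d"}, {"x"}],
)
```

Here the three queries contribute 1, 1/2, and 0, so the score is (1 + 0.5 + 0) / 3 = 0.5.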
| Component | Technology |
|---|---|
| Language | Python 3.9+ |
| Pipeline Engine | Custom directed graph with topological execution |
| Serialization | YAML / JSON pipeline definitions |
| Embeddings | Sentence Transformers, OpenAI, Cohere, Fastembed |
| Vector Search | FAISS, Pinecone, Qdrant, Weaviate, Chroma, pgvector |
| Text Search | Elasticsearch, OpenSearch, BM25 (in-memory) |
| LLM Providers | OpenAI, Anthropic, Google, Cohere, Hugging Face, Ollama |
| API Layer | Hayhooks (FastAPI-based pipeline serving) |
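
The "directed graph with topological execution" row can be illustrated with the standard-library `graphlib` module (available since Python 3.9). This is a simplified sketch of the idea, not Haystack's pipeline engine: `run_graph` and the lambda components are hypothetical, and the sketch handles only acyclic graphs.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List, Set, Tuple

Component = Callable[[Dict[str, Any]], Any]


def run_graph(components: Dict[str, Component],
              edges: List[Tuple[str, str]]) -> Dict[str, Any]:
    """Run each component after all of its upstream components have run.

    components maps a name to a callable that receives a dict of upstream
    results (keyed by upstream component name) and returns its own result.
    edges is a list of (sender, receiver) pairs.
    """
    predecessors: Dict[str, Set[str]] = {name: set() for name in components}
    for sender, receiver in edges:
        predecessors[receiver].add(sender)

    results: Dict[str, Any] = {}
    # static_order() yields every node after all of its predecessors.
    for name in TopologicalSorter(predecessors).static_order():
        upstream = {p: results[p] for p in predecessors[name]}
        results[name] = components[name](upstream)
    return results


# A toy three-step pipeline with made-up components.
toy_components: Dict[str, Component] = {
    "query": lambda _: "haystack",
    "retriever": lambda up: [f"doc about {up['query']}"],
    "prompt": lambda up: f"Context: {up['retriever'][0]}",
}
toy_results = run_graph(
    toy_components, [("query", "retriever"), ("retriever", "prompt")]
)
```

`TopologicalSorter` raises `graphlib.CycleError` if the edges form a cycle, so this sketch rejects loops outright rather than executing them.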
Ready to begin? Start with Chapter 1: Getting Started.
Built with insights from the Haystack repository and community documentation.
- Core architecture and key abstractions
- Practical patterns for production use
- Integration and extensibility approaches
- Chapter 1: Getting Started with Haystack
- Chapter 2: Document Stores
- Chapter 3: Retrievers & Search
- Chapter 4: Generators & LLMs
- Chapter 5: Pipelines & Workflows
- Chapter 6: Evaluation & Optimization
- Chapter 7: Custom Components
- Chapter 8: Production Deployment
Generated by AI Codebase Knowledge Builder