A Flask-based intelligent chatbot that uses Retrieval-Augmented Generation (RAG) to answer questions about technical courses. Built with LangChain, Hugging Face models, and FAISS for efficient vector search and contextual responses.
This application creates an AI-powered Q&A system that:
- 🌐 Scrapes real course data from technical websites
- 🔍 Understands user queries through semantic search
- 🤖 Generates accurate, contextual answers using AI
- ⚡ Serves responses via a clean REST API
```mermaid
graph TB
    subgraph "🔧 Setup Phase"
        A1[🌐 Web Scraping<br/>Brainlox Courses] --> A2[📄 Text Processing<br/>Clean & Chunk]
        A2 --> A3[🔢 Generate Embeddings<br/>384-dimensional vectors]
        A3 --> A4[💾 FAISS Index<br/>Vector Database]
    end

    subgraph "🚀 Runtime Phase"
        B1[📝 User Query] --> B2[🔍 Vector Search<br/>FAISS Similarity]
        B2 --> B3[📋 Retrieved Context<br/>Relevant Chunks]
        B3 --> B4[🤖 LLM Generation<br/>Falcon 7B]
        B4 --> B5[📤 JSON Response]
    end

    A4 -.-> B2

    style A1 fill:#e3f2fd
    style B1 fill:#e8f5e8
    style B4 fill:#fff3e0
```
```mermaid
flowchart TD
    A[👤 User Query<br/>'What Python courses are available?'] --> B[🔍 RETRIEVAL<br/>Convert to vector & search FAISS]
    B --> C[📋 Top 3 relevant course chunks]
    C --> D[📈 AUGMENTATION<br/>Combine query + context]
    D --> E[🎯 GENERATION<br/>Falcon 7B generates answer]
    E --> F[📤 Final Response<br/>Contextual course information]

    style A fill:#e1f5fe
    style B fill:#f3e5f5
    style D fill:#e8f5e8
    style E fill:#fff3e0
    style F fill:#fce4ec
```
| Component | Technology | Purpose |
|---|---|---|
| Web Framework | Flask | REST API server |
| AI Orchestration | LangChain | RAG pipeline management |
| Vector Database | FAISS | Fast similarity search |
| Language Model | Falcon 7B (Hugging Face) | Text generation |
| Embeddings | all-MiniLM-L6-v2 | Semantic vector creation |
| Web Scraping | WebBaseLoader | Data collection |
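A `requirements.txt` matching this stack might look like the following. This is illustrative only: exact package names and version pins depend on your LangChain release (recent versions split `langchain-community` and `langchain-huggingface` out of the core package).

```
flask
langchain
langchain-community
langchain-huggingface
faiss-cpu
sentence-transformers
python-dotenv
requests
```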
- Python 3.8+
- Hugging Face Account (free)
- Internet Connection (for model access)
- 4GB+ RAM (for embeddings)
```bash
# Clone the repository
git clone https://github.com/Anshuman-git-code/Langchain_Flask_Bot.git
cd Langchain_Flask_Bot

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On Mac/Linux:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

Create a `.env` file in the project root:

```env
HUGGINGFACEHUB_API_TOKEN=your_huggingface_token_here
USER_AGENT=TechnicalCoursesBot/1.0
```

Get your Hugging Face token:
- Go to Hugging Face
- Create account → Settings → Access Tokens
- Create new token with "Read" permission
```bash
python generate_faiss.py
```

Expected output:

```
Loading documents from https://brainlox.com/courses/category/technical
Splitting documents into chunks...
Generating embeddings...
Creating FAISS index...
FAISS index created successfully!
```
```bash
python app.py
```

The server will start at `http://localhost:5003`.
```
POST /chat
Content-Type: application/json
```

Request body:

```json
{
  "query": "What Python courses are available?"
}
```

Example response:

```json
{
  "response": "Based on the available courses, here are the Python-related options: 1. Python for Beginners - covers basic syntax and programming concepts, 2. Advanced Python Development - focuses on frameworks like Django and Flask..."
}
```

Using cURL:

```bash
curl -X POST http://localhost:5003/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "What are the best web development frameworks?"}'
```

Using Python:

```python
import requests
import json

url = "http://localhost:5003/chat"
data = {"query": "Tell me about machine learning courses"}

response = requests.post(url, json=data)
print(json.dumps(response.json(), indent=2))
```
1. 🌐 Web Scraping

   ```python
   loader = WebBaseLoader(["https://brainlox.com/courses/category/technical"])
   docs = loader.load()
   ```

2. ✂️ Text Chunking

   ```python
   text_splitter = RecursiveCharacterTextSplitter(
       chunk_size=500,   # Each chunk max 500 characters
       chunk_overlap=50  # 50 characters overlap
   )
   ```

3. 🔢 Vector Embeddings

   ```python
   embedding_model = HuggingFaceEmbeddings(
       model_name="sentence-transformers/all-MiniLM-L6-v2"
   )
   ```

4. 💾 FAISS Index Creation

   ```python
   vectorstore = FAISS.from_documents(documents, embedding_model)
   vectorstore.save_local("faiss_index")
   ```
1. 🔍 Query Processing
   - Convert the user query to a 384-dimensional vector
   - Search the FAISS index for similar content
2. 📋 Context Retrieval
   - Retrieve the top 3 most relevant course chunks
   - Combine them with the original query
3. 🤖 AI Generation
   - Send the enhanced prompt to Falcon 7B
   - Generate a contextual response
4. 📤 Response Delivery
   - Format as JSON
   - Return to the client
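The four runtime steps can be sketched end to end in plain Python. This is a toy illustration only: the real app uses MiniLM embeddings, FAISS search, and Falcon 7B via LangChain, while here a bag-of-words overlap stands in for vector similarity and the assembled prompt is returned instead of being sent to an LLM.

```python
def embed(text):
    # Stand-in embedder: word counts instead of MiniLM's 384-dim vectors.
    words = text.lower().split()
    return {w: words.count(w) for w in set(words)}

def similarity(a, b):
    # Overlap score between two sparse vectors (real code uses L2/cosine).
    return sum(a[w] * b.get(w, 0) for w in a)

KNOWLEDGE_BASE = [
    "Python for Beginners covers basic syntax and programming concepts.",
    "Advanced Python Development focuses on Django and Flask.",
    "Java Fundamentals introduces object-oriented programming.",
]

def answer(query, k=2):
    q = embed(query)
    # 1. RETRIEVAL: rank stored chunks by similarity to the query vector
    ranked = sorted(KNOWLEDGE_BASE,
                    key=lambda doc: similarity(q, embed(doc)), reverse=True)
    # 2. CONTEXT: keep the top-k chunks
    context = "\n".join(ranked[:k])
    # 3. AUGMENTATION: combine query + context into one prompt
    prompt = f"Context:\n{context}\n\nQuestion: {query}"
    # 4. GENERATION: the real app sends `prompt` to Falcon 7B here
    return prompt

print(answer("What Python courses are available?"))
```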
Retrieval-Augmented Generation combines:
- Retrieval: Finding relevant information from a knowledge base
- Augmentation: Enhancing queries with retrieved context
- Generation: Creating responses using both query and context
- ✅ Accuracy: Responses based on real data, not hallucinations
- ✅ Relevance: Semantic search finds contextually similar content
- ✅ Up-to-date: Knowledge base can be updated without retraining
- ✅ Transparency: Can trace responses back to source documents
Text is converted to numerical vectors that capture semantic meaning:
```
"Python programming" → [0.1, -0.3, 0.8, 0.2, ...]
"Java development"   → [0.2, -0.2, 0.7, 0.1, ...]
```
Similar concepts have similar vectors, enabling semantic search.
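A quick sketch of why this works: cosine similarity on toy vectors like the ones above (the values are illustrative, not real MiniLM outputs).

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product divided by the vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

python_vec = [0.1, -0.3, 0.8, 0.2]   # "Python programming"
java_vec   = [0.2, -0.2, 0.7, 0.1]   # "Java development"
cooking    = [-0.5, 0.9, -0.1, 0.4]  # an unrelated topic

print(cosine(python_vec, java_vec))  # close to 1: similar topics
print(cosine(python_vec, cooking))   # negative: dissimilar topics
```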
- Ultra-fast vector similarity search
- Handles millions of vectors efficiently
- Supports various distance metrics (L2, cosine, etc.)
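Conceptually, FAISS's simplest index (`IndexFlatL2`) performs an exhaustive L2 search over all stored vectors. Here is a NumPy sketch of that idea; FAISS itself is far faster and adds approximate indexes such as IVF and HNSW.

```python
import numpy as np

# What IndexFlatL2 does conceptually: brute-force L2 search.
rng = np.random.default_rng(0)
index_vectors = rng.normal(size=(1000, 384)).astype("float32")  # 1000 docs, 384-dim

def search(query_vec, k=3):
    # Squared L2 distance from the query to every stored vector
    dists = ((index_vectors - query_vec) ** 2).sum(axis=1)
    ids = np.argsort(dists)[:k]  # indices of the k nearest vectors
    return ids, dists[ids]

query = index_vectors[42] + 0.01  # a point very close to document 42
ids, dists = search(query)
print(ids[0])  # 42: the nearest stored vector
```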
```
Langchain_Flask_Bot/
├── app.py              # Main Flask application
├── generate_faiss.py   # FAISS index generation
├── requirements.txt    # Python dependencies
├── .env                # Environment variables
├── faiss_index/        # Generated vector database
│   ├── index.faiss     # FAISS index file
│   └── index.pkl       # Metadata
├── README.md           # This file
└── .gitignore          # Git ignore rules
```
| Variable | Description | Required |
|---|---|---|
| `HUGGINGFACEHUB_API_TOKEN` | Hugging Face API token | Yes |
| `USER_AGENT` | Custom user agent string | No |
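A minimal sketch of how the app might read these variables. The helper name `load_config` is hypothetical; the real code may use `python-dotenv` to populate the environment from `.env` first.

```python
import os

def load_config():
    # HUGGINGFACEHUB_API_TOKEN is required; fail fast if it is missing.
    token = os.environ.get("HUGGINGFACEHUB_API_TOKEN")
    if not token:
        raise RuntimeError("Set HUGGINGFACEHUB_API_TOKEN in your .env file")
    return {
        "token": token,
        # USER_AGENT is optional, so a default is supplied.
        "user_agent": os.environ.get("USER_AGENT", "TechnicalCoursesBot/1.0"),
    }
```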
```python
# Embedding model (can be changed)
embedding_model = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

# Language model (can be changed)
llm = HuggingFaceEndpoint(
    repo_id="tiiuae/falcon-7b-instruct",
    task="text-generation"
)
```

```python
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,   # Adjust based on your content
    chunk_overlap=50  # Prevent context loss
)
```

Run locally:

```bash
python app.py
```
Access the API at `http://localhost:5003`.

Docker:

```dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
RUN python generate_faiss.py
EXPOSE 5003
CMD ["python", "app.py"]
```

Serverless (AWS Lambda):

```python
from mangum import Mangum
handler = Mangum(app)
```

(Note: recent Mangum releases wrap ASGI apps only; a WSGI app like Flask may need an adapter such as `apig-wsgi` instead.)

Heroku:

```
# Procfile
web: python app.py
```

Memory footprint:

- FAISS Index: ~50MB for 1000 documents
- Embedding Model: ~100MB in memory
- Flask App: ~50MB base memory
Typical response times:

- Vector Search: ~10-50ms
- LLM Generation: ~1-3 seconds
- Total Response: ~2-4 seconds
- Caching: Cache frequent queries
- Async Processing: Use FastAPI for concurrent requests
- Index Optimization: Use IVF or HNSW for large datasets
- Model Optimization: Consider smaller models for faster inference
- ✅ API keys in environment variables
- ✅ No hardcoded credentials
- ✅ .env file in .gitignore
Input validation and error handling in the `/chat` route (excerpt):

```python
if not user_query or not user_query.strip():
    return jsonify({"error": "No query provided"}), 400

try:
    response = qa.invoke(user_query)
    return jsonify({"response": response})
except Exception as e:
    logging.error(f"Error: {str(e)}")
    return jsonify({"error": "Internal server error"}), 500
```

Common errors:

- `FileNotFoundError: FAISS index directory 'faiss_index' not found`. Solution: run `python generate_faiss.py` first.
- `HTTP 401: Unauthorized`. Solution: check your `HUGGINGFACEHUB_API_TOKEN` in `.env`.
- `RuntimeError: CUDA out of memory`. Solution: use CPU-only models or reduce the batch size.
Slow responses:

- Check your internet connection
- Consider using local models
- Implement caching
```python
# Enable debug logging
logging.basicConfig(level=logging.DEBUG)

# Run Flask in debug mode
app.run(debug=True)
```

Planned improvements:

- Add caching layer (Redis)
- Implement rate limiting
- Add health check endpoint
- Improve error messages
- Streaming responses for real-time feel
- Multiple data sources beyond Brainlox
- User session management
- Query analytics and logging
- Load balancing with multiple instances
- Database migration to managed services
- CDN integration for static assets
- Monitoring & alerting setup
```python
import unittest
from app import app

class TestChatAPI(unittest.TestCase):
    def setUp(self):
        self.app = app.test_client()

    def test_chat_endpoint(self):
        response = self.app.post('/chat',
                                 json={"query": "What is Python?"})
        self.assertEqual(response.status_code, 200)
```

Test the complete pipeline:

```bash
python -m pytest tests/
```

Contributing:

1. Fork the repository
2. Create a feature branch: `git checkout -b feature/new-feature`
3. Commit your changes: `git commit -am 'Add new feature'`
4. Push to the branch: `git push origin feature/new-feature`
5. Submit a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- LangChain for the excellent AI framework
- Hugging Face for democratizing AI models
- Facebook AI for the FAISS library
- Brainlox for the course data
- GitHub: @Anshuman-git-code
- Email: [email protected]
⭐ Star this repository if you found it helpful!
🐛 Found a bug? Create an issue
🚀 Want to contribute? Check our guidelines