A Flask-based intelligent chatbot that uses Retrieval-Augmented Generation (RAG) to answer questions about technical courses. Built with LangChain, Hugging Face models, and FAISS for efficient vector search and contextual responses.
This application creates an AI-powered Q&A system that:
- 🌐 Scrapes real course data from technical websites
- 🔍 Understands user queries through semantic search
- 🤖 Generates accurate, contextual answers using AI
- ⚡ Serves responses via a clean REST API
```mermaid
graph TB
    subgraph "🔧 Setup Phase"
        A1[🌐 Web Scraping<br/>Brainlox Courses] --> A2[📄 Text Processing<br/>Clean & Chunk]
        A2 --> A3[🔢 Generate Embeddings<br/>384-dimensional vectors]
        A3 --> A4[💾 FAISS Index<br/>Vector Database]
    end

    subgraph "🚀 Runtime Phase"
        B1[📝 User Query] --> B2[🔍 Vector Search<br/>FAISS Similarity]
        B2 --> B3[📋 Retrieved Context<br/>Relevant Chunks]
        B3 --> B4[🤖 LLM Generation<br/>Falcon 7B]
        B4 --> B5[📤 JSON Response]
    end

    A4 -.-> B2

    style A1 fill:#e3f2fd
    style B1 fill:#e8f5e8
    style B4 fill:#fff3e0
```
```mermaid
flowchart TD
    A[👤 User Query<br/>'What Python courses are available?'] --> B[🔍 RETRIEVAL<br/>Convert to vector & search FAISS]
    B --> C[📋 Top 3 relevant course chunks]
    C --> D[📈 AUGMENTATION<br/>Combine query + context]
    D --> E[🎯 GENERATION<br/>Falcon 7B generates answer]
    E --> F[📤 Final Response<br/>Contextual course information]

    style A fill:#e1f5fe
    style B fill:#f3e5f5
    style D fill:#e8f5e8
    style E fill:#fff3e0
    style F fill:#fce4ec
```
| Component | Technology | Purpose |
|---|---|---|
| Web Framework | Flask | REST API server |
| AI Orchestration | LangChain | RAG pipeline management |
| Vector Database | FAISS | Fast similarity search |
| Language Model | Falcon 7B (Hugging Face) | Text generation |
| Embeddings | all-MiniLM-L6-v2 | Semantic vector creation |
| Web Scraping | WebBaseLoader | Data collection |
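A `requirements.txt` matching this stack might look like the following. This is illustrative only: exact package names and version pins depend on your LangChain release (recent versions split `langchain-community` and `langchain-huggingface` out of the core package).

```
flask
langchain
langchain-community
langchain-huggingface
faiss-cpu
sentence-transformers
python-dotenv
requests
```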
- Python 3.8+
- Hugging Face Account (free)
- Internet Connection (for model access)
- 4GB+ RAM (for embeddings)
```bash
# Clone the repository
git clone https://github.com/Anshuman-git-code/Langchain_Flask_Bot.git
cd Langchain_Flask_Bot

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On Mac/Linux:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

Create a `.env` file in the project root:

```env
HUGGINGFACEHUB_API_TOKEN=your_huggingface_token_here
USER_AGENT=TechnicalCoursesBot/1.0
```

Get your Hugging Face token:
- Go to Hugging Face
- Create account → Settings → Access Tokens
- Create new token with "Read" permission
```bash
python generate_faiss.py
```

Expected output:

```
Loading documents from https://brainlox.com/courses/category/technical
Splitting documents into chunks...
Generating embeddings...
Creating FAISS index...
FAISS index created successfully!
```
```bash
python app.py
```

The server will start at `http://localhost:5003`.
```
POST /chat
Content-Type: application/json
```

Request body:

```json
{
  "query": "What Python courses are available?"
}
```

Example response:

```json
{
  "response": "Based on the available courses, here are the Python-related options: 1. Python for Beginners - covers basic syntax and programming concepts, 2. Advanced Python Development - focuses on frameworks like Django and Flask..."
}
```

Using cURL:

```bash
curl -X POST http://localhost:5003/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "What are the best web development frameworks?"}'
```

Using Python:

```python
import requests
import json

url = "http://localhost:5003/chat"
data = {"query": "Tell me about machine learning courses"}

response = requests.post(url, json=data)
print(json.dumps(response.json(), indent=2))
```
1. 🌐 Web Scraping

   ```python
   loader = WebBaseLoader(["https://brainlox.com/courses/category/technical"])
   docs = loader.load()
   ```

2. ✂️ Text Chunking

   ```python
   text_splitter = RecursiveCharacterTextSplitter(
       chunk_size=500,   # Each chunk max 500 characters
       chunk_overlap=50  # 50 characters overlap
   )
   ```

3. 🔢 Vector Embeddings

   ```python
   embedding_model = HuggingFaceEmbeddings(
       model_name="sentence-transformers/all-MiniLM-L6-v2"
   )
   ```

4. 💾 FAISS Index Creation

   ```python
   vectorstore = FAISS.from_documents(documents, embedding_model)
   vectorstore.save_local("faiss_index")
   ```
1. 🔍 Query Processing
   - Convert the user query to a 384-dimensional vector
   - Search the FAISS index for similar content
2. 📋 Context Retrieval
   - Retrieve the top 3 most relevant course chunks
   - Combine them with the original query
3. 🤖 AI Generation
   - Send the enhanced prompt to Falcon 7B
   - Generate a contextual response
4. 📤 Response Delivery
   - Format as JSON
   - Return to the client
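The four runtime steps can be sketched end to end in plain Python. This is a toy illustration only: the real app uses MiniLM embeddings, FAISS search, and Falcon 7B via LangChain, while here a bag-of-words overlap stands in for vector similarity and the assembled prompt is returned instead of being sent to an LLM.

```python
def embed(text):
    # Stand-in embedder: word counts instead of MiniLM's 384-dim vectors.
    words = text.lower().split()
    return {w: words.count(w) for w in set(words)}

def similarity(a, b):
    # Overlap score between two sparse vectors (real code uses L2/cosine).
    return sum(a[w] * b.get(w, 0) for w in a)

KNOWLEDGE_BASE = [
    "Python for Beginners covers basic syntax and programming concepts.",
    "Advanced Python Development focuses on Django and Flask.",
    "Java Fundamentals introduces object-oriented programming.",
]

def answer(query, k=2):
    q = embed(query)
    # 1. RETRIEVAL: rank stored chunks by similarity to the query vector
    ranked = sorted(KNOWLEDGE_BASE,
                    key=lambda doc: similarity(q, embed(doc)), reverse=True)
    # 2. CONTEXT: keep the top-k chunks
    context = "\n".join(ranked[:k])
    # 3. AUGMENTATION: combine query + context into one prompt
    prompt = f"Context:\n{context}\n\nQuestion: {query}"
    # 4. GENERATION: the real app sends `prompt` to Falcon 7B here
    return prompt

print(answer("What Python courses are available?"))
```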
Retrieval-Augmented Generation combines:
- Retrieval: Finding relevant information from a knowledge base
- Augmentation: Enhancing queries with retrieved context
- Generation: Creating responses using both query and context
- ✅ Accuracy: Responses based on real data, not hallucinations
- ✅ Relevance: Semantic search finds contextually similar content
- ✅ Up-to-date: Knowledge base can be updated without retraining
- ✅ Transparency: Can trace responses back to source documents
Text is converted to numerical vectors that capture semantic meaning:
```
"Python programming" → [0.1, -0.3, 0.8, 0.2, ...]
"Java development"   → [0.2, -0.2, 0.7, 0.1, ...]
```
Similar concepts have similar vectors, enabling semantic search.
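A quick sketch of why this works: cosine similarity on toy vectors like the ones above (the values are illustrative, not real MiniLM outputs).

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product divided by the vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

python_vec = [0.1, -0.3, 0.8, 0.2]   # "Python programming"
java_vec   = [0.2, -0.2, 0.7, 0.1]   # "Java development"
cooking    = [-0.5, 0.9, -0.1, 0.4]  # an unrelated topic

print(cosine(python_vec, java_vec))  # close to 1: similar topics
print(cosine(python_vec, cooking))   # negative: dissimilar topics
```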
- Ultra-fast vector similarity search
- Handles millions of vectors efficiently
- Supports various distance metrics (L2, cosine, etc.)
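Conceptually, FAISS's simplest index (`IndexFlatL2`) performs an exhaustive L2 search over all stored vectors. Here is a NumPy sketch of that idea; FAISS itself is far faster and adds approximate indexes such as IVF and HNSW.

```python
import numpy as np

# What IndexFlatL2 does conceptually: brute-force L2 search.
rng = np.random.default_rng(0)
index_vectors = rng.normal(size=(1000, 384)).astype("float32")  # 1000 docs, 384-dim

def search(query_vec, k=3):
    # Squared L2 distance from the query to every stored vector
    dists = ((index_vectors - query_vec) ** 2).sum(axis=1)
    ids = np.argsort(dists)[:k]  # indices of the k nearest vectors
    return ids, dists[ids]

query = index_vectors[42] + 0.01  # a point very close to document 42
ids, dists = search(query)
print(ids[0])  # 42: the nearest stored vector
```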
```
Langchain_Flask_Bot/
├── app.py              # Main Flask application
├── generate_faiss.py   # FAISS index generation
├── requirements.txt    # Python dependencies
├── .env                # Environment variables
├── faiss_index/        # Generated vector database
│   ├── index.faiss     # FAISS index file
│   └── index.pkl       # Metadata
├── README.md           # This file
└── .gitignore          # Git ignore rules
```
| Variable | Description | Required |
|---|---|---|
| `HUGGINGFACEHUB_API_TOKEN` | Hugging Face API token | Yes |
| `USER_AGENT` | Custom user agent string | No |
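A minimal sketch of how the app might read these variables. The helper name `load_config` is hypothetical; the real code may use `python-dotenv` to populate the environment from `.env` first.

```python
import os

def load_config():
    # HUGGINGFACEHUB_API_TOKEN is required; fail fast if it is missing.
    token = os.environ.get("HUGGINGFACEHUB_API_TOKEN")
    if not token:
        raise RuntimeError("Set HUGGINGFACEHUB_API_TOKEN in your .env file")
    return {
        "token": token,
        # USER_AGENT is optional, so a default is supplied.
        "user_agent": os.environ.get("USER_AGENT", "TechnicalCoursesBot/1.0"),
    }
```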
```python
# Embedding model (can be changed)
embedding_model = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

# Language model (can be changed)
llm = HuggingFaceEndpoint(
    repo_id="tiiuae/falcon-7b-instruct",
    task="text-generation"
)
```

```python
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,   # Adjust based on your content
    chunk_overlap=50  # Prevent context loss
)
```

Run locally:

```bash
python app.py
```
Access the API at `http://localhost:5003`.

Docker:

```dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
RUN python generate_faiss.py
EXPOSE 5003
CMD ["python", "app.py"]
```

Serverless (AWS Lambda):

```python
from mangum import Mangum
handler = Mangum(app)
```

(Note: recent Mangum releases wrap ASGI apps only; a WSGI app like Flask may need an adapter such as `apig-wsgi` instead.)

Heroku:

```
# Procfile
web: python app.py
```

Memory footprint:

- FAISS Index: ~50MB for 1000 documents
- Embedding Model: ~100MB in memory
- Flask App: ~50MB base memory
Typical response times:

- Vector Search: ~10-50ms
- LLM Generation: ~1-3 seconds
- Total Response: ~2-4 seconds
- Caching: Cache frequent queries
- Async Processing: Use FastAPI for concurrent requests
- Index Optimization: Use IVF or HNSW for large datasets
- Model Optimization: Consider smaller models for faster inference
- ✅ API keys in environment variables
- ✅ No hardcoded credentials
- ✅ .env file in .gitignore
Input validation and error handling in the `/chat` route (excerpt):

```python
if not user_query or not user_query.strip():
    return jsonify({"error": "No query provided"}), 400

try:
    response = qa.invoke(user_query)
    return jsonify({"response": response})
except Exception as e:
    logging.error(f"Error: {str(e)}")
    return jsonify({"error": "Internal server error"}), 500
```

Common errors:

- `FileNotFoundError: FAISS index directory 'faiss_index' not found`. Solution: run `python generate_faiss.py` first.
- `HTTP 401: Unauthorized`. Solution: check your `HUGGINGFACEHUB_API_TOKEN` in `.env`.
- `RuntimeError: CUDA out of memory`. Solution: use CPU-only models or reduce the batch size.
Slow responses:

- Check your internet connection
- Consider using local models
- Implement caching
```python
# Enable debug logging
logging.basicConfig(level=logging.DEBUG)

# Run Flask in debug mode
app.run(debug=True)
```

Planned improvements:

- Add caching layer (Redis)
- Implement rate limiting
- Add health check endpoint
- Improve error messages
- Streaming responses for real-time feel
- Multiple data sources beyond Brainlox
- User session management
- Query analytics and logging
- Load balancing with multiple instances
- Database migration to managed services
- CDN integration for static assets
- Monitoring & alerting setup
```python
import unittest
from app import app

class TestChatAPI(unittest.TestCase):
    def setUp(self):
        self.app = app.test_client()

    def test_chat_endpoint(self):
        response = self.app.post('/chat',
                                 json={"query": "What is Python?"})
        self.assertEqual(response.status_code, 200)
```

Test the complete pipeline:

```bash
python -m pytest tests/
```

Contributing:

1. Fork the repository
2. Create a feature branch: `git checkout -b feature/new-feature`
3. Commit your changes: `git commit -am 'Add new feature'`
4. Push to the branch: `git push origin feature/new-feature`
5. Submit a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- LangChain for the excellent AI framework
- Hugging Face for democratizing AI models
- Facebook AI for the FAISS library
- Brainlox for the course data
- GitHub: @Anshuman-git-code
- Email: [email protected]
⭐ Star this repository if you found it helpful!
🐛 Found a bug? Create an issue
🚀 Want to contribute? Check our guidelines