🧠 Conversational Chatbot — From Rules to Attention

(Pre-Transformer NLP Systems)

This repository demonstrates the evolution of conversational AI systems before Transformers, built step by step to show why modern LLMs exist.

Instead of starting with Hugging Face models, this project reconstructs the failures and breakthroughs that led to Transformers — from deterministic rules to neural attention.

Focus: Concepts, architecture, and learning — not production polish.


🚀 What This Project Shows (TL;DR)

  • How rule-based chatbots work — and why they fail
  • How Seq2Seq (LSTM Encoder–Decoder) improved things — and why it still failed
  • How Attention solved the core bottleneck
  • Why Transformers were inevitable

This is a foundational NLP project, not a demo chatbot.


📁 Repository Structure

project-chatbot/
│
├── src/
│   ├── rule_based/
│   │   ├── intents.json
│   │   ├── chatbot.py
│   │   └── serve.py
│   │
│   └── seq2seq/
│       ├── data/
│       │   └── conversations.txt
│       ├── dataset.py
│       ├── model.py
│       ├── train.py
│       └── chat.py
│
├── requirements.txt
└── README.md

Stage 1 — Rule-Based Chatbot

What was built

  • Intent-based chatbot using pattern matching (sketched below)
  • Predefined responses
  • Fallback handling
  • Simple session memory
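
A minimal sketch of the idea, with a hypothetical intent list standing in for intents.json (the real schema and chatbot.py logic may differ):

import random
import re

# Hypothetical intents in the spirit of intents.json; the real schema may differ
INTENTS = [
    {"tag": "greeting", "patterns": ["hello", "hi", "hey"],
     "responses": ["Hello!", "Hi there!"]},
    {"tag": "goodbye", "patterns": ["bye", "see you"],
     "responses": ["Goodbye!"]},
]

def respond(user_text: str) -> str:
    text = user_text.lower()
    for intent in INTENTS:
        # Brittle keyword match: any listed pattern appearing as a word triggers the intent
        if any(re.search(r"\b" + re.escape(p) + r"\b", text) for p in intent["patterns"]):
            return random.choice(intent["responses"])
    # Fallback when no rule fires -- paraphrases the rule set doesn't know land here
    return "Sorry, I didn't understand that."

print(respond("hey there"))   # matches the "greeting" intent
print(respond("howdy"))       # falls through to the fallback

Every new phrasing needs a new pattern, which is exactly why rule systems become brittle and expensive to maintain.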

What it demonstrates

  • How early chatbots worked in production

  • Why rule systems are:

    • brittle
    • hard to scale
    • expensive to maintain

Run

python -m src.rule_based.serve

Stage 2 — Seq2Seq Chatbot (Encoder–Decoder)

What was built

  • LSTM Encoder–Decoder architecture (sketched below)
  • Teacher forcing during training
  • Token handling (<sos>, <eos>, <unk>)
  • Step-by-step decoding
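
A minimal sketch of the architecture, using made-up vocabulary and hidden sizes (the real model.py may differ). The encoder hands the decoder only its final hidden state, which is the context bottleneck discussed next:

import torch
import torch.nn as nn

VOCAB, EMB, HID = 1000, 64, 128         # hypothetical sizes

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.lstm = nn.LSTM(EMB, HID, batch_first=True)

    def forward(self, src):
        _, (h, c) = self.lstm(self.emb(src))
        return h, c                     # the whole input compressed into one state

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.lstm = nn.LSTM(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, tgt, state):
        # Teacher forcing: the gold previous token is fed at every step during training
        y, state = self.lstm(self.emb(tgt), state)
        return self.out(y), state

enc, dec = Encoder(), Decoder()
src = torch.randint(0, VOCAB, (2, 7))   # a batch of 2 tokenized source utterances
tgt = torch.randint(0, VOCAB, (2, 5))   # gold replies, beginning with <sos>
logits, _ = dec(tgt, enc(src))          # (2, 5, VOCAB): next-token predictions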

What it demonstrates

  • The context bottleneck problem
  • Why compressing a sentence into one vector fails
  • Why early neural chatbots produced vague or repetitive responses

Train

python -m src.seq2seq.train

Chat

python -m src.seq2seq.chat

Stage 3 — Seq2Seq + Attention (Bahdanau)

What changed

  • Encoder returns all hidden states
  • Decoder uses Bahdanau (additive) attention (see the sketch below)
  • Dynamic context vectors per decoding step
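
A minimal sketch of one attention-weighted decoding step, with hypothetical dimensions (the real implementation may differ):

import torch
import torch.nn as nn
import torch.nn.functional as F

HID = 128  # hypothetical hidden size

class BahdanauAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.W_dec = nn.Linear(HID, HID)    # projects the previous decoder state
        self.W_enc = nn.Linear(HID, HID)    # projects every encoder hidden state
        self.v = nn.Linear(HID, 1)

    def forward(self, dec_state, enc_outputs):
        # dec_state: (batch, HID); enc_outputs: (batch, src_len, HID)
        energy = self.v(torch.tanh(
            self.W_dec(dec_state).unsqueeze(1) + self.W_enc(enc_outputs)
        )).squeeze(-1)                      # (batch, src_len) relevance scores
        weights = F.softmax(energy, dim=-1) # alignment over source positions
        context = torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)
        return context, weights             # a fresh context vector at every step

attn = BahdanauAttention()
ctx, w = attn(torch.randn(2, HID), torch.randn(2, 9, HID))
print(ctx.shape, w.shape)                   # torch.Size([2, 128]) torch.Size([2, 9])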

What it demonstrates

  • Why attention was a breakthrough
  • How word-level alignment improves generation
  • Why attention is the core idea behind Transformers

Output quality is intentionally limited due to small data and RNN constraints — this is a learning project, not a production system.


⛔ Known Limitations (Intentional)

  • Repetitive responses
  • Weak generalization
  • Small dataset
  • No beam search or decoding tricks

These are not bugs — they are the historical reasons Transformers replaced RNN-based models.


🧠 Learning Timeline (How This Project Was Built)

Phase 1 — Deterministic NLP

  • Intent classification
  • Rule-based dialogue flow
  • Failure modes of handcrafted systems

Phase 2 — Neural Dialogue

  • Encoder–Decoder intuition
  • Teacher forcing
  • Exposure bias (illustrated in the sketch below)
  • Context bottleneck
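
A toy decoding loop, built around a hypothetical decoder_step helper, showing where teacher forcing and exposure bias come from:

import torch
import torch.nn as nn

VOCAB, HID = 50, 32                     # hypothetical toy sizes
emb, cell, out = nn.Embedding(VOCAB, HID), nn.GRUCell(HID, HID), nn.Linear(HID, VOCAB)

def decoder_step(prev_token, state):
    state = cell(emb(prev_token), state)
    return out(state), state            # next-token logits, updated state

gold = torch.randint(0, VOCAB, (1, 6))  # one gold target sequence; gold[:, 0] plays <sos>
teacher_forcing = True                  # True during training, False at inference

state, prev = torch.zeros(1, HID), gold[:, 0]
for t in range(1, gold.size(1)):
    logits, state = decoder_step(prev, state)
    # Teacher forcing feeds the true token; free-running feeds the model's own guess
    prev = gold[:, t] if teacher_forcing else logits.argmax(-1)

# Exposure bias: trained only on gold prefixes, the model never learns to recover
# from its own mistakes, so errors compound when it runs free at inference time.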

Phase 3 — Attention

  • Relevance scoring (“energy”), formalized below
  • Dynamic context vectors
  • Decoder alignment with encoder outputs
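
For reference, the standard Bahdanau formulation of these three steps, where s_{t-1} is the previous decoder state and h_i the i-th encoder hidden state:

e_{t,i}     = v^T \tanh(W_s s_{t-1} + W_h h_i)    (relevance score, "energy")
\alpha_{t,i} = \mathrm{softmax}_i(e_{t,i})          (alignment weights)
c_t         = \sum_i \alpha_{t,i} h_i               (dynamic context vector)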

Outcome

  • Clear understanding of:

    • why Seq2Seq failed
    • why attention fixed it
    • why Transformers exist

🧰 Requirements

torch>=2.0.0
numpy>=1.23.0

Install:

pip install -r requirements.txt

🎓 Key Takeaway

Most people use Transformers. Very few understand why they were needed.

This project closes that gap.


👤 Author

Tanish Sarkar
Pre-Transformer NLP Projects
