Resume Parser & Job Matching System

An AI-powered serverless application that automatically extracts information from resumes and matches candidates with relevant job opportunities using AWS services.

🎯 Overview

This system automates the resume screening process by:

  • Extracting text from PDF resumes using AWS Textract (99% accuracy)
  • Parsing skills, experience, and contact information
  • Matching candidates with jobs based on skill overlap
  • Providing a user-friendly dashboard for recruiters

Processing Time: 8-10 seconds end-to-end | Cost: ~$0.01 per resume


✨ Features

  • 📄 Automated Text Extraction - AWS Textract extracts text with 99% confidence
  • 🧠 Intelligent Skill Parsing - Identifies skills, experience, and contact info
  • 🎯 Smart Job Matching - Calculates match scores with skill gap analysis
  • 📊 Recruiter Dashboard - Clean UI for viewing candidates and matches
  • Real-time Processing - Event-driven architecture with auto-triggers
  • 🔄 Fully Automated - No manual intervention required
  • 📈 Scalable - Serverless architecture scales automatically with concurrent uploads, subject to AWS account concurrency quotas

🏗️ Architecture

User Upload (PDF)
    ↓
API Gateway (/upload)
    ↓
Upload Handler Lambda → S3 (resumes/raw/)
    ↓ [S3 Event Trigger]
Textract Processor Lambda → S3 (resumes/processed/)
    ↓ [S3 Event Trigger]
Skill Parser Lambda → DynamoDB (CandidateProfiles)
    ↓ [DynamoDB Stream Trigger]
Job Matcher Lambda → DynamoDB (Matches)
    ↓
Dashboard (API Gateway /candidates & /matches)
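
Each S3-triggered stage in the flow above receives an event that names the bucket and object key that changed. The following is a minimal sketch of how a handler might read that event and derive the output key for the next stage. The resumes/raw/ and resumes/processed/ prefixes come from the diagram; the function names and the .json output suffix are assumptions for illustration, not the project's actual code.

```python
from urllib.parse import unquote_plus

RAW_PREFIX = "resumes/raw/"              # prefix from the architecture diagram
PROCESSED_PREFIX = "resumes/processed/"  # prefix from the architecture diagram


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in event payloads (spaces arrive as '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


def processed_key_for(raw_key):
    """Map a raw upload key to a hypothetical processed-output key."""
    if not raw_key.startswith(RAW_PREFIX):
        raise ValueError(f"unexpected key prefix: {raw_key}")
    name = raw_key[len(RAW_PREFIX):]
    stem = name.rsplit(".", 1)[0]  # drop the file extension, if any
    return f"{PROCESSED_PREFIX}{stem}.json"  # .json suffix is an assumption


def lambda_handler(event, context):
    """Hypothetical handler: list the keys it would write for each upload."""
    return [processed_key_for(key) for _, key in extract_s3_objects(event)]
```

A real handler would call Textract and write the result to S3 at the derived key; those calls are omitted here so the sketch runs without AWS credentials.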

🛠️ Tech Stack

Frontend:

  • HTML5, CSS3, JavaScript (Vanilla)
  • Responsive design

Backend:

  • AWS Lambda (Python 3.9)
  • AWS API Gateway (REST API)
  • AWS S3 (Object Storage)
  • AWS DynamoDB (NoSQL Database)
  • AWS Textract (AI Text Extraction)

DevOps:

  • AWS IAM (Access Management)
  • AWS CloudWatch (Logging & Monitoring)
  • Git/GitHub (Version Control)

🚀 Quick Start

Prerequisites

  • AWS Account with appropriate permissions
  • AWS CLI configured
  • Python 3.9+
  • Git

1. Clone the Repository

git clone https://github.com/Anshuman-git-code/Resume-Parser-Skill-Matcher.git
cd Resume-Parser-Skill-Matcher

2. Deploy AWS Infrastructure

Create S3 Bucket:

bash scripts/setup-s3.sh

Deploy Lambda Functions:

bash scripts/deploy-lambdas.sh

Setup API Gateway:

bash scripts/setup-api-gateway.sh

3. Open the Dashboard

open frontend/dashboard/index.html

Or start a local server:

cd frontend/dashboard
python3 -m http.server 8000
# Open http://localhost:8000

📖 Usage

Upload a Resume

  1. Open the dashboard
  2. Click "Choose File" and select a PDF resume
  3. Wait 8-10 seconds for processing
  4. View extracted text in the popup modal

View Candidates

  1. Click the "Candidates" tab
  2. Browse all processed candidates
  3. Search by name or skills
  4. Click "View Matches" for any candidate

View Job Matches

  1. Click the "Matches" tab
  2. Select a candidate from the dropdown
  3. See ranked job matches with scores
  4. Review matched and missing skills
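
The ranked scores and the matched/missing skill lists shown in this tab can be illustrated with a simple set-overlap calculation. This is a minimal sketch, assuming the score is the percentage of a job's required skills that the candidate has; the README only says matching is based on skill overlap, so the exact formula, the function names, and the `title` / `required_skills` fields are assumptions.

```python
def score_match(candidate_skills, job_skills):
    """Score one candidate against one job by skill overlap (assumed formula)."""
    candidate = {s.strip().lower() for s in candidate_skills}
    required = {s.strip().lower() for s in job_skills}
    matched = sorted(candidate & required)
    missing = sorted(required - candidate)  # the "skill gap"
    # Percentage of required skills covered; 0.0 when a job lists no skills.
    score = round(100 * len(matched) / len(required), 1) if required else 0.0
    return {"score": score, "matched_skills": matched, "missing_skills": missing}


def rank_jobs(candidate_skills, jobs):
    """Return match results for every job, highest score first."""
    results = []
    for job in jobs:
        result = score_match(candidate_skills, job["required_skills"])
        result["job_title"] = job["title"]
        results.append(result)
    return sorted(results, key=lambda r: r["score"], reverse=True)
```

For example, a candidate with the skills docker, kubernetes, python, and aws would cover three of the four skills in a hypothetical job requiring Docker, Kubernetes, AWS, and Terraform, with terraform reported as the missing skill.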

📁 Project Structure

Resume-Parser-Skill-Matcher/
├── backend/
│   ├── lambda-functions/
│   │   ├── upload-handler/          # Handles resume uploads
│   │   ├── textract-processor/      # Extracts text from PDFs
│   │   ├── skill-parser/            # Parses skills and experience
│   │   ├── job-matcher/             # Matches candidates with jobs
│   │   ├── get-candidates/          # API to fetch candidates
│   │   └── get-matches/             # API to fetch job matches
│   ├── tests/                       # Test files
│   └── utils/                       # Utility functions
├── frontend/
│   ├── dashboard/
│   │   ├── index.html               # Main dashboard
│   │   ├── app.js                   # Business logic
│   │   ├── styles.css               # Styling
│   │   └── check-status.html        # Status checker
│   └── assets/                      # Static assets
├── scripts/
│   ├── setup-s3.sh                  # S3 bucket setup
│   ├── deploy-lambdas.sh            # Lambda deployment
│   ├── setup-api-gateway.sh         # API Gateway setup
│   ├── trigger-job-matching.sh      # Manual match trigger
│   ├── validate-setup.sh            # Validate AWS setup
│   ├── test-upload.py               # Test upload script
│   └── create-dateutil-layer.sh     # Create Lambda layer
├── docs/
│   └── handoff-notes/               # Documentation
├── PROJECT_REPORT.md                # Comprehensive project report
├── requirements.txt                 # Python dependencies
└── README.md                        # This file

🔧 Configuration

API Endpoints

Base URL: https://n8ujlpobn6.execute-api.ap-south-1.amazonaws.com/prod

  • POST /upload - Upload resume
  • GET /candidates - Fetch all candidates
  • GET /matches?candidate_id=xxx - Fetch job matches
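
The two GET endpoints can also be called from a script. The sketch below uses only the Python standard library; `BASE_URL` is a placeholder to replace with your own deployment, the helper names are illustrative, and the shape of the JSON responses is not documented here, so the code returns whatever the API sends back without interpreting it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder: replace with your own API Gateway stage URL.
BASE_URL = "https://YOUR_API_ID.execute-api.YOUR_REGION.amazonaws.com/prod"


def build_url(path, **params):
    """Join the base URL and path, appending URL-encoded query parameters."""
    url = f"{BASE_URL.rstrip('/')}/{path.lstrip('/')}"
    if params:
        url += "?" + urlencode(params)
    return url


def get_json(url, timeout=10):
    """Send a GET request and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def fetch_candidates():
    """GET /candidates"""
    return get_json(build_url("/candidates"))


def fetch_matches(candidate_id):
    """GET /matches?candidate_id=..."""
    return get_json(build_url("/matches", candidate_id=candidate_id))
```

Calling `fetch_matches("818cee79-8493-4f9a-ae57-b9af695e82b2")` would request matches for the sample candidate shown later in this README, provided that record exists in your deployment.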

Environment Variables

Update these constants in frontend/dashboard/app.js to point at your own API Gateway deployment:

const API_ENDPOINT = 'YOUR_API_GATEWAY_URL/prod/upload';
const CANDIDATES_API = 'YOUR_API_GATEWAY_URL/prod/candidates';
const MATCHES_API = 'YOUR_API_GATEWAY_URL/prod/matches';

📊 Performance Metrics

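Metric                  | Value
------------------------|------------------------------------------------------------
Text Extraction Time    | 3-4 seconds
Skill Parsing Time      | 1-2 seconds
Job Matching Time       | 1-2 seconds
Total Processing Time   | 8-10 seconds (end-to-end)
Extraction Accuracy     | 99%
Cost per Resume         | ~$0.01
Concurrent Uploads      | Scales automatically (subject to AWS account concurrency quotas)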

🧪 Testing

Run Validation Script

bash scripts/validate-setup.sh

Test Upload

python3 scripts/test-upload.py path/to/resume.pdf

Manual Testing

  1. Upload a test resume through the dashboard
  2. Check S3 bucket for uploaded files
  3. Verify DynamoDB tables have data
  4. Check CloudWatch logs for any errors

🎓 Sample Data

The system comes with sample data:

  • 9 candidates processed
  • 10+ jobs in database
  • 100+ matches generated

Sample candidate:

{
  "resume_id": "818cee79-8493-4f9a-ae57-b9af695e82b2",
  "name": "TANNU NAGAR",
  "email": "[email protected]",
  "skills": ["docker", "kubernetes", "python", "aws"],
  "total_experience_years": 5.2
}

🚧 Current Limitations

  • Only supports PDF files (no Word documents)
  • Basic pattern matching for skill extraction
  • Simple skill overlap matching algorithm
  • No user authentication
  • No duplicate resume detection
  • Limited to predefined skill list

See PROJECT_REPORT.md for detailed limitations.
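
To make the "pattern matching against a predefined skill list" limitation concrete, here is a minimal sketch of that style of extraction. The skill list and function name are hypothetical; the project's actual list and patterns are not shown in this README.

```python
import re

# Hypothetical skill list; the project's real list is not shown in this README.
KNOWN_SKILLS = ["python", "aws", "docker", "kubernetes", "java", "c++", "node.js"]


def extract_skills(text, known_skills=KNOWN_SKILLS):
    """Return the known skills that appear as whole tokens in the text."""
    lowered = text.lower()
    found = []
    for skill in known_skills:
        # Lookarounds are used instead of \b so that skills ending in symbols
        # (such as "c++") still match, and "java" does not match "javascript".
        pattern = r"(?<![a-z0-9])" + re.escape(skill.lower()) + r"(?![a-z0-9+#])"
        if re.search(pattern, lowered):
            found.append(skill)
    return found
```

Any skill that is not in the list (for example "JavaScript" with the list above) is silently ignored, which is why a fixed list limits recall.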


🔮 Future Enhancements

Phase 1 (1-2 months)

  • AWS Comprehend integration for better NLP
  • Support for Word documents and images
  • User authentication with AWS Cognito
  • Enhanced matching algorithm

Phase 2 (3-6 months)

  • Recruiter dashboard with advanced features
  • Candidate portal
  • Email/SMS notifications
  • Analytics and reporting

Phase 3 (6-12 months)

  • Machine learning models
  • Multi-language support
  • Mobile application
  • ATS integrations

See PROJECT_REPORT.md for complete roadmap.


💰 Cost Analysis

Manual Process

  • Time per resume: 5-10 minutes
  • Cost per resume: $2.50-8.33

Automated System

  • Processing time: 8-10 seconds
  • Cost per resume: ~$0.01

Savings: ~99% cost reduction | roughly $30,000-100,000 annually at 1,000 resumes/month
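
The savings figure can be reproduced from the per-resume costs listed above. The sketch below is plain arithmetic on those numbers; the constant and function names are illustrative. As an aside, the manual cost range is consistent with 5 minutes at $30/hour up to 10 minutes at $50/hour, but those hourly rates are an inference and are not stated in this README.

```python
RESUMES_PER_MONTH = 1000
AUTOMATED_COST = 0.01             # dollars per resume, from this README
MANUAL_COST_RANGE = (2.50, 8.33)  # dollars per resume, from this README


def annual_savings(manual_cost, automated_cost=AUTOMATED_COST,
                   resumes_per_month=RESUMES_PER_MONTH):
    """Dollars saved per year by replacing manual screening."""
    resumes_per_year = resumes_per_month * 12
    return (manual_cost - automated_cost) * resumes_per_year


def cost_reduction_percent(manual_cost, automated_cost=AUTOMATED_COST):
    """Relative cost reduction per resume, as a percentage."""
    return 100 * (1 - automated_cost / manual_cost)


low, high = (annual_savings(cost) for cost in MANUAL_COST_RANGE)
print(f"Annual savings: ${low:,.0f} to ${high:,.0f}")
```

With these inputs the range works out to about $29,880 to $99,840 per year, which rounds to the $30,000-100,000 quoted above, and the per-resume reduction is 99.6% at the low end of the manual cost range.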


👥 Team

  • Anshuman Mohapatra - Team Lead, Backend Pipeline & APIs
  • Shivam - Skill Parsing & Data Extraction
  • Siddhant - Job Matching Algorithm
  • Pavan - Frontend Dashboard UI

Mentor: Kabir


📝 Documentation

  • PROJECT_REPORT.md - Comprehensive project report
  • docs/handoff-notes/ - Handoff notes


🐛 Troubleshooting

Dashboard not loading?

# Hard refresh browser
Cmd+Shift+R (Mac) or Ctrl+Shift+R (Windows/Linux)

Upload failing?

  • Check internet connection
  • Verify AWS credentials
  • Check CloudWatch logs
  • Ensure S3 bucket exists

No candidates showing?

  • Refresh the page
  • Check DynamoDB tables
  • Verify API Gateway endpoints
  • Check browser console for errors

📄 License

This project is part of an academic assignment.


🙏 Acknowledgments

  • AWS for providing cloud infrastructure
  • AWS Textract for AI-powered text extraction
  • Our mentor Kabir for guidance and support

📞 Contact

For questions or support, please contact the team members through GitHub.


🌟 Show Your Support

If you found this project helpful, please give it a ⭐️!


Built with ❤️ using AWS Serverless Architecture
