Analyzing customer purchasing patterns and revenue trends to support data-driven business decisions using SQL, Python, and Tableau.
- Overview
- Business Problem
- Dataset
- Tools & Technologies
- Project Structure
- Data Cleaning & Preparation
- Exploratory Data Analysis (EDA)
- Research Questions & Key Findings
- Dashboard
- Dashboard Preview Images
- How to Run This Project
- Final Recommendations
- Author & Contact
This project analyzes e-commerce transactional data to understand customer purchasing behavior, revenue distribution, and product category performance.
A complete analytics workflow was implemented using SQL for data preparation, Python for validation and statistical checks, and Tableau for interactive visualization dashboards.
The objective of the project is to convert raw operational data into actionable business insights that support marketing strategy, inventory planning, and revenue optimization.
E-commerce businesses generate large volumes of transactional data but often struggle to extract meaningful insights from it.
This project addresses the following business challenges:
- Identify high-value customers contributing significantly to revenue
- Understand category-level product performance
- Analyze country-wise revenue distribution
- Detect seasonal revenue trends and growth patterns
- Transform raw sales data into strategic insights for decision making
The goal is to enable data-driven business strategy and improve operational efficiency.
The dataset used in this project contains transactional e-commerce records.
Source: Kaggle
Records: 700 rows
Variables: 12 columns
Format: CSV
Key variables include:
- Invoice Number
- Unit Price
- Quantity
- Order Value
- Country
- Invoice Date
- Major Category
- Minor Category
- Product Description
These variables enable customer, product, and geographical sales analysis.
The project uses a modern data analytics stack:
- Data cleaning and data validation
- Business queries for analytical insights
- Aggregations using SUM(),COUNT() and GROUP BY
- Pandas
- NumPy
- Matplotlib
- Seaborn
- IQR method for outlier detection
- Interactive dashboard
- KPI visualization
- Category and country insights
- Initial data exploration
- Project version control
Ecommerce_Sales_Analytics_Project
│
├── README.md
├── E-Commerce Customer Report.pdf
├── E-Commerce_Presentation.pdf
│
├── notebooks
│ ├── data_cleaning_and_eda.ipynb
│ └── outlier_detection_iqr.ipynb
│
├── sql
│ └── ecommerce_analysis_queries.sql
│
├── dashboard
│ └── ecommerce_sales_dashboard.twb
│
└── dataset
└── ecommerce_dataset.csv
The dataset was processed through multiple validation and cleaning steps.
Key steps performed:
- Verified total records using SQL COUNT()
- Checked duplicate transactions using GROUP BY and HAVING
- Confirmed no NULL values exist in the dataset
- Ensured correct data types and schema consistency
Created a Revenue column using: Revenue = Unit Price × Quantity
These steps ensured the dataset was clean and ready for reliable analysis.
Exploratory analysis was conducted to understand sales patterns and customer purchasing behavior.
- Identified overall revenue spread across transactions
- Detected high-value purchase ranges
- Evaluated variability in order values
- Outliers were identified using the IQR (Interquartile Range) method to ensure reliable analysis
- A total of 91 outliers were detected in the order value distribution
- Compared revenue across product categories
- Identified Kitchen & Garden as dominant categories
- Evaluated revenue contribution by country
- Identified key international markets
- Identified high-value customers
- Analyzed purchasing patterns across categories
These analyses provided the foundation for business insights and dashboard creation.
Total Revenue: $22.4K
Total Orders: 549
Average Order Value (AOV): $40.77
- Kitchen and Garden categories dominate revenue share
- Sales follow a semi-Pareto pattern, where a few categories contribute most revenue.
- Australia (~$6.3K) is the highest revenue contributor
- France and United Kingdom are also major markets
- Hong Kong shows the highest Average Order Value (~$237)
- Revenue shows seasonal volatility
- Peak months exceed $2.5K
- Some months show MoM growth above 300%
A small set of top-selling products generates a large share of revenue.
These findings highlight revenue concentration patterns and opportunities for business optimization.
The interactive Tableau dashboard provides a business-level overview of the e-commerce dataset.
Dashboard features include:
- Revenue KPIs
- Monthly revenue trends
- Category performance analysis
- Country-wise revenue insights
- Product-level performance
- Customer purchasing behavior analysis
Key metrics visualized:
- Total Revenue
- Average Order Value
- Total Orders
- Total Quantity Sold
git clone https://github.com/yourusername/ecommerce-sales-analysis.git
Place the CSV dataset into the dataset/ folder.
Execute SQL scripts to perform data cleaning and business analysis queries.
Open the notebooks and run:
- Data cleaning
- IQR outlier detection
- Exploratory data analysis
Focus on high-performing categories such as Kitchen and Garden.
Invest more in Australia and European markets.
Develop premium offerings for Hong Kong customers due to high AOV.
Use historical revenue trends to improve inventory planning and demand forecasting.
Focus on high-value customers through loyalty programs and targeted marketing campaigns.
These strategies can significantly improve revenue growth and operational efficiency.
Aditya Raj
B.Tech Computer Science Engineering (3rd Year)
Aspiring Data Analyst
📧 Email: [email protected] 🔗 LinkedIn: www.linkedin.com/in/ aditya-raj-a734b1279 💻 *GitHub:*https://github.com/Adityaraj042003


