E-Commerce RFM Customer Segmentation Analysis
Overview
This project performs a full customer value analysis using the RFM (Recency, Frequency, Monetary) framework on an e-commerce transactional dataset. The goal is to identify high-value customers, understand revenue distribution, and provide insights that support targeted marketing and retention strategies.
Objectives
Clean and prepare the raw transactional data
Compute RFM metrics for every customer
Segment customers using standard RFM scoring
Visualize customer distribution and revenue contribution
Produce strategic recommendations based on analytical findings
Compile results into a clear, professional report
Dataset
The dataset contains order records from 2010–2011, including:
Invoice numbers
Product descriptions
Quantities
Unit prices
Customer IDs
Timestamps
Country information
All calculations of revenue and RFM scoring are based on this dataset.
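As a rough illustration of how revenue can be derived from the fields listed above, the sketch below multiplies quantity by unit price for each order line and sums the result per customer. The column names (InvoiceNo, CustomerID, Quantity, UnitPrice) and the sample rows are assumptions made for this example, not the project's actual schema or data.

```python
import pandas as pd

# Hand-made sample rows; the column names are assumed, not taken from the project.
orders = pd.DataFrame({
    "InvoiceNo": ["A1", "A1", "A2"],
    "CustomerID": [101, 101, 102],
    "Quantity": [2, 1, 5],
    "UnitPrice": [3.50, 10.00, 1.20],
})

# Revenue for one order line is quantity times unit price.
orders["Revenue"] = orders["Quantity"] * orders["UnitPrice"]

# Total revenue per customer, which later feeds the Monetary metric.
revenue_per_customer = orders.groupby("CustomerID")["Revenue"].sum()
print(revenue_per_customer)
```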
Methodology
RFM scoring was applied by ranking each customer on:
Recency: How recently they purchased
Frequency: How often they purchased
Monetary: How much they spent
Scores range from 1 to 5 for each category. Segment classification follows common combinations such as Champions, Loyal Customers, Potential Loyalists, At-Risk, Hibernating, and Low-Value groups.
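One possible way to turn the three measures into 1 to 5 scores is quintile binning, sketched below. This is an assumption about the scoring mechanics rather than a description of the project's code: the function names compute_rfm and score_rfm, the column names, the snapshot date, and the tie-breaking choice are all hypothetical.

```python
import pandas as pd

def compute_rfm(orders: pd.DataFrame, snapshot: pd.Timestamp) -> pd.DataFrame:
    """Aggregate order lines into one Recency/Frequency/Monetary row per customer."""
    rfm = orders.groupby("CustomerID").agg(
        last_purchase=("InvoiceDate", "max"),   # most recent order date
        Frequency=("InvoiceNo", "nunique"),     # number of distinct invoices
        Monetary=("Revenue", "sum"),            # total spend
    )
    # Recency is the number of days between the snapshot date and the last purchase.
    rfm["Recency"] = (snapshot - rfm["last_purchase"]).dt.days
    return rfm[["Recency", "Frequency", "Monetary"]]

def score_rfm(rfm: pd.DataFrame) -> pd.DataFrame:
    """Add R, F and M scores from 1 to 5 using quintiles (an assumed scheme)."""
    scored = rfm.copy()
    # A low Recency value is good, so its labels run from 5 down to 1.
    plan = {
        "Recency": ("R", [5, 4, 3, 2, 1]),
        "Frequency": ("F", [1, 2, 3, 4, 5]),
        "Monetary": ("M", [1, 2, 3, 4, 5]),
    }
    for column, (name, labels) in plan.items():
        # Ranking with method="first" breaks ties so the quintile edges stay distinct.
        ranks = scored[column].rank(method="first")
        scored[name] = pd.qcut(ranks, q=5, labels=labels).astype(int)
    return scored

# Hand-made sample; the snapshot date is an arbitrary choice for the example.
orders = pd.DataFrame({
    "CustomerID": [1, 1, 2],
    "InvoiceNo": ["A1", "A2", "B1"],
    "InvoiceDate": pd.to_datetime(["2011-12-01", "2011-12-08", "2011-11-09"]),
    "Revenue": [20.0, 30.0, 15.0],
})
rfm = compute_rfm(orders, pd.Timestamp("2011-12-10"))
print(rfm)
```

Calling score_rfm on a table covering the full customer base would then attach the R, F and M columns used for segment classification.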
Key Analysis Steps
Data cleaning and handling missing values
RFM metric computation
Segment assignment using score combinations
Revenue and customer distribution analysis
Seasonal and revenue-trend visualisation
Generation of summary insights and recommendations
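The cleaning and segment-assignment steps above might look roughly like the following sketch. Both functions are hypothetical: the cleaning rules (dropping rows without a customer ID, exact duplicates, and non-positive quantities or prices) are common choices assumed here, and the score thresholds in assign_segment are illustrative only, since the exact score combinations used in the project are not listed in this README.

```python
import pandas as pd

def clean_orders(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply assumed cleaning rules and add a Revenue column."""
    # Rows without a customer cannot be scored; exact duplicates are dropped.
    df = raw.dropna(subset=["CustomerID"]).drop_duplicates()
    # Assumption: non-positive quantities or prices are returns or data errors.
    df = df[(df["Quantity"] > 0) & (df["UnitPrice"] > 0)].copy()
    df["InvoiceDate"] = pd.to_datetime(df["InvoiceDate"])
    df["Revenue"] = df["Quantity"] * df["UnitPrice"]
    return df

def assign_segment(r: int, f: int, m: int) -> str:
    """Map R, F, M scores (1 to 5) to a segment name using illustrative thresholds."""
    if r >= 4 and f >= 4 and m >= 4:
        return "Champions"
    if r >= 3 and f >= 3:
        return "Loyal Customers"
    if r >= 4:
        return "Potential Loyalists"   # recent buyers who do not yet buy often
    if f >= 3:
        return "At-Risk"               # used to buy often, but not recently
    if r <= 2 and m >= 3:
        return "Hibernating"           # lapsed customers with meaningful past spend
    return "Low-Value"

print(assign_segment(5, 5, 5))
print(assign_segment(1, 4, 4))
```

Given a scored table with R, F and M columns, the mapping could be applied row by row, for example with scored.apply(lambda row: assign_segment(row["R"], row["F"], row["M"]), axis=1).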
Insights Summary
Champions (about 25 percent of customers) generate approximately 66.5 percent of total revenue.
Champions and Loyal Customers together contribute over 80 percent of revenue.
Low-value segments make up most of the customer base but contribute less than 20 percent of revenue.
Clear seasonality is visible, with strong peaks during November–December.
These findings highlight the importance of retention strategies for high-value customers and targeted re-engagement efforts for at-risk segments.
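Figures like the revenue shares and the seasonal peaks above can be derived with two short aggregations, sketched here. The numbers in this example are toy values chosen for readability and are not the project's results; the column names (Segment, Monetary, InvoiceDate, Revenue) are assumptions.

```python
import pandas as pd

# Toy per-customer table; values are invented for illustration only.
customers = pd.DataFrame({
    "Segment": ["Champions", "Champions", "Loyal Customers", "Low-Value"],
    "Monetary": [400.0, 200.0, 250.0, 150.0],
})

# Share of total revenue contributed by each segment, in percent.
segment_revenue = customers.groupby("Segment")["Monetary"].sum()
revenue_share = segment_revenue / customers["Monetary"].sum() * 100

# Toy order lines for the seasonality view.
orders = pd.DataFrame({
    "InvoiceDate": pd.to_datetime(
        ["2011-10-03", "2011-11-15", "2011-11-28", "2011-12-02"]
    ),
    "Revenue": [100.0, 250.0, 150.0, 300.0],
})

# Monthly revenue totals; peaks in this series indicate seasonal demand.
monthly_revenue = orders.groupby(orders["InvoiceDate"].dt.to_period("M"))["Revenue"].sum()

print(revenue_share.round(1))
print(monthly_revenue)
```

Either series can be passed to a plotting call such as revenue_share.plot(kind="bar") to produce a chart.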
Contents
data/ – raw and cleaned datasets
scripts/ – analysis and segmentation code
charts/ – visualisations generated during analysis
report/ – final written report and PDF output
README.md – project documentation
Tools and Technologies
Python
Pandas
Matplotlib
Seaborn
Jupyter Notebook or script-based analysis
Git and GitHub for version control
Author
Rajab Cheruiyot Bett
Analysis and reporting completed in 2025.