An automobile company has plans to enter new markets with their existing products.
In its existing market, the sales team has classified all customers into four segments (A, B, C, D) and then performed segmented outreach and communication for each segment.
My task is to help the manager predict the right segment for each new customer.
- Perform an Exploratory Data Analysis with visualization
- Use supervised machine learning models to predict the customer segment.
- Use the unsupervised learning model K-Means to create new clusters.
- Create personas for the new clusters.
- Pandas
- Numpy
- Matplotlib
- Seaborn
- Time
- scikit-learn
- Supervised learning models for classification:
  - Support Vector Machine
  - Gradient Boosting Classifier
  - Light Gradient Boosting Classifier
  - AdaBoost
  - CatBoost
  - Decision Tree
  - Random Forest
  - Logistic Regression
  - K-Nearest Neighbors
  - Gaussian Naive Bayes
- Unsupervised learning model for clustering:
  - K-Means
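As a rough illustration of how several of the classifiers listed above might be compared, the sketch below scores three of them with cross-validation. It is not the project's actual code: the data is synthetic (`make_classification` with four classes standing in for segments A to D), the helper `compare_models` is a hypothetical name, and only scikit-learn models are used so the example stays self-contained.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer data: 4 classes mimic segments A-D.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# A small subset of the models listed above (assumed default-ish settings).
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}


def compare_models(models, X, y, cv=3):
    """Return the mean cross-validated accuracy for each model name."""
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in models.items()
    }


scores = compare_models(models, X, y)
# Print models from best to worst mean accuracy.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

The same dictionary could be extended with the other listed models; scores on synthetic data say nothing about how they rank on the real dataset.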
Customer Segmentation https://www.kaggle.com/vetrirah/customer
The best-performing models are:
- Gradient Boosting Classifier:
- Light Gradient Boosting Classifier:
After clustering the data points into four clusters, I came up with the following personas:
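One possible way to get from four K-Means clusters to persona descriptions is to profile each cluster by its average feature values. The sketch below assumes this approach; the column names (`age`, `work_experience`, `family_size`) and the random data are hypothetical placeholders, not the project's dataset or results.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical numeric features; the real dataset's columns may differ.
customers = pd.DataFrame({
    "age": rng.integers(18, 80, size=200),
    "work_experience": rng.integers(0, 15, size=200),
    "family_size": rng.integers(1, 9, size=200),
})

# Scale first so no single feature dominates the distance computation.
scaled = StandardScaler().fit_transform(customers)

# Four clusters, matching the number used in the project.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
customers["cluster"] = kmeans.fit_predict(scaled)

# Per-cluster means in the original units give a starting point for personas.
profile = customers.groupby("cluster").mean().round(1)
profile["n_customers"] = customers["cluster"].value_counts().sort_index()
print(profile)
```

Each row of `profile` can then be read as a draft persona, for example "older customers with large families", and refined with the categorical features.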
- Exploratory Data Analysis
  - Clean the dataset
  - Create visualizations
- Feature Engineering
  - Create dummy variables
  - Scaling
  - Feature selection with SFS (Sequential Feature Selection) and RFE (Recursive Feature Elimination)
  - PCA
- Modeling for classification: select the models with the best performance (Gradient Boosting Classifier and Light Gradient Boosting Classifier)
- Hyperparameter tuning of the Gradient Boosting Classifier and Light Gradient Boosting Classifier
- Clustering with K-Means
- Create personas
- Create the storytelling presentation
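The feature engineering and tuning steps above can be sketched as a single scikit-learn pipeline. This is a minimal example under stated assumptions, not the project's implementation: the toy data and column names are hypothetical, the labels are random (so accuracy will be near chance), only RFE is shown for feature selection, RFE and PCA are chained even though the project may have tried them separately, and the parameter grid is an arbitrary small example.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n = 300

# Hypothetical toy data; column names are illustrative only.
df = pd.DataFrame({
    "age": rng.integers(18, 80, size=n),
    "family_size": rng.integers(1, 9, size=n),
    "spending_score": rng.choice(["Low", "Average", "High"], size=n),
    "profession": rng.choice(["Artist", "Doctor", "Engineer"], size=n),
})
y = rng.choice(["A", "B", "C", "D"], size=n)  # random segment labels

# Step 1: create dummy variables for the categorical columns.
X = pd.get_dummies(
    df, columns=["spending_score", "profession"], drop_first=True, dtype=float
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Steps 2-4: scaling, feature selection (RFE), PCA, then the classifier.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("rfe", RFE(LogisticRegression(max_iter=1000), n_features_to_select=4)),
    ("pca", PCA(n_components=3)),
    ("clf", GradientBoostingClassifier(n_estimators=30, random_state=42)),
])

# Hyperparameter tuning over a deliberately small, assumed grid.
param_grid = {"clf__learning_rate": [0.05, 0.1], "clf__max_depth": [2, 3]}
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```

Keeping the preprocessing inside the pipeline means the scaler, selector, and PCA are refit on each cross-validation fold, which avoids leaking information from the validation split into the tuning.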
To see the presentation, click on the picture below.