Exploratory Data Analysis with visualization. Use supervised machine learning models to predict the customer segmentation. Use the unsupervised learning model K-Means to create new clusters. Create Personas of the new clusters.

isra-st/Customer_Segmentation

Customer_Segmentation

An automobile company has plans to enter new markets with their existing products.

In their existing market, the sales team has classified all customers into 4 segments (A, B, C, D). They then performed segmented outreach and communication for each segment of customers.

My task is to help the manager predict the right segment for each new customer.

Goals of the project

  1. Perform an Exploratory Data Analysis with visualization
  2. Use supervised machine learning models to predict the customer segmentation.
  3. Use the unsupervised learning model K-Means to create new clusters.
  4. Create Personas of the new clusters.

Tools used

  • Pandas
  • Numpy
  • Matplotlib
  • Seaborn
  • Time
  • scikit-learn
    • Supervised learning models for classification:
      • Support Vector Machine
      • Gradient Boosting Classifier
      • Light Gradient Boosting Classifier
      • Ada Boost
      • Cat Boost
      • Decision Tree
      • Random Forest
      • Logistic Regression
      • KNeighbors
      • Naive Bayes Gaussian
    • Unsupervised Learning model for clustering:
      • K-Means

Resources

Customer Segmentation https://www.kaggle.com/vetrirah/customer

Classification models performance

The best-performing models are:

  • Gradient Boosting Classifier:

    • Accuracy: 53.62% - Precision: 52.77% - F1 score: 53% - Recall: 52.49%

      Gradient Boosting Classifier
  • Light Gradient Boosting Classifier:

    • Accuracy: 52% - Precision: 49.86% - F1 score: 49.75% - Recall: 50%

      Gradient Boosting Classifier
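
As an illustration of how the four metrics above can be computed, here is a minimal sketch with scikit-learn. It runs on synthetic four-class data from `make_classification`, not on the project's dataset, so its numbers will not match the ones reported here. Macro averaging for precision, recall, and F1 is an assumption; the README does not say which averaging was used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the customer data: four classes, like segments A-D.
X, y = make_classification(
    n_samples=600,
    n_features=10,
    n_informative=6,
    n_classes=4,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# A small model keeps the example fast; the settings are illustrative only.
model = GradientBoostingClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Assumption: macro averaging, which weights the four segments equally.
scores = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, average="macro", zero_division=0),
    "recall": recall_score(y_test, y_pred, average="macro", zero_division=0),
    "f1": f1_score(y_test, y_pred, average="macro", zero_division=0),
}

for name, value in scores.items():
    print(f"{name}: {value:.2%}")
```

With four balanced classes, a random guess scores about 25% accuracy, which is a useful baseline when reading the figures above.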

K-Means cluster Personas creation

After grouping the data points into four clusters, I came up with the Personas below:

Customer Segmentation Personas
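
One way to go from clusters to personas is to fit K-Means with four clusters and then summarize each cluster by its average feature values. The sketch below shows that idea on synthetic data. The column names `age`, `work_experience`, and `family_size` are hypothetical placeholders, and the scaling step is an assumption about how the features were prepared.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic numeric features; the column names are hypothetical.
X, _ = make_blobs(n_samples=400, centers=4, n_features=3, random_state=0)
customers = pd.DataFrame(X, columns=["age", "work_experience", "family_size"])

# K-Means is distance based, so features are put on a common scale first.
scaled = StandardScaler().fit_transform(customers)

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
customers["cluster"] = kmeans.fit_predict(scaled)

# Per-cluster averages (in original units) plus cluster size:
# a raw starting point for describing each persona.
profile = customers.groupby("cluster").mean()
profile["size"] = customers["cluster"].value_counts().sort_index()

print(profile.round(2))
```

Each row of `profile` describes one cluster; a persona is then a short written interpretation of that row.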

Process

  1. Exploratory Data Analysis

    • Clean the dataset
    • Create visualizations
  2. Feature Engineering

    • Create dummy variables
    • Scaling
    • Feature selection with Sequential Feature Selection (SFS) and Recursive Feature Elimination (RFE)
    • PCA
  3. Modeling for classification. The models with the best performance were selected (Gradient Boosting Classifier and Light Gradient Boosting Classifier).

  4. Hyperparameter tuning of the Gradient Boosting Classifier and Light Gradient Boosting Classifier.

  5. Clustering with K-Means

  6. Create Personas

  7. Create the storytelling
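
Steps 2 to 4 above can be sketched as a single scikit-learn pipeline. This is a minimal illustration under stated assumptions, not the notebook's actual code: the data frame is random, the column names are hypothetical, the labels are random (so the score will sit near chance), only RFE is shown for feature selection (SFS is omitted), and the parameter grid is a small made-up example.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Random stand-in data; column names and categories are hypothetical.
rng = np.random.default_rng(0)
n = 200
raw = pd.DataFrame({
    "age": rng.integers(18, 80, n),
    "family_size": rng.integers(1, 7, n),
    "profession": rng.choice(["artist", "doctor", "engineer"], n),
    "spending": rng.choice(["low", "average", "high"], n),
})
segment = rng.choice(list("ABCD"), n)  # random labels for segments A-D

# Step 2: dummy (one-hot) encoding of the categorical columns.
features = pd.get_dummies(raw, columns=["profession", "spending"], dtype=float)

# Step 2 continued and step 3: scaling, RFE, PCA, then the classifier.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("select", RFE(LogisticRegression(max_iter=1000), n_features_to_select=6)),
    ("pca", PCA(n_components=4)),
    ("model", GradientBoostingClassifier(n_estimators=30, random_state=0)),
])

# Step 4: hyperparameter tuning over an illustrative grid.
param_grid = {
    "model__learning_rate": [0.05, 0.1],
    "model__max_depth": [2, 3],
}
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(features, segment)

print(search.best_params_)
print(f"cross-validated accuracy: {search.best_score_:.2%}")
```

Keeping the preprocessing inside the pipeline means each cross-validation fold fits the scaler, RFE, and PCA on its own training split, which avoids leaking information from the validation split.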

Presentation

To see the presentation, click the picture below.

Customer Segmentation_Github
