Commit 6add207

README update
1 parent 174fea1 commit 6add207

10 files changed: +1 -1053 lines changed

README.md

Lines changed: 1 addition & 5 deletions
@@ -1,7 +1,7 @@
 # Python Data Science
 
 ## Description
-A collection of data science scripts for data analysis in Python.
+A collection of data science scripts for data analysis in Python. Please also see my related repository [Python Machine Learning](https://github.com/GeorgeSeif/Python-Machine-Learning) which contains many implementations of Machine Learning algorithms including _regression_, _classification_, and _clustering_. The algorithms are implemented in two ways: from scratch in Python and using Scikit Learn functions.
 
 Python libraries used:
 - Numpy
@@ -26,10 +26,6 @@ To install all of the libraries, run the commands in the "install.txt" file. The
 - **explore_wine_data.py:** Exploratory data analysis of the wine dataset from sklearn using visualisations. Includes data analysis using histogram, scatterplot, bee swarm plot, and cumulative distribution function.
 - **statistics_iris.py:** Compute various statistics of the iris dataset features such as histogram, min, max, median, mean, and variance.
 - **covariance_boston.py:** Compute the covariance matrix of the Boston Housing dataset. These matrices can sometimes give faster insight into which variables are related rather than creating scatter plots.
-- **linear_regression.py:** Linear regression on the Boston Housing dataset. Includes data shuffling and normalization. Includes an implementation from scratch and Sklearn.
-- **logistic_regression.py:** Logistic regression on the wine dataset. Includes data shuffling and normalization. Includes an implementation from scratch and Sklearn.
-- **pca_logistic_regression.py:** Logistic regression with Principal Component Analysis (PCA) for dimensionality reduction on the wine dataset. Includes data shuffling and normalization. Includes an implementation from scratch and Sklearn.
-- **kmeans.py, kmediods.py, k_nearnest_neighbor.py, mean_shift.py, dbscan.py:** Different clustering methods applied to the iris dataset. Includes data shuffling and normalization. Includes an implementation from scratch and Sklearn.
 
 ## Information
 
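
Reviewer note: the README entries that survive this commit describe per-feature statistics (min, max, median, mean, variance) and a covariance matrix. As a rough illustration of what such scripts compute, here is a minimal standard-library sketch. It is not the repository's code: the function names `summarize` and `covariance_matrix` are hypothetical, and the two toy columns are made-up values rather than rows from the iris or Boston Housing datasets.

```python
import statistics


def summarize(values):
    """Return min, max, median, mean, and sample variance of one feature column."""
    return {
        "min": min(values),
        "max": max(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),  # sample variance (n - 1 denominator)
    }


def covariance_matrix(columns):
    """Sample covariance matrix for a list of equal-length feature columns."""
    n = len(columns[0])
    means = [statistics.mean(col) for col in columns]
    return [
        [
            sum((a - mean_a) * (b - mean_b) for a, b in zip(col_a, col_b)) / (n - 1)
            for col_b, mean_b in zip(columns, means)
        ]
        for col_a, mean_a in zip(columns, means)
    ]


# Toy data (assumed values for illustration only): y is exactly 2 * x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]

stats_x = summarize(x)
cov = covariance_matrix([x, y])

print(stats_x)
for row in cov:
    print(row)
```

The diagonal of the matrix holds each column's variance, and a large off-diagonal entry relative to the diagonals (as with these perfectly correlated toy columns) is the kind of quick signal the README contrasts with drawing scatter plots.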

dbscan.py

Lines changed: 0 additions & 150 deletions
This file was deleted.

helpers.pyc

-3.1 KB
Binary file not shown.

k_nearest_neighbor.py

Lines changed: 0 additions & 114 deletions
This file was deleted.
