This fork aims to improve the original starter project code for students taking Udacity's Intro to Machine Learning, updating it to Python 3.8 with conda for environment management and Jupyter notebooks.
- Lesson 2: Naive Bayes
- Lesson 3: SVM
- Lesson 4: Decision Trees
- Lesson 5: Choose Your own Algorithm
- Lesson 6: Datasets and Questions
- Lesson 7: Regressions
- Lesson 8: Outliers
- Lesson 9: Clustering
- Lesson 10: Feature Scaling
- Lesson 11: Text Learning
- Lesson 12: Feature Selection
- Lesson 13: PCA
- Lesson 14: Validation
- Lesson 15: Evaluation Metrics
- Lesson 17: Final Project
This repo uses a newer version of scikit-learn. To get the results expected by the course grader, use `SVC` with `gamma='auto'`, since the default value of `gamma` has changed (see the `sklearn.svm.SVC` docs):
Changed in version 0.22: The default value of gamma changed from 'auto' to 'scale'.
For example:
clf = SVC(kernel='linear', gamma='auto')
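To illustrate why the two settings can give different results, here is a minimal sketch of the formulas given in the scikit-learn docs: `gamma='auto'` uses `1 / n_features`, while `gamma='scale'` uses `1 / (n_features * X.var())`. The helper names `gamma_auto` and `gamma_scale` and the toy data are assumptions made for illustration; they are not part of scikit-learn or of this repo.

```python
from statistics import pvariance


def gamma_auto(X):
    """Value used for gamma='auto': 1 / n_features."""
    n_features = len(X[0])
    return 1.0 / n_features


def gamma_scale(X):
    """Value used for gamma='scale': 1 / (n_features * X.var()).

    X.var() in NumPy is the population variance over all entries,
    so the rows are flattened before computing it.
    """
    n_features = len(X[0])
    flat = [value for row in X for value in row]
    return 1.0 / (n_features * pvariance(flat))


# Hypothetical toy data: 3 samples, 2 features on different scales.
X = [[0.0, 10.0], [2.0, 20.0], [4.0, 30.0]]

print("auto: ", gamma_auto(X))
print("scale:", gamma_scale(X))
```

On unscaled features the two values can differ by orders of magnitude. Per the scikit-learn docs, `gamma` is the kernel coefficient for the `'rbf'`, `'poly'` and `'sigmoid'` kernels, so the choice matters for those.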
To get results that the grader accepts, set `sort_keys='../utils/python2_lesson06_keys.pkl'` when calling the `feature_format` function:
...
data = feature_format(dictionary, features_list, remove_any_zeroes=True, sort_keys='../utils/python2_lesson06_keys.pkl')
...
[...] This will open up a file in the tools folder with the Python 2 key order.
See this for a detailed explanation.
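The idea behind passing a key file can be sketched as follows. This is only an illustration, assuming that `feature_format` visits the dictionary keys in an order selected by `sort_keys`; the helper `ordered_keys`, the sample dictionary and the temporary pickle file are all hypothetical and are not the repo's actual code.

```python
import os
import pickle
import tempfile


def ordered_keys(dictionary, sort_keys=False):
    """Hypothetical helper: pick the order in which keys are visited."""
    if isinstance(sort_keys, str):
        # A path was given: load a pickled list of keys in a saved order.
        with open(sort_keys, "rb") as f:
            return pickle.load(f)
    if sort_keys:
        # True: alphabetical order.
        return sorted(dictionary.keys())
    # False: whatever order the dictionary itself yields.
    return list(dictionary.keys())


data = {"carol": 3, "alice": 1, "bob": 2}
saved_order = ["bob", "carol", "alice"]  # stand-in for a Python 2 key order

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "keys.pkl")
    with open(path, "wb") as f:
        pickle.dump(saved_order, f)
    from_file = ordered_keys(data, sort_keys=path)

print(from_file)                            # order read from the pickle
print(ordered_keys(data, sort_keys=True))   # alphabetical order
print(ordered_keys(data))                   # insertion order
```

Because rows of the resulting data follow the key order, a different order can change which samples land in a train/test split, which is one plausible reason a grader built on Python 2 expects the saved order.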
$ git clone https://github.com/trsvchn/ud120-projects-py3-jupyter.git
$ cd ud120-projects-py3-jupyter
$ conda env create -f environment.yml
$ conda activate ud120
$ python ./utils/starter.py