Handwritten Digit Recognition with sklearn
This is the day 20 article of Python Advent Calendar 2013 - Adventar.
sklearn, a machine learning library for Python, is handy.
The documentation is thorough, and the library is easy to use even without detailed knowledge of machine learning algorithms, so you can learn by running things.
■ What sklearn can do (partial list)
Task | Module |
---|---|
Datasets | sklearn.datasets |
Clustering | sklearn.cluster |
Cross-validation | sklearn.cross_validation (sklearn.model_selection in scikit-learn 0.18 and later) |
Matrix decomposition (PCA, etc.) | sklearn.decomposition |
Ensemble learning | sklearn.ensemble |
Linear models | sklearn.linear_model |
Model evaluation | sklearn.metrics |
Nearest neighbors | sklearn.neighbors |
SVM | sklearn.svm |
Decision trees | sklearn.tree |
This time I tried image recognition on the handwritten digits from the bundled datasets, using an SVM.
Preparing the handwritten digits
Cross-validation
SVM
Evaluation
Preparing the handwritten digits
First, load the handwritten digits from sklearn's datasets. The dataset contains 1797 images of 8x8 pixels, each labeled with a digit from 0 to 9.
    from sklearn import datasets

    images = datasets.load_digits()
    # input data (features)
    data = images.data
    # target data (labels)
    target = images.target
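To confirm what was just loaded, the shapes and the label range can be checked directly. This is a minimal sketch; the names `n_samples` and `n_features` are mine and do not appear in the article.

```python
from sklearn import datasets

images = datasets.load_digits()
data = images.data      # flattened pixels, one row per image
target = images.target  # digit label for each image

n_samples, n_features = data.shape
print(n_samples, n_features)       # number of images, pixels per image
print(target.min(), target.max())  # smallest and largest label
```

The first line printed should match the 1797 images of 8x8 = 64 pixels described above.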
Each row of images.data is a flat array of 64 numbers, while each entry of images.images is an 8x8 array, so it can be displayed as an image with pylab's imshow.
    from pylab import imshow, cm
    imshow(images.images[0], cmap=cm.gray_r, interpolation='none')

↓ Output (figure: images.images[0] drawn as an 8x8 grayscale image)
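The relationship between the two attributes can be checked without plotting anything. The sketch below assumes NumPy is available (sklearn depends on it); `flattened` and `same` are illustrative names of my own.

```python
import numpy as np
from sklearn import datasets

images = datasets.load_digits()

# Flatten every 8x8 image into a 64-element row
flattened = images.images.reshape(len(images.images), -1)

# Compare the flattened images with images.data element by element
same = np.array_equal(flattened, images.data)
print(flattened.shape, same)
```

If `same` is True, images.data is just images.images with each picture unrolled row by row, which is the form the classifier expects.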
Cross-validation
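Now that the input data and the labels are loaded, prepare the training and test data.

    # the original 2013 code imported sklearn.cross_validation;
    # the function lives in sklearn.model_selection since scikit-learn 0.18
    from sklearn.model_selection import train_test_split

    # prepare the training data and the test data
    X_train, X_test, y_train, y_test = train_test_split(data, target, test_size=0.2, random_state=0)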
20% of the data is held out for evaluation. train_test_split takes care of shuffling and splitting the data for you.
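As a quick check of the 20% hold-out, the sizes of the two parts can be printed, along with the effect of fixing random_state. This sketch assumes scikit-learn 0.18 or later, where train_test_split is imported from sklearn.model_selection; `X_test_again` is a name I introduced for illustration.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

images = datasets.load_digits()

X_train, X_test, y_train, y_test = train_test_split(
    images.data, images.target, test_size=0.2, random_state=0)
print(len(X_train), len(X_test))  # training size, test size

# Splitting again with the same random_state gives the same test set
_, X_test_again, _, _ = train_test_split(
    images.data, images.target, test_size=0.2, random_state=0)
print((X_test == X_test_again).all())
```

Fixing random_state is what makes a result such as the accuracy reported later reproducible from run to run.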
SVM
The classifier is an SVM. Train it on the X_train and y_train created earlier, then predict the labels of X_test.

    from sklearn import svm

    # use an SVM as the classifier
    classifier = svm.SVC(C=1.0, kernel='linear')
    # train
    fitted = classifier.fit(X_train, y_train)
    # predict
    predicted = fitted.predict(X_test)
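One detail worth seeing for yourself: fit() returns the estimator itself, so `fitted` and `classifier` refer to the same object. The sketch below repeats the steps so it runs on its own, again assuming the sklearn.model_selection import of newer scikit-learn releases.

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

images = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    images.data, images.target, test_size=0.2, random_state=0)

classifier = svm.SVC(C=1.0, kernel='linear')
fitted = classifier.fit(X_train, y_train)
print(fitted is classifier)  # whether fit() handed back the same object

predicted = classifier.predict(X_test)
print(predicted.shape)       # one predicted label per test image
print(predicted[:10])        # first few predictions
print(y_test[:10])           # the matching true labels
```

Calling classifier.predict(X_test) directly is therefore equivalent to the fitted.predict(X_test) used in the article.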
Evaluation
Use sklearn.metrics for the evaluation.

    from sklearn import metrics
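confusion_matrix() cross-tabulates the predictions in predicted against the correct labels in y_test. With the argument order used here, rows correspond to predicted labels and columns to true labels.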
    print(metrics.confusion_matrix(predicted, y_test))
    >>
    [[27  0  0  0  0  0  0  0  0  0]
     [ 0 34  0  0  0  0  1  0  1  0]
     [ 0  0 36  0  0  0  0  0  1  0]
     [ 0  0  0 29  0  0  0  0  0  1]
     [ 0  0  0  0 30  0  0  1  0  0]
     [ 0  0  0  0  0 39  0  0  0  1]
     [ 0  0  0  0  0  0 43  0  0  0]
     [ 0  0  0  0  0  0  0 38  0  0]
     [ 0  1  0  0  0  0  0  0 37  0]
     [ 0  0  0  0  0  1  0  0  0 39]]
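The confusion matrix already contains everything needed for the overall accuracy: correct predictions sit on the diagonal, and every test sample is counted exactly once. The sketch below uses plain Python on the numbers copied from the printed output; `matrix`, `correct`, and `total` are names chosen for illustration.

```python
# Confusion matrix copied from the output printed in the article
matrix = [
    [27, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 34, 0, 0, 0, 0, 1, 0, 1, 0],
    [0, 0, 36, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 29, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 30, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 39, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 43, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 38, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 37, 0],
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 39],
]

# Diagonal entries are the samples where prediction and truth agree
correct = sum(matrix[i][i] for i in range(len(matrix)))
# Every test sample appears in exactly one cell
total = sum(sum(row) for row in matrix)

accuracy = correct / total
print(correct, total, accuracy)
```

This gives 352 correct out of 360 test images, which is the same ratio as the accuracy score reported in the article.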
Use accuracy_score() to compute the score.

    metrics.accuracy_score(predicted, y_test)
    >>
    0.97777777777777775
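accuracy_score is documented as accuracy_score(y_true, y_pred), while the call above passes the predictions first. For accuracy this makes no difference, because it is simply the fraction of positions where the two arrays agree. The sketch below illustrates this under the same assumptions as before (newer scikit-learn, NumPy available); the exact score may differ slightly from the article's depending on the library version.

```python
import numpy as np
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

images = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    images.data, images.target, test_size=0.2, random_state=0)
classifier = svm.SVC(C=1.0, kernel='linear').fit(X_train, y_train)
predicted = classifier.predict(X_test)

by_hand = np.mean(predicted == y_test)                   # fraction of matches
documented = metrics.accuracy_score(y_test, predicted)   # (y_true, y_pred)
swapped = metrics.accuracy_score(predicted, y_test)      # order used above
print(by_hand, documented, swapped)
```

The same is not true of confusion_matrix, where swapping the arguments transposes the table.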
Kaggle is running a cats vs. dogs image recognition competition, so maybe I'll give that a try.