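This post introduces "feature extraction using K-nearest neighbors," a technique that was also used in a Kaggle winning solution.
An R implementation is publicly available, but I could not find a Python one, so I have published my own Python implementation.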
Algorithm overview
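With k nearest neighbors and c classes to classify, the algorithm generates k × c features. As described below, the generated features are computed from the distances between an observation and its nearest neighbors within each class.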
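- For a given class, take the distance to the 1st nearest neighbor among the training data belonging to that class as the 1st feature
- For the same class, take the sum of the distances to the 2 nearest neighbors among the training data belonging to that class as the 2nd feature
- For the same class, take the sum of the distances to the 3 nearest neighbors among the training data belonging to that class as the 3rd feature
- Repeat in the same way up to the k-th nearest neighbor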
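The steps above can be sketched for a single observation as follows. This is a minimal, dependency-free illustration and not the code in knnFeat: the helper name `knn_class_features`, the use of Euclidean distance, and the ordering of the output (all features of one class, then the next class) are assumptions made for this example.

```python
import math
from itertools import accumulate


def knn_class_features(point, train_X, train_y, k):
    """Return the class-wise neighbor-distance features for one observation.

    Hypothetical helper for illustration; Euclidean distance is assumed.
    """
    features = []
    for label in sorted(set(train_y)):
        # Sorted distances from the observation to every training point of this class
        dists = sorted(
            math.dist(point, x) for x, y in zip(train_X, train_y) if y == label
        )
        # j-th feature of this class = sum of distances to its j nearest neighbors
        # (a class with fewer than k points contributes fewer than k features)
        features.extend(accumulate(dists[:k]))
    return features


train_X = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 0.0)]
train_y = [0, 0, 1, 1]

# k = 2 neighbors and 2 classes give 2 * 2 = 4 features
feats = knn_class_features((0.0, 1.0), train_X, train_y, k=2)
print(feats)
```

The first two values belong to class 0 and the last two to class 1. Within each class, the second value is never smaller than the first, because it adds the distance to one more neighbor.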
Repeating the procedure above for every class yields k × c features. In practice, to avoid overfitting, the data is split into n folds and the features are generated fold by fold.
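One way to read the n-fold step is that each observation gets its features from neighbors found only in the other folds, so it is never matched against itself. The sketch below follows that reading using only the standard library. The function names, the shuffled round-robin fold assignment, and the Euclidean distance are assumptions for illustration, and the actual knnFeat implementation may split the data differently.

```python
import math
import random
from itertools import accumulate


def class_distance_features(point, ref_X, ref_y, labels, k):
    """Cumulative distances to the k nearest reference points of each class."""
    feats = []
    for label in labels:
        dists = sorted(
            math.dist(point, x) for x, y in zip(ref_X, ref_y) if y == label
        )
        feats.extend(accumulate(dists[:k]))
    return feats


def out_of_fold_features(X, y, k=1, folds=5, seed=0):
    """Compute features for each held-out fold from the remaining folds only."""
    labels = sorted(set(y))
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)  # assumed: simple shuffled split
    features = [None] * len(X)
    for f in range(folds):
        held_out = indices[f::folds]  # every folds-th shuffled index
        held_out_set = set(held_out)
        ref_idx = [i for i in indices if i not in held_out_set]
        ref_X = [X[i] for i in ref_idx]
        ref_y = [y[i] for i in ref_idx]
        for i in held_out:
            features[i] = class_distance_features(X[i], ref_X, ref_y, labels, k)
    return features


# Toy data: label 1 when both coordinates share the same sign
rng = random.Random(42)
X = [(rng.random() - 0.5, rng.random() - 0.5) for _ in range(100)]
y = [1 if a * b > 0 else 0 for a, b in X]

new_X = out_of_fold_features(X, y, k=1, folds=5)
print(new_X[0])
```

With k = 1 and two classes, every row of `new_X` holds two values: the distance to the nearest class-0 point and the distance to the nearest class-1 point, both taken from the other folds.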
Example in Python
Notebook version can be seen [here](https://github.com/upura/knnFeat/blob/master/demo.ipynb).
Loading packages for visualization
import numpy as np
# Jupyter magic: render plots inline in the notebook
%matplotlib inline
import matplotlib.pyplot as plt
Generating sample data
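# 500 points drawn uniformly from [-0.5, 0.5) on each axis
x0 = np.random.rand(500) - 0.5
x1 = np.random.rand(500) - 0.5
X = np.array(list(zip(x0, x1)))
# Label 1 when both coordinates have the same sign, otherwise 0
y = np.array([1 if i0 * i1 > 0 else 0 for (i0, i1) in list(zip(x0, x1))])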
Visualization
Feature extraction using K-nearest neighbors
from knnFeat import knnExtract
newX = knnExtract(X, y, k = 1, folds = 5)
Visualization
The extracted features are almost linearly separable.
Example with iris
The classes also separate well on the standard iris dataset (iris is a 3-class problem, so 3 features are obtained, but only 2 of them are plotted).
from sklearn import datasets
iris = datasets.load_iris()
y = iris.target
X = iris.data
Addendum (2018-06-24)
Fix to the implementation
In the initial release, the n-fold split extracted features from X_train instead of X_test. Thank you for pointing this out.
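A small sketch of why the held-out side of the split matters. It does not reproduce the original bug and makes a simplifying assumption: it compares a point that is scored against a reference set containing itself with the same point scored against a reference set that excludes it. The helper `nearest_class_distance` is hypothetical.

```python
import math


def nearest_class_distance(point, ref_X, ref_y, label):
    """Distance from point to its nearest reference point of the given class."""
    return min(math.dist(point, x) for x, y in zip(ref_X, ref_y) if y == label)


X = [(0.0, 0.0), (1.0, 1.0), (0.2, 0.1), (0.9, 0.8)]
y = [0, 1, 0, 1]

# Leaky: each point is in its own reference set, so the distance to its own
# class is always 0 and the feature simply encodes the label
leaky = [nearest_class_distance(p, X, y, label) for p, label in zip(X, y)]

# Held out: each point is removed from the reference set before scoring
held_out = []
for i, (p, label) in enumerate(zip(X, y)):
    ref_X = X[:i] + X[i + 1:]
    ref_y = y[:i] + y[i + 1:]
    held_out.append(nearest_class_distance(p, ref_X, ref_y, label))

print(leaky)
print(held_out)
```

In the leaky version every own-class distance is zero, which no unseen observation would ever show, so a model trained on such features would overfit.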