The visualization library Yellowbrick was featured last week on the machine learning podcast TWiML&AI, and I found it so useful that I want to introduce it here. Rebecca Bilbro, one of its authors, appears in the episode, so give it a listen if you are interested.
What is Yellowbrick?
In a nutshell, it is a visualization library specialized for machine learning. In implementation terms (which may be the easier way to understand it), it wraps scikit-learn and matplotlib and exposes them through a scikit-learn-like API.
For example, to plot a heatmap of a correlation matrix, this is all you need to write:
from yellowbrick.features import Rank2D

visualizer = Rank2D(features=features, algorithm='pearson')
visualizer.fit(X, y)
visualizer.transform(X);
Apart from the import, that is just three lines.
Note: with matplotlib or seaborn you have to compute the correlation coefficients yourself (compare the equivalent seaborn code). Yellowbrick takes care of that step for you.
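To make the "compute it yourself" step concrete, here is a minimal sketch of a Pearson correlation matrix in plain Python. It is not how Yellowbrick or seaborn implement it; the function names and the toy data are made up for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Pairwise Pearson correlations for a list of feature columns."""
    return [[pearson(a, b) for b in columns] for a in columns]

# Hypothetical toy data: three features, five samples each.
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],  # exactly twice the first column
    [5.0, 4.0, 3.0, 2.0, 1.0],   # the first column reversed
]
matrix = correlation_matrix(columns)
for row in matrix:
    print(["%+.2f" % value for value in row])
```

A heatmap is then just this matrix drawn with one colored cell per pair of features.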
Motivation behind Yellowbrick
The podcast touched on why Yellowbrick was created. Rebecca originally taught machine learning, and in the classroom she realized that interpreting features and results is just as important as training models and making predictions. That led her to start developing Yellowbrick as a library that supports interpreting the features and results of machine learning.
Visualizations available in Yellowbrick
Here is a summary of the visualizations Yellowbrick offers. It supports the following kinds of analysis.
- Feature Analysis Visualizers: visualizing features (explanatory variables)
  - Covariance matrix, correlation matrix, RadViz, PCA...
- Target Visualizers: visualizing the target variable
  - Target distribution, correlation with explanatory variables...
- Regression Visualizers: visualizing regression models
  - Residuals plot, error plot, alpha selection...
- Classification Visualizers: visualizing classification models
  - Confusion matrix, ROCAUC, precision/recall curve...
- Clustering Visualizers: visualizing clustering
  - Elbow method, Silhouette Visualize...
- Model Selection Visualizers: visualizing model selection
  - Validation curve, learning curve, cross-validation
- Text Modeling Visualizers: text modeling
  - t-SNE...
I think this covers the whole workflow of a typical machine learning problem. See the documentation for details.
Trying out Yellowbrick
Let's look at Yellowbrick's features by actually using them. I cannot cover everything, so this is only a selection. All the code used here is on GitHub; the link is at the end of the article if you are interested.
Installation
You can install it with pip.
pip install yellowbrick
Data
The sample data comes from the UCI Machine Learning Repository.
Yellowbrick actually provides a download script for exactly this purpose, so use it if you just want to experiment.
$ python -m yellowbrick.download
Basic usage
Basically, you use it through the same API as scikit-learn: create an instance, then call .fit() and .transform().
There is also an unfamiliar method, .poof(). It appears to write the figure to a file if you pass a path as an argument, and to call plt.show() otherwise. (In Yellowbrick 1.0 and later this method is named .show().)
For certain classes it also draws things like the legend, so calling it seems to be the safer default. In this article I only call it where it is needed.
visualizer = SomeYellowbrickClass()
visualizer.fit(X, y)
visualizer.transform(X)
# visualizer.poof()  # usually unnecessary with `%matplotlib inline`, though some classes need it
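If the call order feels abstract, the following toy class mimics it without any plotting library. Everything here is hypothetical (the class name, the text "rendering", and the pass-through transform); it only illustrates the fit, transform, poof sequence and the "path means file, no path means show" behavior described above.

```python
import os
import tempfile

class ToyHistogramVisualizer:
    """Hypothetical stand-in that mimics the fit / transform / poof call order."""

    def __init__(self, bins=3):
        self.bins = bins
        self.counts_ = None

    def fit(self, X, y=None):
        # "Learn" what to draw: count how many values fall into each bin.
        lo, hi = min(X), max(X)
        width = (hi - lo) / self.bins or 1.0
        self.counts_ = [0] * self.bins
        for value in X:
            index = min(int((value - lo) / width), self.bins - 1)
            self.counts_[index] += 1
        return self  # returning self allows call chaining, as in scikit-learn

    def transform(self, X):
        # In this toy, the data passes through unchanged.
        return X

    def poof(self, outpath=None):
        # Write to a file when a path is given, otherwise print to the screen.
        rendering = "\n".join("#" * count for count in self.counts_)
        if outpath is None:
            print(rendering)
        else:
            with open(outpath, "w") as handle:
                handle.write(rendering)
        return rendering

data = [1, 2, 2, 3, 9]
visualizer = ToyHistogramVisualizer(bins=3)
visualizer.fit(data)
visualizer.transform(data)
visualizer.poof()                      # prints the bars
outpath = os.path.join(tempfile.mkdtemp(), "toy.txt")
visualizer.poof(outpath)               # writes the same bars to a file
```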
Feature Analysis Visualizers: visualizing features
RadViz
from yellowbrick.features import RadViz

visualizer = RadViz(classes=classes, features=features)
visualizer.fit(X, y)
visualizer.transform(X);
Regression Visualizers: visualizing regression models
Residuals plot
Calling .score() lets you compare against the test data.
from sklearn.linear_model import Ridge
from yellowbrick.regressor import ResidualsPlot

model = Ridge()
visualizer = ResidualsPlot(model)
visualizer.fit(X_train, y_train)
visualizer.score(X_test, y_test)  # compare against the test data
visualizer.poof();  # ResidualsPlot did not render completely for me unless .poof() was called
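For readers who want to see what a residuals plot is made of, here is a rough sketch in plain Python. It swaps in ordinary least squares on a single feature instead of Ridge, uses made-up data, and defines a residual as observed minus predicted; that sign convention is my assumption and libraries may use the opposite one.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residuals(xs, ys, slope, intercept):
    """Observed value minus predicted value for every sample."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Hypothetical toy data, split by hand into train and test parts.
x_train, y_train = [1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8]
x_test, y_test = [5.0, 6.0], [10.1, 11.7]

slope, intercept = fit_line(x_train, y_train)
train_residuals = residuals(x_train, y_train, slope, intercept)
test_residuals = residuals(x_test, y_test, slope, intercept)
print("train:", [round(r, 3) for r in train_residuals])
print("test: ", [round(r, 3) for r in test_residuals])
```

Plotting the two residual lists against the predicted values, in different colors, gives the train versus test comparison mentioned above.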
AlphaSelection
You can visualize the search for the alpha value.
import numpy as np
from sklearn.linear_model import LassoCV
from yellowbrick.regressor import AlphaSelection

alphas = np.logspace(-10, 1, 400)
model = LassoCV(alphas=alphas)
visualizer = AlphaSelection(model)
visualizer.fit(X, y)
visualizer.poof()
Classification Visualizers: visualizing classification models
Confusion matrix
from yellowbrick.classifier import ConfusionMatrix
from sklearn.svm import SVC

svc = SVC()
visualizer = ConfusionMatrix(svc, classes=classes)
visualizer.fit(X_train, y_train)
visualizer.score(X_test, y_test)
visualizer.poof();
Darker cells indicate larger counts.
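The counts behind each cell can be sketched in a few lines of plain Python. The labels and predictions below are invented, and putting true classes on the rows is an assumption of this sketch rather than something the article specifies.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Count predictions; rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix

# Hypothetical labels and predictions.
classes = ["cat", "dog"]
y_true = ["cat", "cat", "dog", "dog", "dog"]
y_pred = ["cat", "dog", "dog", "dog", "cat"]

matrix = confusion_matrix(y_true, y_pred, classes)
for label, row in zip(classes, matrix):
    print(label, row)
```

The diagonal holds the correct predictions, so a good classifier shows its darkest cells there.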
ROC curve and AUC
ROC curves are just as easy to draw.
from yellowbrick.classifier import ROCAUC
from sklearn.linear_model import LogisticRegression

logistic = LogisticRegression()
visualizer = ROCAUC(logistic)
visualizer.fit(X_train, y_train)
visualizer.score(X_test, y_test)
visualizer.poof();
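As a rough idea of what gets drawn, this sketch builds ROC points for a binary problem and integrates them with the trapezoidal rule. It is a simplified illustration with made-up labels and scores, not the routine Yellowbrick or scikit-learn uses, and it ignores the multiclass case.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    # Visit samples from the highest score down, lowering the threshold each time.
    ranked = sorted(zip(scores, y_true), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical binary labels (1 = positive) and classifier scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
print(points)
print("AUC:", round(auc(points), 4))
```

An AUC of 1.0 would mean every positive is scored above every negative; 0.5 is what random scores give on average.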
Text Modeling Visualizers: visualizing text models
This is getting long, so I will finish with a text modeling visualization. I personally do a lot of text analysis and used to write the same boilerplate over and over for each analysis, so this is a big help!
from sklearn.feature_extraction.text import TfidfVectorizer
from yellowbrick.text import TSNEVisualizer

tfidf = TfidfVectorizer()
docs = tfidf.fit_transform(corpus.data)
labels = corpus.target

tsne = TSNEVisualizer()
tsne.fit(docs, labels)
tsne.poof();
These were only a few examples, but they are all common use cases. I hope they give you a sense of how convenient Yellowbrick is!
A Yellowbrick aside: unit tests
Yellowbrick is a visualization library, yet the podcast mentioned that it has proper unit tests.
I wondered how you would test visualizations, and the approach is interesting: the expected output images are placed in the tests/baseline_images folder, and the tests check the difference between the generated image and those baselines.
The catch is that the images matplotlib produces differ subtly in color scheme across operating systems and devices. To deal with this, the tests, which run on pytest, reportedly tolerate differences up to a certain margin. It is an interesting case study in development practice as well.
In the tests/baseline_images folder you can see the images that are actually used for comparison.
https://github.com/DistrictDataLabs/yellowbrick/tree/develop/tests/baseline_images
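The "tolerate a small difference" idea can be sketched as follows. This is not Yellowbrick's test code: the root-mean-square metric, the tolerance value, and the tiny grayscale "images" (nested lists of pixel values) are all assumptions chosen for illustration.

```python
import math

def rms_difference(expected, actual):
    """Root-mean-square difference between two same-sized grayscale images."""
    if len(expected) != len(actual) or any(
        len(row_e) != len(row_a) for row_e, row_a in zip(expected, actual)
    ):
        raise ValueError("images must have the same dimensions")
    squared = [
        (e - a) ** 2
        for row_e, row_a in zip(expected, actual)
        for e, a in zip(row_e, row_a)
    ]
    return math.sqrt(sum(squared) / len(squared))

def images_match(expected, actual, tol):
    """Pass when the RMS difference does not exceed the tolerance."""
    return rms_difference(expected, actual) <= tol

# Hypothetical 2x3 images with pixel values in 0..255.
baseline = [[0, 128, 255], [64, 64, 64]]
slightly_off = [[0, 130, 255], [64, 63, 64]]    # small rendering differences
very_different = [[255, 0, 0], [0, 255, 255]]   # a genuinely different picture

print(images_match(baseline, slightly_off, tol=2.0))
print(images_match(baseline, very_different, tol=2.0))
```

With a check like this, small platform-dependent rendering differences pass while a real regression in the plot still fails.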
Impressions after using Yellowbrick
I found Yellowbrick useful both for people who are not used to libraries such as scikit-learn and matplotlib and for people who are.
To visualize features and results, you previously had to take the data produced with scikit-learn, numpy, or scipy, convert it into a form that matplotlib or seaborn accepts, and then call their APIs. Anyone not yet fluent in these APIs had to understand the data formats each one accepts and returns, which was not developer friendly. I think Yellowbrick solves this problem.
There are big benefits for people who already know these APIs, too. Like me, many people probably keep snippets of frequently used visualization code (combining scikit-learn and matplotlib). Such snippets are handy, but reusing them across many different cases can cause unexpected bugs. With Yellowbrick you visualize with tested code, which is safer. It also simply saves time.
Closing
I introduced Yellowbrick, a tool that helps with machine learning visualization. It is a practical library you can use right away, and development seems active, so let's use it more and contribute feedback or code.
The code used in this article is on GitHub, so please take a look. It includes the parts omitted here, such as data loading.