Hello, this is Oda from the Analytics Services Division.
Now that it is getting cold, every sneeze feels like it might throw my back out.
I have been working with recommendation a lot lately, so on this blog I would like to explore the topic from various angles while actually writing code in R and Python.
Over several posts I plan to cover a wide range of material, from the basics to recently popular techniques, centred on the concepts and implementation of recommendation methods. Beyond the methods themselves, I also want to touch on topics such as evaluation and the typical challenges of recommendation, the cold-start problem being the best-known example. Posts will appear irregularly, so please bear with the relaxed pace.
This time I look at collaborative filtering, arguably the most basic recommendation algorithm, by comparing it with kNN regression from machine learning.
What is collaborative filtering?
Recommendation methods can be roughly classified as in the figure below. Within that classification, collaborative filtering in the broad sense refers to every recommendation method that uses users' usage history (transaction data, the user-item matrix). In that broad sense the term therefore covers a very large number of algorithms and methods. It is used in contrast to content-based filtering, which recommends on the basis of item and user attributes.
Collaborative filtering is further divided into neighbourhood-based (memory-based) and model-based approaches. The difference is whether the transaction data is used as-is at recommendation (estimation) time or a model is built in advance. In the narrow sense, "collaborative filtering" usually means the former, neighbourhood-based approach, and that is also what this post focuses on.
Concept and classification of collaborative filtering
The key concept in neighbourhood-based collaborative filtering is similarity. From past usage history we work out which things resemble each other, and then use that similarity to estimate which items to recommend. Similarity can be measured between users or between items: collaborative filtering that uses the former is called user-based, and the latter item-based.
For example, the idea behind similarity in user-based collaborative filtering is: "Users whose tastes and behaviour resemble mine should know items I would like, so let's ask them for recommendations." This is a very natural idea that we all use in everyday life.
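To make the user-based / item-based distinction concrete, here is a minimal Python sketch. The toy matrix and the helper name `cosine_sim_rows` are made up for illustration, and cosine similarity is just one possible choice: user-based filtering compares the rows of the user-item matrix, item-based filtering compares its columns.

```python
import numpy as np

# Toy user-item rating matrix (rows = users, columns = items); values are invented.
ratings = np.array([
    [5.0, 3.0, 4.0, 2.0],
    [4.0, 3.0, 5.0, 2.0],
    [1.0, 5.0, 2.0, 5.0],
])

def cosine_sim_rows(m):
    """Cosine similarity between every pair of rows of m."""
    unit = m / np.linalg.norm(m, axis=1, keepdims=True)
    return unit @ unit.T

user_sim = cosine_sim_rows(ratings)    # user-based: compare rows (users)
item_sim = cosine_sim_rows(ratings.T)  # item-based: compare columns (items)

print(user_sim.shape)  # (3, 3): one entry per pair of users
print(item_sim.shape)  # (4, 4): one entry per pair of items
```

The only difference between the two variants in this sketch is the transpose; everything downstream (picking neighbours, averaging) works on whichever similarity matrix was chosen.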
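Characteristics of collaborative filtering
Neighbourhood-based collaborative filtering is an old method, but because its concept and implementation are simple and its accuracy is reasonably good, it is still widely used and has been improved in many ways. For the same reasons it is often used as a baseline when evaluating other methods. Its most notable strength is that, as long as usage history is available, you can make recommendations without knowing what kind of user or what kind of item you are dealing with. For example, if you were suddenly made marketing lead tomorrow for an e-commerce site in a business you know nothing about, you could still build a decent recommender system from nothing more than usage logs of user IDs and item IDs. It is an extreme example, but in principle you do not even need to know the item names.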
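As a rough illustration of that point, the sketch below builds a user-item matrix from a usage log that contains nothing but opaque IDs and a rating. The log contents and variable names are hypothetical.

```python
import numpy as np

# Hypothetical usage log: (user_id, item_id, rating). The IDs are opaque labels;
# nothing about who the users are or what the items are is needed.
log = [
    ("u1", "i1", 5), ("u1", "i2", 3),
    ("u2", "i1", 4), ("u2", "i3", 2),
    ("u3", "i2", 1),
]

users = sorted({u for u, _, _ in log})
items = sorted({i for _, i, _ in log})
u_idx = {u: k for k, u in enumerate(users)}
i_idx = {i: k for k, i in enumerate(items)}

# NaN marks "not rated"
ratings = np.full((len(users), len(items)), np.nan)
for u, i, r in log:
    ratings[u_idx[u], i_idx[i]] = r

print(ratings)
```

A dense array is fine for a toy log like this; real logs are mostly empty, which is why sparse representations come up later in the post.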
I will come back to the weaknesses of collaborative filtering later. For a systematic classification and detailed explanation of recommender systems there are excellent surveys elsewhere, so if you want the details, see for example the following.
- [推薦システムのアルゴリズム (Algorithms for Recommender Systems) / Toshihiro Kamishima (神嶌 敏弘)](http://www.kamishima.net/archive/recsysdoc.pdf)
- [Recsys 2014 Tutorial - The Recommender Problem Revisited / Xavier Amatriain](http://www.slideshare.net/xamat/recsys-2014-tutorial-the-recommender-problem-revisited)
Note: searching for "recommender system" also turns up plenty of useful material.
The logic of user-based collaborative filtering
From here on we look at the logic of user-based collaborative filtering, which decides what to recommend based on the similarity between users. That logic is essentially the machine learning method known as kNN regression. Put simply, kNN (k-Nearest-Neighbor) estimates a value for a target by referring to the k observations whose features are most similar to it. kNN is more commonly applied to classification problems; when it is used to estimate a continuous value such as a recommendation score, it is explicitly called kNN regression.
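A bare-bones version of kNN regression fits in a few lines. This is a minimal sketch with invented toy data and a hypothetical helper name `knn_regress`; it uses Euclidean distance and an unweighted mean, which is the simplest variant rather than the exact one used later in the post.

```python
import numpy as np

def knn_regress(X, y, x_new, k):
    """Predict a value for x_new as the mean target of its k nearest rows in X."""
    dists = np.linalg.norm(X - x_new, axis=1)   # Euclidean distance to every observation
    nearest = np.argsort(dists)[:k]             # indices of the k closest observations
    return y[nearest].mean()

# Invented toy data: two features, one continuous target.
X = np.array([[1.0, 1.0], [1.0, 2.0], [5.0, 5.0], [6.0, 5.0]])
y = np.array([1.0, 2.0, 8.0, 9.0])

print(knn_regress(X, y, np.array([1.0, 1.5]), k=2))  # 1.5
```

In the recommendation setting, each row of `X` becomes another user's ratings and `y` their rating of the item being estimated.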
Procedure
To see the concrete procedure, consider a movie review site where users can rate the films they have watched. Suppose we use collaborative filtering to estimate one user's (Suzuki's) ratings for films Suzuki has not yet rated, and then recommend the 10 films with the highest estimated ratings (a Top-N recommendation). Slightly simplified, the procedure is as follows.
1. Find users whose ratings resemble Suzuki's (users with high similarity)
2. Estimate ratings for the films Suzuki has not rated, based on the ratings of the k most similar users
3. Extract the 10 films with the highest estimated ratings
This is the typical procedure for building a Top-N list (here N = 10) with user-based collaborative filtering. Steps 1 and 2 correspond to estimation by kNN regression. Let's try the estimation with concrete rating values.
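Before turning to the numbers, the three steps above can be sketched in Python roughly as follows. The function name `recommend_top_n`, the toy matrix, and the `1 / (1 + distance)` similarity are assumptions of this sketch; it also assumes that every other user has rated all the relevant films, which real data will not satisfy.

```python
import numpy as np

def recommend_top_n(ratings, user, k=2, n=2):
    """User-based CF sketch. ratings: 2-D float array, NaN = not rated."""
    target = ratings[user]
    rated = ~np.isnan(target)

    # Step 1: similarity to every user over the items the target has rated.
    # Simplification: other users are assumed to have rated all of those items.
    dists = np.linalg.norm(ratings[:, rated] - target[rated], axis=1)
    sims = 1.0 / (1.0 + dists)
    sims[user] = -np.inf                      # never pick the target as its own neighbour
    neighbours = np.argsort(sims)[::-1][:k]   # the k most similar users

    # Step 2: similarity-weighted mean of the neighbours' ratings per unrated item.
    candidates = np.flatnonzero(~rated)
    w = sims[neighbours]
    est = np.array([np.sum(w * ratings[neighbours, j]) / np.sum(w) for j in candidates])

    # Step 3: keep the n items with the highest estimates.
    order = np.argsort(est)[::-1][:n]
    return candidates[order], est[order]

# Invented toy data: user 0 has rated only the first two items.
ratings = np.array([
    [5, 3, np.nan, np.nan, np.nan],
    [5, 3, 4, 1, 5],
    [4, 3, 5, 2, 4],
    [1, 5, 1, 5, 2],
], dtype=float)

items, est = recommend_top_n(ratings, user=0, k=2, n=2)
print(items)  # [4 2]
print(est)
```

With this toy data users 1 and 2 are the nearest neighbours, so item 4 and then item 2 come out on top.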
A concrete example
Suppose we have the rating history below (a user-item rating matrix). What rating would we estimate Suzuki to give The Dark Knight (the Batman film)?

| User | Film1 | Film2 | Film3 | Film4 | The Dark Knight |
| --- | --- | --- | --- | --- | --- |
| Suzuki | 5 | 3 | 4 | 2 | ? |
| User1 | 3 | 1 | 2 | 3 | 3 |
| User2 | 4 | 3 | 4 | 2 | 5 |
| User3 | 3 | 3 | 1 | 5 | 4 |
| User4 | 1 | 5 | 5 | 2 | 1 |
First, let's enter the data in R and plot a heatmap of the ratings.

    library(ggplot2)
    library(tidyverse)

    # Build the rating matrix
    rating.mtx <- matrix(
      c(5, 3, 4, 2, NA,
        3, 1, 2, 3, 3,
        4, 3, 4, 2, 5,
        3, 3, 1, 5, 4,
        1, 5, 5, 2, 1),
      nrow = 5,
      byrow = TRUE
    )

    # Attach row and column labels
    row.lbl <- c("Suzuki", "User1", "User2", "User3", "User4")
    col.lbl <- c("Film1", "Film2", "Film3", "Film4", "DarkKnight")
    dimnames(rating.mtx) <- list(row.lbl, col.lbl)

    # Reshape for plotting (wide -> long)
    rating.mtx.p <- rating.mtx %>%
      as.data.frame() %>%
      tibble::rownames_to_column("userId") %>%
      tidyr::gather(key = "movieId", value = "rating", -userId) %>%
      dplyr::mutate(userId = factor(userId, levels = row.lbl[5:1]),
                    movieId = factor(movieId, levels = col.lbl))

    # Plot (heatmap)
    (g <- ggplot(rating.mtx.p, aes(movieId, userId)) +
        geom_tile(aes(fill = rating)) +
        geom_text(aes(label = rating), color = "gray30") +
        scale_fill_gradient(low = "white", high = "orange", na.value = "gray90"))
At a glance, Suzuki's ratings look similar to User2's, but is that really so? To quantify the resemblance we need to compute some kind of similarity. Here we use a similarity based on Euclidean distance. (For now the similarity is simply the reciprocal of the distance, as written below. Normally you would add 1 to the denominator.)
    # Similarity matrix: reciprocal of the Euclidean distance over the four films everyone has rated
    sim.mtx <- as.matrix(1 / dist(x = rating.mtx[, 1:4], method = "euclidean", upper = TRUE, diag = TRUE))
    # Show only each user's similarity to Suzuki
    print(sim.mtx[, "Suzuki", drop = FALSE])

| User | Similarity to Suzuki |
| --- | --- |
| Suzuki | 0.0000000 |
| User1 | 0.2773501 |
| User2 | 1.0000000 |
| User3 | 0.2132007 |
| User4 | 0.2182179 |
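User2 is indeed the closest. Suzuki's similarity to Suzuki comes out as 0 rather than Inf, presumably because the reciprocal is taken on the `dist` object, which holds only the pairwise (off-diagonal) distances, and the diagonal is simply filled with 0 when it is converted to a matrix.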
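The remark about adding 1 to the denominator can be checked with a small Python sketch. The helper names `sim_inverse` and `sim_shifted` are made up here; the rating vectors are the four commonly rated films from the table.

```python
import numpy as np

suzuki = np.array([5.0, 3.0, 4.0, 2.0])
others = {
    "User1": np.array([3.0, 1.0, 2.0, 3.0]),
    "User2": np.array([4.0, 3.0, 4.0, 2.0]),
    "User3": np.array([3.0, 3.0, 1.0, 5.0]),
    "User4": np.array([1.0, 5.0, 5.0, 2.0]),
}

def sim_inverse(d):
    """Similarity as used in the post: 1 / distance (undefined when d == 0)."""
    return 1.0 / d

def sim_shifted(d):
    """Common alternative: 1 / (1 + distance), always in (0, 1]."""
    return 1.0 / (1.0 + d)

dist = {name: float(np.linalg.norm(suzuki - r)) for name, r in others.items()}
inv = {name: sim_inverse(d) for name, d in dist.items()}
shifted = {name: sim_shifted(d) for name, d in dist.items()}

for name in others:
    print(f"{name}: 1/d = {inv[name]:.4f}, 1/(1+d) = {shifted[name]:.4f}")

print(sim_shifted(0.0))  # 1.0: identical rating vectors get the maximum similarity
```

Both definitions rank the users in the same order, since each is a decreasing function of distance. They differ in the weights they produce, and only the shifted form stays finite when two users have identical ratings.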
For reference, let's also plot the distance relationships between Suzuki and the other users, based on Euclidean distance.
Note: each user's position (angle) on the circle is random and carries no meaning.
Now that we have the similarities, we pick the three users most similar to Suzuki (k = 3) and estimate the rating for The Dark Knight. To give more weight to the ratings of more similar users, we take the average of the three ratings weighted by similarity (a weighted mean).
    # Similarities to Suzuki (excluding Suzuki's own entry)
    sim.suzuki <- sim.mtx[rownames(sim.mtx) != "Suzuki", "Suzuki"]
    # Pick the three most similar users (k = 3)
    NN.row.num <- head(order(sim.suzuki, decreasing = TRUE), n = 3)
    (NN <- names(sim.suzuki)[NN.row.num])
    # > [1] "User2" "User1" "User4"
    # Estimate the rating for The Dark Knight (similarity-weighted mean)
    (sum(rating.mtx[NN, "DarkKnight"] * sim.suzuki[NN]) / sum(sim.suzuki[NN]))
    # > [1] 4.045465
With that, we have an estimated rating. The estimate for The Dark Knight is about 4.05, which is on the high side, so it looks like a fairly good recommendation.
For later comparison, let me also write down the similarity and the estimation method as formulas. They are very simple.
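Written to match the R code above, in a notation chosen for this sketch (an assumption, not taken from a particular reference): the similarity is the reciprocal of the Euclidean distance over the set $I$ of films that both users have rated, and the estimate is the similarity-weighted mean over the $k$ nearest neighbours.

```latex
\mathrm{sim}(u, v) = \frac{1}{\sqrt{\sum_{i \in I} \left( r_{u,i} - r_{v,i} \right)^{2}}}
\qquad
\hat{r}_{u,i} = \frac{\sum_{v \in N_k(u)} \mathrm{sim}(u, v)\, r_{v,i}}{\sum_{v \in N_k(u)} \mathrm{sim}(u, v)}
```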
Note: $r_{u,i}$ is user $u$'s rating of item $i$, and $N_k(u)$ is the set of user $u$'s $k$ nearest neighbours.
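Computing the same thing with a kNN regression library
Just to be sure, let's check that an existing kNN regression library gives the same estimate. I could not find an R package that makes kNN regression easy, so I use Python's scikit-learn instead. Apologies for the lack of consistency; this is a casual blog, so please bear with me.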
    import numpy as np
    from sklearn import neighbors

    # Rating matrix (np.nan marks Suzuki's missing rating)
    rating_mtx = np.array([[5, 3, 4, 2, np.nan],
                           [3, 1, 2, 3, 3],
                           [4, 3, 4, 2, 5],
                           [3, 3, 1, 5, 4],
                           [1, 5, 5, 2, 1]])

    # Training data (everyone except Suzuki)
    X = rating_mtx[1:5, 0:4]  # explanatory variables
    y = rating_mtx[1:5, 4]    # target variable

    # Data to predict for (Suzuki)
    x_suzuki = rating_mtx[0, 0:4]

    # Build the model: k = 3, neighbours weighted by inverse Euclidean distance
    knn = neighbors.KNeighborsRegressor(3, weights='distance', metric='euclidean')
    model = knn.fit(X, y)

    # Estimate the rating
    model.predict([x_suzuki])
    # > array([4.04546516])
The result is the same. Good.
Practical differences between collaborative filtering and kNN regression
As we have seen, the basic idea behind collaborative filtering and kNN regression is the same. As a result, their characteristics (strengths and weaknesses) are also similar.
Characteristics of collaborative filtering and kNN
To repeat, the characteristics of collaborative filtering and kNN can be summarised as follows.
1. Simple and easy to implement
2. Reasonably accurate
3. Computational cost and scalability are a challenge
As we have seen so far, the logic is simple and relatively easy to implement, and the accuracy is reasonably good for that simplicity. On the other hand, no model is built in advance (lazy learning, instance-based model) and every estimate requires a computation over the entire data set, so the computational cost grows as the data grows. Algorithmic techniques for reducing the amount of computation have of course improved the situation, but compared with other methods this is still cited as a weakness.
The role of baseline when comparing accuracy across methods is also the same as kNN's role in machine learning.
Differences from kNN
We have looked at how similar the logic of collaborative filtering and kNN is. But can you implement a collaborative filtering recommender system with a general-purpose kNN library? Probably only with a fair amount of trouble. Recommendation and ordinary supervised machine learning models assume different problem settings, so various customisations are needed. As concrete differences between collaborative filtering and kNN regression, I will mainly point to two: the similarity measure and the target variable.
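kNN typically uses Euclidean or Mahalanobis distance as its distance or similarity measure, whereas collaborative filtering more often uses cosine similarity, Pearson correlation, or the Jaccard coefficient. In addition, the recommendation problem setting assumes from the outset that the data contains a large number of missing values, which is quite different from the usual premise in machine learning. This is directly tied to challenges characteristic of recommender systems, such as sparsity and the cold-start problem.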
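As an illustration, the three measures named above can be sketched in Python as follows. The helper names and the two rating vectors are invented for this example, and restricting cosine and Pearson to the co-rated items is one common convention rather than the only one.

```python
import numpy as np

def co_rated(a, b):
    """Boolean mask of positions where both users have a rating (not NaN)."""
    return ~np.isnan(a) & ~np.isnan(b)

def cosine_sim(a, b):
    """Cosine similarity over the co-rated items."""
    m = co_rated(a, b)
    return float(a[m] @ b[m] / (np.linalg.norm(a[m]) * np.linalg.norm(b[m])))

def pearson_sim(a, b):
    """Pearson correlation over the co-rated items."""
    m = co_rated(a, b)
    return float(np.corrcoef(a[m], b[m])[0, 1])

def jaccard_sim(a, b):
    """Overlap of the sets of rated items, ignoring the rating values."""
    ra, rb = ~np.isnan(a), ~np.isnan(b)
    return float((ra & rb).sum() / (ra | rb).sum())

# Invented ratings with gaps; the users share three rated items.
u = np.array([5, 3, np.nan, 2, 4], dtype=float)
v = np.array([4, np.nan, 4, 1, 5], dtype=float)

print(cosine_sim(u, v))
print(pearson_sim(u, v))
print(jaccard_sim(u, v))  # 0.6: 3 items in common out of 5 rated by either user
```

Note that Jaccard looks only at which items were rated, so it also applies to implicit data such as purchase logs, while cosine and Pearson need the rating values.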
Note: there is also an approach that uses kNN to fill in missing values, so if you are interested it may be worth trying. (Searching for "kNN impute" turns up plenty of information.)
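For readers who want to try this, scikit-learn ships a `KNNImputer` class. The sketch below applies it to the rating matrix from this post; the parameter choices (`n_neighbors=3`, `weights="distance"`) are picked here to mirror the earlier calculation and are not a recommendation.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Rating matrix from the post; the first row is Suzuki, with The Dark Knight missing.
rating_mtx = np.array([[5, 3, 4, 2, np.nan],
                       [3, 1, 2, 3, 3],
                       [4, 3, 4, 2, 5],
                       [3, 3, 1, 5, 4],
                       [1, 5, 5, 2, 1]], dtype=float)

# Fill each missing entry from the 3 nearest rows, weighted by inverse distance.
# Distances are computed with the default NaN-aware Euclidean metric.
imputer = KNNImputer(n_neighbors=3, weights="distance")
filled = imputer.fit_transform(rating_mtx)

print(filled[0, 4])  # imputed rating for Suzuki / The Dark Knight
```

On this small matrix the imputed value should agree with the kNN regression estimate above, because the NaN-aware distance only rescales every distance to Suzuki by the same factor.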
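Challenges on the way to implementation
That covers the basic principle. In a real recommender you will additionally need to work on points such as the following.
- Choice of similarity measure
- Accounting for bias (e.g. per-user rating bias)
- Computing and adjusting similarity between vectors with many missing values
- Estimation when a large share of the data is missing
- Choice of evaluation method
- Efficient use of resources (e.g. efficient computation using sparse matrices)
- Distributed processing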
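To give a flavour of the bias point in the list above, here is a minimal sketch of a mean-centred prediction: each neighbour's rating is replaced by its deviation from that neighbour's own average, and the weighted deviation is added to the target user's average. The function name `predict_mean_centered` is hypothetical, and the reciprocal-distance similarity is carried over from this post purely for comparison.

```python
import numpy as np

ratings = np.array([
    [5, 3, 4, 2, np.nan],   # Suzuki
    [3, 1, 2, 3, 3],        # User1
    [4, 3, 4, 2, 5],        # User2
    [3, 3, 1, 5, 4],        # User3
    [1, 5, 5, 2, 1],        # User4
], dtype=float)

def predict_mean_centered(ratings, user, item, k):
    """Target user's mean plus the similarity-weighted mean of neighbours' deviations."""
    common = ~np.isnan(ratings[user])            # items the target user has rated
    dists = np.linalg.norm(ratings[:, common] - ratings[user, common], axis=1)
    sims = np.zeros_like(dists)
    others = np.arange(len(ratings)) != user
    sims[others] = 1.0 / dists[others]           # reciprocal distance; breaks if a distance is 0
    nn = np.argsort(sims)[::-1][:k]              # the k most similar users
    means = np.nanmean(ratings, axis=1)          # each user's average rating
    dev = ratings[nn, item] - means[nn]          # how far each neighbour departs from their norm
    return means[user] + np.sum(sims[nn] * dev) / np.sum(np.abs(sims[nn]))

pred = predict_mean_centered(ratings, user=0, item=4, k=3)
print(round(pred, 4))
```

With the data from this post the mean-centred estimate comes out somewhat higher than the plain weighted mean, because Suzuki's own average rating is relatively high.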
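In the following posts I would like to take the user-based collaborative filtering introduced here, implement it on a rating matrix with many missing values, and look at how to use libraries that provide collaborative filtering and at a summary of evaluation methods.
This post has become long, so I will stop here. See you next time.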