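I've been busy lately*1, so my PRML prep has stalled.
But if I show up at the next PRML reading group empty-handed, I'll probably be sent blissfully to heaven, so I'm making an effort to read on somehow.
I more or less get the EM algorithm, but variational Bayes is beyond me...
So I'm going to implement inference for a Gaussian mixture on the Old Faithful data in R, using K-means, EM, and variational Bayes.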
K-means
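Old Faithful + K-means was already tried out in the previous article.
That article squeezed the iteration into very few lines, which made it look like a gimmick, so the code below is a tidied-up, easier-to-read version.
The distance computation has been changed slightly to make it shorter.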
# Fetch the Old Faithful dataset and standardize it
data("faithful");
xx <- scale(faithful, apply(faithful, 2, mean), apply(faithful, 2, sd));

# Number of classes
K <- 2;

# Initial centers (normal random numbers)
mu <- matrix(rnorm(K*ncol(xx)), nrow=K);

# From here on, repeat until convergence
# For each point, find the nearest mu_i
nearest_index <- max.col(-sapply(1:K, function(i) { colSums((t(xx)-mu[i,])^2) }));
# For each class, take the mean of its vectors and use it as the new mu
mu <- t(sapply(1:K, function(k) { colMeans(xx[nearest_index==k,]) }));
The aim isn't to build a library, so the iteration is run by hand and convergence is judged by feel.
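If judging convergence by feel gets tedious, the two update lines can be wrapped in a loop that stops once the centers stop moving. The following is only a minimal sketch, not the code from the article: the deterministic starting centers (the points with the shortest and longest eruption), the tolerance `1e-8`, and the cap of 100 iterations are all assumptions made for illustration.

```r
data("faithful")
xx <- scale(faithful)   # center and scale each column (same result as the explicit call)
K <- 2

# Assumption: start from two actual data points instead of normal random numbers,
# so that the run is reproducible and no cluster starts out empty.
mu <- xx[c(which.min(xx[, 1]), which.max(xx[, 1])), , drop = FALSE]

for (iter in 1:100) {
  # N x K matrix of squared distances from every point to every center
  d2 <- sapply(1:K, function(k) colSums((t(xx) - mu[k, ])^2))
  nearest_index <- max.col(-d2, ties.method = "first")
  # New center of each cluster = mean of the points assigned to it
  new_mu <- t(sapply(1:K, function(k) {
    colMeans(xx[nearest_index == k, , drop = FALSE])
  }))
  moved <- max(abs(new_mu - mu))   # largest change in any coordinate
  mu <- new_mu
  if (moved < 1e-8) break
}
print(iter)
print(mu)
```

`drop = FALSE` keeps the subset a matrix even if a cluster ends up with a single point, which `colMeans()` would otherwise reject.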
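See the previous article for what a run looks like.
Implementation point 1 is max.col(), a function that returns the index of the maximum value in each row. What we actually want is the minimum, so the sign is flipped before the values are passed in (there is no min.col()).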
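The sign-flip trick is easy to check on a tiny matrix. The values below are made up for illustration. One thing worth knowing is that `max.col()` breaks ties at random by default, so the sketch passes `ties.method = "first"` to keep the result deterministic.

```r
# Toy squared-distance matrix: rows are points, columns are cluster centers.
d2 <- matrix(c(0.2, 1.5,
               3.0, 0.1,
               0.7, 0.9), ncol = 2, byrow = TRUE)

# Negating the matrix turns "row-wise maximum" into "row-wise minimum".
nearest <- max.col(-d2, ties.method = "first")

# A more explicit (but slower on large inputs) equivalent.
nearest_alt <- apply(d2, 1, which.min)

print(nearest)
print(nearest_alt)
```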
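Implementation point 2 is that the nearest_index obtained this way (the index of the nearest mu_i) is used as xx[nearest_index==k,] to extract the vectors closest to mu_k, and colMeans() then takes their mean.
It's nice that R lets you write this kind of thing concisely.
Overdo it, though, and the code becomes acrobatic and unreadable...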
EM Algorithm
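The EM algorithm assumes that optimizing the joint distribution P(X,Z) is easy and, instead of optimizing the marginal distribution P(X) directly, maximizes the expectation of the log-likelihood under the posterior, Σ P(Z|X) ln P(X,Z).
PRML 9.4 takes the grand approach of decomposing into a KL divergence right from the start, but personally I find it easier to swallow when the bound is derived from Jensen's inequality first and then followed by "but isn't this just KL?".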
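For reference, the Jensen route can be sketched as follows. Here q(Z) is an arbitrary distribution over the latent variables, introduced only for this derivation, and the model parameters are suppressed to match the notation above.

```latex
\begin{align}
\ln P(X)
  &= \ln \sum_Z q(Z)\,\frac{P(X,Z)}{q(Z)}
   \;\ge\; \sum_Z q(Z)\,\ln\frac{P(X,Z)}{q(Z)} \;\equiv\; \mathcal{L}(q), \\
\ln P(X) - \mathcal{L}(q)
  &= -\sum_Z q(Z)\,\ln\frac{P(Z|X)}{q(Z)}
   \;=\; \mathrm{KL}\bigl(q \,\|\, P(Z|X)\bigr) \;\ge\; 0 .
\end{align}
```

The first line is Jensen's inequality applied to the concave logarithm, and the second follows from P(X,Z) = P(Z|X) P(X). Choosing q(Z) to be the posterior P(Z|X) under the current parameters closes the gap, and what remains of the bound, up to a term that does not depend on the new parameters, is Σ P(Z|X) ln P(X,Z).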
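Anyway, explaining the EM algorithm isn't the goal here, so I'll leave it at that and implement Gaussian mixture inference with the EM algorithm.
For someone who just wrote "It's nice that R lets you write this kind of thing concisely", this is pretty long (wry smile).
Still, the very first version I wrote was more than twice this size.
The culprit is that the only way I could come up with for the covariance part was to loop.
In one dimension the code would surely shrink to a fraction of this.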
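For what it's worth, the loop over data points in the covariance update can probably be avoided with `sweep()` and `crossprod()`. The helper `weighted_cov()` below is a hypothetical name, not something from the article, and the random test data are made up. The sketch just checks that the vectorized form agrees with the explicit loop.

```r
# Responsibility-weighted covariance of one component, without a loop over points.
# xx: N x D data matrix, gamma_k: length-N weights, mu_k: length-D mean.
weighted_cov <- function(xx, gamma_k, mu_k) {
  centered <- sweep(xx, 2, mu_k)   # subtract mu_k from every row
  # centered * gamma_k scales row n by gamma_k[n]; crossprod(A, B) is t(A) %*% B
  crossprod(centered * gamma_k, centered) / sum(gamma_k)
}

# Compare against the explicit loop on made-up data.
set.seed(42)
xx <- matrix(rnorm(20), ncol = 2)
gamma_k <- runif(10)
mu_k <- colSums(xx * gamma_k) / sum(gamma_k)

loop_sig <- matrix(0, 2, 2)
for (n in 1:nrow(xx)) {
  x <- xx[n, ] - mu_k
  loop_sig <- loop_sig + gamma_k[n] * (x %*% t(x))
}
loop_sig <- loop_sig / sum(gamma_k)

vec_sig <- weighted_cov(xx, gamma_k, mu_k)
print(all.equal(vec_sig, loop_sig, check.attributes = FALSE))
```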
# Fetch the Old Faithful dataset and standardize it
data("faithful");
xx <- scale(faithful, apply(faithful, 2, mean), apply(faithful, 2, sd));

# Number of classes
K <- 2;

# Initial values for the means (normal random numbers), covariances, and mixing coefficients
mu <- matrix(rnorm(K*ncol(xx)), nrow=K);
mix <- numeric(K)+1/K;
sig <- list();
for(k in 1:K) sig[[k]] <- diag(ncol(xx));

# Multivariate normal density (use a package, you say?)
dmnorm <- function(x,mu,sig) {
  D <- length(mu);
  1/(sqrt((2 * pi)^D * det(sig))) * exp(- t(x-mu) %*% solve(sig) %*% (x-mu) / 2)[1];
}

# E step of the EM algorithm: returns the N x K matrix of responsibilities
Estep <- function(xx, mu, sig, mix) {
  K <- nrow(mu);
  t(apply(xx, 1, function(x){
    numer <- sapply(1:K, function(k) {
      mix[k] * dmnorm(x, mu[k,], sig[[k]])
    });
    numer / sum(numer);
  }))
}

# M step of the EM algorithm: returns list(means, covariances, mixing coefficients)
Mstep <- function(xx, gamma_nk) {
  K <- ncol(gamma_nk);
  D <- ncol(xx);
  N <- nrow(xx);
  N_k <- colSums(gamma_nk);
  new_mix <- N_k / N;
  new_mu <- (t(gamma_nk) %*% xx) / N_k;
  new_sig <- list();
  for(k in 1:K) {
    sig <- matrix(numeric(D^2), D);
    for(n in 1:N) {
      x <- xx[n,] - new_mu[k,];
      sig <- sig + gamma_nk[n, k] * (x %*% t(x));
    }
    new_sig[[k]] <- sig / N_k[k]
  }
  list(new_mu, new_sig, new_mix);
}

# From here on, repeat until convergence
gamma_nk <- Estep(xx, mu, sig, mix);
(ret <- Mstep(xx, gamma_nk));
mu <- ret[[1]];
sig <- ret[[2]];
mix <- ret[[3]];
[Addendum] Thanks to a remark from Nishio-san, I fixed the normalization coefficient in dmnorm, which was wrong. As it happens, the values are normalized after dmnorm is called, so the results were not affected. Lucky me (ahem). [/Addendum]
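The reason a wrong constant can be harmless is that any factor shared by all components cancels when the responsibilities are normalized. A tiny sketch with made-up numbers follows; `c_wrong` is an arbitrary stand-in for a mis-specified constant. Note that this only holds when the factor is the same for every component, so an error involving det(sig) would not cancel.

```r
# Made-up values of mix[k] * density for one data point and two components.
numer <- c(0.12, 0.03)
c_wrong <- 7.5   # arbitrary common factor, standing in for a wrong constant

resp_right <- numer / sum(numer)
resp_wrong <- (c_wrong * numer) / sum(c_wrong * numer)

print(resp_right)
print(all.equal(resp_right, resp_wrong))
```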
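This is a straightforward implementation of PRML 9.2.2, "EM for Gaussian mixtures": the E step and the M step are written as separate functions and called alternately until convergence.
Below is the state after repeating the run until it has almost converged.
You can see that the means end up in roughly the same place as with K-means.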
> (ret <- Mstep(xx, gamma_nk));
[[1]]
      eruptions   waiting
[1,] -1.2716236 -1.207692
[2,]  0.7025575  0.667236

[[2]]
[[2]][[1]]
      eruptions    waiting
[1,] 0.05309447 0.02804473
[2,] 0.02804473 0.18232160

[[2]][[2]]
      eruptions    waiting
[1,] 0.13047113 0.06061833
[2,] 0.06061833 0.19503065


[[3]]
[1] 0.3558729 0.6441271
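I was going to post the entire output, but there is a lot of it and, as PRML itself notes, convergence is slow, so I'll skip that.
K-means converges after only a few iterations, whereas EM needs a double-digit number of them.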
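One way to put a number on "slow" is to iterate until the log-likelihood stops improving and count the steps. The following is a sketch under stated assumptions rather than the article's code: the fixed starting means, the tolerance `1e-6`, and the cap of 1000 iterations are my own choices, the helper `weighted_dens()` is a hypothetical name, and the covariance update uses `crossprod()` in place of the explicit loop.

```r
data("faithful")
xx <- scale(faithful)
N <- nrow(xx); D <- ncol(xx); K <- 2

# N x K matrix whose (n, k) entry is mix[k] * N(x_n | mu_k, sig_k)
weighted_dens <- function(xx, mu, sig, mix) {
  sapply(1:K, function(k) {
    centered <- sweep(xx, 2, mu[k, ])
    maha <- rowSums((centered %*% solve(sig[[k]])) * centered)  # squared Mahalanobis distance
    mix[k] * exp(-maha / 2) / sqrt((2 * pi)^D * det(sig[[k]]))
  })
}

# Assumption: fixed starting values so that the run is reproducible.
mu <- rbind(c(-1, -1), c(1, 1))
sig <- list(diag(D), diag(D))
mix <- rep(1 / K, K)

loglik_old <- -Inf
for (iter in 1:1000) {
  dens <- weighted_dens(xx, mu, sig, mix)
  loglik <- sum(log(rowSums(dens)))
  if (loglik - loglik_old < 1e-6) break   # stop once the improvement is negligible
  loglik_old <- loglik

  gamma_nk <- dens / rowSums(dens)        # E step: responsibilities
  N_k <- colSums(gamma_nk)                # M step: re-estimate the parameters
  mix <- N_k / N
  mu <- (t(gamma_nk) %*% xx) / N_k
  sig <- lapply(1:K, function(k) {
    centered <- sweep(xx, 2, mu[k, ])
    crossprod(centered * gamma_nk[, k], centered) / N_k[k]
  })
}
print(iter)
print(mu)
print(mix)
```

The iteration count depends on the starting values and the tolerance, so it is only a rough gauge for comparing against K-means.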
I happen to be running pLSI for something else right now, and that one doesn't want to converge either.
Online EM is rumored to converge faster, so I'd like to give it a try.
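Naturally, let's draw a chart too.
The responsibility of each data point (which Gaussian it was generated from) is the posterior probability of the latent variable, p(z_nk=1|X), and that was γ(z_nk). So the return value gamma_nk of the E step can be used to draw a figure like PRML's Figure 9.8*2.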
plot(xx, col=rgb(gamma_nk[,1],0,gamma_nk[,2]), xlab=paste(sprintf("%1.3f",t(mu)),collapse=","), ylab=""); points(mu, col = 1:2, pch = 8)
I ran it for up to 12 iterations. Can you see the means moving slowly?
Look closely and you'll see it is still nowhere near converged. It really is slow.
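Incidentally, the implementation is reasonably general, so it still works with K=3 and so on.
It should work in higher dimensions too... I think, but I haven't tried.
I also tried to reproduce the singularity described in PRML 9.2.1, where one of the Gaussians collapses onto a single point.
Setting one of the means to coincide with a data point and just running the loop did not trigger it at all, but when the variance was deliberately made small beforehand, I could confirm that it goes to 0.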
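The collapse is favored by the likelihood because a Gaussian centered exactly on a data point contributes a density at that point that grows without bound as its variance shrinks. A minimal sketch of that effect, assuming an isotropic covariance s^2 * I (the helper `peak_density()` is a hypothetical name introduced here):

```r
# Density of a D-dimensional isotropic Gaussian N(mu, s^2 * I) evaluated at x = mu.
# det(s^2 * I) = s^(2 * D), so the value is 1 / sqrt((2 * pi)^D * s^(2 * D)).
peak_density <- function(s, D = 2) 1 / sqrt((2 * pi)^D * s^(2 * D))

s <- 10^-(0:5)   # standard deviations shrinking from 1 to 1e-5
print(data.frame(s = s,
                 density = peak_density(s),
                 log_density = log(peak_density(s))))
```

Each tenfold reduction in s multiplies the peak density by 100 in two dimensions, so the log-likelihood term for that one point keeps increasing as the component shrinks.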
This got long, so variational Bayes will have to wait until next time.
*1: There is this game called 斬撃のレギンレイヴ (Zangeki no Reginleiv), you see.
*2: As for the contour lines, well...