A Look at the Cost Function of Logistic Regression
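Since last month I have been taking Andrew Ng's machine learning course on Coursera. Each week is organized to hold roughly one lecture's worth of material, and you work through it by watching the video explanations and submitting assignments on that week's topics. I have now reached logistic regression in week 3, so in this post I play with logistic regression by drawing graphs in MATLAB. The second half of week 3 apparently covers regularization, but I have only finished the first half, so regularization is not considered here.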
www.coursera.org
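Logistic regression is a technique for classification problems. It predicts the relationship between $x$ and $y \in \{0, 1\}$ with the following formula*1.

$$h_\theta(x) = \frac{1}{1 + e^{-(\theta_1 + \theta_2 x)}}$$

$h_\theta(x)$ predicts the probability that $y = 1$ at a given $x$. When $h_\theta(x) \ge 0.5$ we classify the point as $y = 1$, and otherwise as $y = 0$. How well $h_\theta(x)$ predicts depends on the value of the parameter $\theta$. Given $m$ data pairs $(x^{(i)}, y^{(i)})$, the "badness" of the predictions made by $h_\theta$ can be measured with the cost function

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right]$$

We then use a method such as the conjugate gradient method to find the parameter $\theta$ that minimizes $J(\theta)$.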
First, let's draw some graphs to see what shape $h_\theta(x)$ has. I implemented a function that computes $h_\theta(x)$ as follows.
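    function value = predict(theta, X)
        % h_theta(x) for every row of X; the first column of X is all ones
        value = sigmoid(X * theta);
    end

    function y = sigmoid(x)
        % Element-wise logistic function
        y = 1 ./ (1 + exp(-x));
    end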
Using this function, the following program draws the graphs. With $\theta_2 = 1$ fixed, I drew four curves, for $\theta_1 = -2, 0, 2, 4$.
    x = (-10:0.1:10)';
    figure;
    title('h_\theta(x) for varying \theta_1');
    xlabel('x');
    ylabel('h_\theta(x)');
    legends = {};
    hold on;
    for t1 = -2:2:4
        % theta = (t1, 1); X gets a leading column of ones for the intercept
        plot(x, predict([t1; 1], [ones(length(x), 1) x]));
        legends{end+1} = sprintf('\\theta = (%d, %d)', t1, 1);
    end
    legend(legends);
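In the resulting plot, changing $\theta_1$ while $\theta_2$ is held fixed shifts the curve left or right without changing its slope. Indeed, from the formula for $h_\theta(x)$ we can see that $x = -\theta_1 / \theta_2$ is the decision boundary, because that is where $\theta_1 + \theta_2 x = 0$ and therefore $h_\theta(x) = 0.5$.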
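As a quick numerical check of that boundary, the sketch below evaluates $h_\theta$ at $x = -\theta_1/\theta_2$ for the four parameter settings used in the plot. The anonymous functions `sigmoid` and `h` are stand-ins written only for this example (they are not the `predict` function file), so that the snippet can run on its own as a script.

```matlab
% Stand-in helpers for this example only (assumed names, scalar x).
sigmoid = @(z) 1 ./ (1 + exp(-z));
h = @(theta, x) sigmoid(theta(1) + theta(2) * x);

t1_values = -2:2:4;                      % same theta_1 values as in the plot
h_at_boundary = zeros(size(t1_values));
for k = 1:numel(t1_values)
    theta = [t1_values(k); 1];           % theta_2 is fixed to 1
    x_boundary = -theta(1) / theta(2);   % where theta_1 + theta_2 * x = 0
    h_at_boundary(k) = h(theta, x_boundary);
    fprintf('theta_1 = %g: boundary at x = %g, h = %g\n', ...
            t1_values(k), x_boundary, h_at_boundary(k));
end
```

Every entry of `h_at_boundary` should come out as 0.5, since the exponent is zero at the boundary regardless of $\theta_1$.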
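Similarly, the graphs obtained by varying $\theta_2$ are shown below. The left and right plots each hold $\theta_1$ at a different fixed value. You can see how the slope of the curve changes with the value of $\theta_2$: at $\theta_2 = 0$ the curve is horizontal, and for $\theta_2 < 0$ it slopes downward to the right.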
Next, let's implement the cost function $J(\theta)$ and find the optimal $\theta$ for some data. The cost function can be implemented as follows.
    function value = cost(theta, X, y)
        % Logistic regression cost J(theta), averaged over all rows of X
        h = predict(theta, X);
        value = -mean(y .* log(h) + (1 - y) .* log(1 - h));
    end
Consider the following simple sample data consisting of four points.
x | -1.0 | -0.1 | 0.1 | 1.0 |
---|---|---|---|---|
y | 0 | 1 | 0 | 1 |
The following code creates this sample data and computes the cost for a few values of $\theta$. Of the three tried here, $\theta = (0, 1)$ gives the lowest cost.
    x = [-1; -0.1; 0.1; 1];
    y = [ 0;  1  ; 0  ; 1];
    cost([0; -1], [ones(length(x), 1) x], y)  % 0.9788
    cost([0;  1], [ones(length(x), 1) x], y)  % 0.5288
    cost([1;  1], [ones(length(x), 1) x], y)  % 0.6371
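To make the middle value concrete, here is the arithmetic for $\theta = (0, 1)$ written out term by term. With this $\theta$ the prediction is simply $h_\theta(x) = \sigma(x)$, where $\sigma(z) = 1/(1 + e^{-z})$ is notation introduced just for this worked example.

```latex
\begin{aligned}
J(\theta) &= \frac{1}{4}\Bigl[
    -\log\bigl(1 - \sigma(-1)\bigr)    % x = -1,   y = 0
    -\log \sigma(-0.1)                 % x = -0.1, y = 1
    -\log\bigl(1 - \sigma(0.1)\bigr)   % x = 0.1,  y = 0
    -\log \sigma(1)                    % x = 1,    y = 1
  \Bigr] \\
&\approx \frac{1}{4}\,(0.3133 + 0.7444 + 0.7444 + 0.3133) \approx 0.5288
\end{aligned}
```

The two outer points are predicted well and contribute little, while the two inner points sit on the wrong side of the boundary at $x = 0$ and account for most of the cost.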
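To find the optimal $\theta$, I use the fminunc function, which solves unconstrained minimization problems. The cost function implemented above takes three arguments, but fminunc needs a function that takes only $\theta$, so I wrap it in an anonymous function before passing it*2. The second argument of fminunc is the initial value for the iterative computation.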
    theta = fminunc(@(t) cost(t, [ones(length(x), 1) x], y), [0; 0]);
The computation returns the minimizing $\theta$ in `theta`. Evaluating the cost at this $\theta$ gives 0.4510. The following code plots $h_\theta(x)$ for this $\theta$.
    xs = (-1.5:0.01:1.5)';
    figure;
    hold on;
    plot(xs, predict(theta, [ones(length(xs), 1) xs]));
    plot(x, y, 'o');
    xlabel('x');
    ylabel('h_\theta(x)');
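The fitted curve is a compromise. If $\theta_2$ were larger than this, the fit to the two outer points would improve, but the fit to the two middle points would get worse. Conversely, if $\theta_2$ were smaller, the fit to the two middle points would improve, but the fit to the two outer points would get worse. The curve in the plot is the optimum that balances these two effects.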
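The same minimum can also be approached without fminunc. The sketch below runs plain gradient descent with the analytic gradient $\nabla J(\theta) = \frac{1}{m} X^\top (h - y)$ on the four sample points. It is only an illustration under stated assumptions: the helper names `cost_fn` and `grad_fn`, the step size `alpha`, and the iteration count are all choices made for this example, and gradient descent is not the algorithm fminunc uses.

```matlab
% Sample data from the article; X has a leading column of ones for the intercept.
x = [-1; -0.1; 0.1; 1];
y = [ 0;  1;   0;   1];
X = [ones(length(x), 1) x];

% Stand-in helpers (assumed names), defined inline so the script is self-contained.
sigmoid = @(z) 1 ./ (1 + exp(-z));
cost_fn = @(theta) -mean(y .* log(sigmoid(X * theta)) + ...
                         (1 - y) .* log(1 - sigmoid(X * theta)));
grad_fn = @(theta) X' * (sigmoid(X * theta) - y) / length(y);  % analytic gradient of J

theta = [0; 0];      % same starting point as the fminunc call
alpha = 1;           % step size: an assumption that suits this tiny data set
for iter = 1:5000    % iteration count: also an assumption
    theta = theta - alpha * grad_fn(theta);
end

J_final = cost_fn(theta);
fprintf('theta = (%.4f, %.4f), J = %.4f\n', theta(1), theta(2), J_final);
```

On this data the loop should settle at a cost of about 0.4510, the same value reported for the fminunc result. For larger problems a fixed step size like this is fragile, which is one reason to prefer a library optimizer.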
Now, what shape does the logistic regression cost function $J(\theta)$ have as a function of the parameter $\theta$? The following code plots $J(\theta)$ as $\theta$ varies.
    [T1, T2] = meshgrid(-5:0.025:5, -2:0.025:8);
    % Evaluate the cost at every grid point (linear indexing over the grid)
    J = arrayfun(@(i) cost([T1(i); T2(i)], [ones(length(x), 1) x], y), 1:length(T1(:)));
    J = reshape(J, size(T1));
    s = surf(T1, T2, J);
    s.LineStyle = 'none';
    view(90, -90);
    colorbar();
    xlabel('\theta_1');
    ylabel('\theta_2');
    zlabel('J(\theta)');
The resulting surface is convex, and its minimum is at the $\theta$ obtained above. Because the surface is convex, fminunc reaches the optimum even when it is started from an arbitrary initial value. If the cost function were bumpy instead, fminunc would find one of the local minima, and that would not necessarily be the global optimum.
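The convexity seen in the plot is not specific to this data. As a short supporting derivation (the notation $\tilde{x}^{(i)} = (1, x^{(i)})^\top$ is introduced here and does not appear in the course material as quoted), the gradient and Hessian of $J$ are:

```latex
\nabla J(\theta) = \frac{1}{m} \sum_{i=1}^{m}
    \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)\, \tilde{x}^{(i)},
\qquad
\nabla^2 J(\theta) = \frac{1}{m} \sum_{i=1}^{m}
    h_\theta(x^{(i)}) \bigl( 1 - h_\theta(x^{(i)}) \bigr)\,
    \tilde{x}^{(i)} \tilde{x}^{(i)\top}
```

Each weight $h_\theta(x^{(i)})(1 - h_\theta(x^{(i)}))$ is positive and each $\tilde{x}^{(i)} \tilde{x}^{(i)\top}$ is positive semidefinite, so $v^\top \nabla^2 J(\theta)\, v \ge 0$ for every $v$, which is exactly the condition for $J$ to be convex.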
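The Coursera lectures explain that if the cost function used for linear regression is applied to logistic regression, the result is not convex. Written as a formula, that cost function is as follows (the name $J_{\mathrm{lin}}$ is used here to distinguish it from $J$).

$$J_{\mathrm{lin}}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$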
Let's plot this cost function too and check what shape it actually takes. It can be implemented as follows.
    function value = costlin(theta, X, y)
        % Squared-error cost (as in linear regression) applied to h_theta
        h = predict(theta, X);
        value = mean((h - y) .^ 2) / 2;
    end
Replacing the calls to cost in the earlier program with costlin draws the same kind of graph. The resulting surface is distorted, but it still appears to be convex. This may be because the sample data is too simple.
So let's check with slightly more complicated sample data. As shown below, the $y = 0$ and $y = 1$ points are drawn at random from normal distributions with different means.
    % Four clusters with unit standard deviation: y = 0 around -3 and 1, y = 1 around -1 and 2
    x = normrnd([-3 * ones(10, 1); -1 * ones(5, 1); ones(10, 1); 2 * ones(5, 1)], 1);
    y = [zeros(10, 1); ones(5, 1); zeros(10, 1); ones(5, 1)];
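I computed the optimal parameter $\theta$ for the generated data and plotted the resulting curve against the data in the next figure.

The figure after that shows the cost functions, drawn in the same way as before. The left plot uses $J(\theta)$, and the right plot uses the squared-error cost $J_{\mathrm{lin}}(\theta)$ computed by costlin.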
Even for data this complicated, $J(\theta)$ has a fairly clean shape and forms a convex surface. The squared-error cost $J_{\mathrm{lin}}(\theta)$, on the other hand, is heavily distorted, and you can see that it has a saddle*3.
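When the cost function looks like the right-hand plot, the solution found by fminunc depends on the initial value it is given. Let's confirm this by having fminunc use the costlin function to compute $\theta$. Starting from the initial value (0, 0) gives the solution (-0.6773, 0.3383). The left plot in the figure below is drawn with these parameters. Starting from (-3, 2), on the other hand, gives the solution (-2179.7, 1279.5). This is the case where the search slides down the lower-right side of the saddle, and plotting it gives the right plot in the figure below. It ends up explaining the remaining data perfectly by completely ignoring one group of data, which is an interesting result in its own way.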
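The experiment can be sketched in a self-contained form as follows. Several things here are substitutions made for this example rather than what the post ran: `fminsearch` (a derivative-free optimizer in base MATLAB) stands in for fminunc, `randn` replaces normrnd, and `costlin_fn` is an inline version of costlin. Because the data is random, the solutions will differ from run to run and from the numbers quoted above.

```matlab
% Synthetic data in the spirit of the article's example.
% randn replaces normrnd here (an assumption, to avoid a toolbox dependency).
mu = [-3 * ones(10, 1); -1 * ones(5, 1); ones(10, 1); 2 * ones(5, 1)];
x  = mu + randn(size(mu));
y  = [zeros(10, 1); ones(5, 1); zeros(10, 1); ones(5, 1)];
X  = [ones(length(x), 1) x];

sigmoid    = @(z) 1 ./ (1 + exp(-z));
costlin_fn = @(theta) mean((sigmoid(X * theta) - y) .^ 2) / 2;  % squared-error cost

% Each column is one initial theta: (0, 0) and (-3, 2), as in the text.
starts  = [0 -3; 0 2];
J_start = zeros(1, 2);
J_found = zeros(1, 2);
for k = 1:2
    theta0  = starts(:, k);
    theta_k = fminsearch(costlin_fn, theta0);   % stand-in for fminunc
    J_start(k) = costlin_fn(theta0);
    J_found(k) = costlin_fn(theta_k);
    fprintf('start (%g, %g) -> theta = (%.4g, %.4g), cost %.4f\n', ...
            theta0(1), theta0(2), theta_k(1), theta_k(2), J_found(k));
end
```

If the two runs report clearly different parameters and costs, the optimizer has ended up in different basins of the non-convex surface, which is the effect described above.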
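Now, if we compute the squared-error cost $J_{\mathrm{lin}}$ for the two solutions obtained here, the left one costs 0.1006 and the right one costs 0.0833. So the left solution, which looks like the sensible one, is actually a local minimum, and the right one has the lower cost. This comes from the nature of the cost function. With $J_{\mathrm{lin}}$, even when the prediction for a data point is completely wrong ($|h_\theta(x) - y| = 1$), the resulting increase in cost is at most $1/(2m)$, which is why we get this result. With $J$, the cost paid for such a completely wrong prediction is infinite, so $J$ takes a large value for parameters like those in the right plot*4.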
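The difference between the two penalties is easy to see numerically. The sketch below looks at a single data point with true label $y = 1$ and lets the predicted probability move toward the completely wrong answer. The variable names are made up for this example, and the reading of the 5-point group as the one being ignored is an inference from the generated data layout (30 points, of which one $y = 1$ cluster has 5).

```matlab
% Per-example penalties for a point whose true label is y = 1, as the
% predicted probability h moves toward the completely wrong answer 0.
h = [0.5 0.1 0.01 1e-6];
squared_cost = (h - 1) .^ 2 / 2;   % squared-error penalty: never exceeds 1/2
log_cost     = -log(h);            % logistic penalty: grows without bound as h -> 0
disp([h; squared_cost; log_cost]);

% With the 1/m averaging, m = 30 points, and 5 points predicted completely
% wrong while the other 25 are predicted perfectly, the squared-error cost is:
m = 30;
n_wrong = 5;
ignored_group_cost = n_wrong / (2 * m);
fprintf('squared-error cost when %d of %d points are ignored: %.4f\n', ...
        n_wrong, m, ignored_group_cost);
```

The last line prints 0.0833, which matches the cost reported for the second solution and is consistent with that solution sacrificing one 5-point cluster entirely.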
*1: For simplicity, $x$ is one-dimensional in this post. The same approach applies when $x$ is multidimensional.
*2: Ideally you should also compute the gradient of the function and pass it in. When run without a gradient, as in this example, fminunc apparently uses an algorithm called the quasi-Newton method.
*3: This may be hard to see in the image embedded in the post, but it should be a little clearer in the enlarged image shown when you click it.
*4: With the parameters obtained here the computation overflowed, so I could not obtain a concrete value.