Continuing from the previous post, I implement algorithms from 『続・わかりやすいパターン認識』. This time I implement the Baum-Welch algorithm and try estimating the parameters of a hidden Markov model.
Forward algorithm
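First, I implement the forward algorithm, which can be written as shown below. The arguments A, B, and rho are the transition probabilities, the output probabilities, and the initial state distribution, respectively, and x is the output symbol sequence. The return value a is the α defined by equation (8.8) in the textbook. To avoid underflow, however, the function computes log(α) rather than α itself. Rows are indexed by i and columns by t.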
function a = Forward(A, B, rho, x)
  % Step 1: initialization
  a(:, 1) = log(rho') + log(B(:, x(1)));
  % Step 2: recursion
  for t = 2:length(x)
    c = max(a(:, t-1));
    a(:, t) = log(((exp(a(:, t-1) - c))' * A))' + log(B(:, x(t))) + c;
  end
end
The recursion in Step 2 uses a computational technique called logsumexp. The textbook does not seem to mention logsumexp, but a web search turns up plenty of information. Among books, Appendix A.2 of 『言語処理のための機械学習入門』, for example, explains it.
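To illustrate why the shift by the maximum matters, here is a minimal sketch comparing the naive computation with the shifted one. The variable names (v, naive, stable) are made up for this illustration and do not appear in the functions above.

```matlab
% Two log-probabilities far below the range where exp() is representable
v = [-1000; -1001];

% Naive evaluation: exp(-1000) underflows to 0, so the result is log(0) = -Inf
naive = log(sum(exp(v)));

% logsumexp: subtract the maximum before exponentiating, then add it back
c = max(v);
stable = log(sum(exp(v - c))) + c;   % equals -1000 + log(1 + exp(-1))

disp(naive)
disp(stable)
```

The shifted version keeps the largest term at exp(0) = 1, so the sum inside the log can never underflow to zero.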
Let's check that the forward algorithm works. The parameters A, B, and rho are defined as in equations (8.116) and (8.117) of the textbook. The output symbol sequence x is likewise set as in equation (8.118).
>> A = [ 0.1 0.7 0.2
         0.2 0.1 0.7
         0.7 0.2 0.1 ];
>> B = [ 0.9 0.1
         0.6 0.4
         0.1 0.9 ];
>> rho = [1 1 1] / 3;
>> x = [1 2 1];
Running the Forward function yields log(α) as shown below. Converting back to α shows that each value matches the textbook*1. As equation (8.13) of the textbook states, the sum over i at the final time step is P(x). In this example it comes out to 0.1734, which indeed matches the value in equation (8.120).
>> alpha = Forward(A, B, rho, x)
alpha =
   -1.2040   -4.6742   -2.0161
   -1.6094   -2.3574   -3.4559
   -3.4012   -1.6983   -4.7510
>> exp(alpha)
ans =
    0.3000    0.0093    0.1332
    0.2000    0.0947    0.0316
    0.0333    0.1830    0.0086
>> sum(exp(alpha(:, end)))
ans =
    0.1734
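As an independent cross-check, P(x) for such a short sequence can also be computed by brute force, summing the joint probability over every possible state sequence. The following is only a sketch for this three-symbol example (the loop variables s1, s2, s3 and the accumulator P are names chosen here); the cost grows exponentially with the sequence length, which is exactly what the forward algorithm avoids.

```matlab
% Parameters and observation sequence from the example above
A = [0.1 0.7 0.2; 0.2 0.1 0.7; 0.7 0.2 0.1];
B = [0.9 0.1; 0.6 0.4; 0.1 0.9];
rho = [1 1 1] / 3;
x = [1 2 1];

% Sum the joint probability P(x, s) over all 3^3 state sequences s = (s1, s2, s3)
n = size(A, 1);
P = 0;
for s1 = 1:n
  for s2 = 1:n
    for s3 = 1:n
      P = P + rho(s1) * B(s1, x(1)) ...
            * A(s1, s2) * B(s2, x(2)) ...
            * A(s2, s3) * B(s3, x(3));
    end
  end
end
disp(P)   % agrees with the forward-algorithm value 0.1734 to four decimal places
```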
Backward algorithm
The backward algorithm can be implemented in the same way as the forward algorithm. It looks like this.
function b = Backward(A, B, rho, x)
  % Step 1: initialization
  b(:, length(x)) = log(ones(length(rho), 1));
  % Step 2: recursion
  for t = length(x)-1:-1:1
    c = max(b(:, t+1));
    b(:, t) = log(A * exp(log(B(:, x(t+1))) + b(:, t+1) - c)) + c;
  end
end
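For reference, the two log-domain recursions implemented by Forward and Backward can be written out as follows. This uses the common HMM notation a_ij = A(i, j) and b_j(k) = B(j, k), which is my assumption and may differ from the textbook's symbols; c and d are the per-step maxima used for the logsumexp shift.

```latex
\begin{aligned}
\log\alpha_t(j) &= c_{t-1} + \log\sum_i \exp\bigl(\log\alpha_{t-1}(i) - c_{t-1}\bigr)\, a_{ij} + \log b_j(x_t),
  & c_{t-1} &= \max_i \log\alpha_{t-1}(i), \\
\log\beta_t(i) &= d_{t+1} + \log\sum_j a_{ij}\, \exp\bigl(\log b_j(x_{t+1}) + \log\beta_{t+1}(j) - d_{t+1}\bigr),
  & d_{t+1} &= \max_j \log\beta_{t+1}(j).
\end{aligned}
```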
Let's check that the backward algorithm works. Running the function on the same data used for the forward algorithm yields log(β) as shown below.
>> beta = Backward(A, B, rho, x)
beta =
   -1.4745   -0.6349         0
   -0.6896   -1.1712         0
   -2.0379   -0.2744         0
From the definition of β in (8.9), P(x) can be computed from β at time t = 1 as follows.
P(x) = Σ_i{P(s1=wi) * P(x1|s1=wi) * β(t=1, s1=wi)}
In this example it comes out to 0.1734, matching the value obtained from α.
>> sum(rho' .* B(:, x(1)) .* exp(beta(:, 1)))
ans =
    0.1734
Baum-Welch algorithm
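I implemented the Baum-Welch algorithm as follows. The arguments are the output symbol sequence x and the number of states nstate. Section 8.6 of the textbook fixes the initial values of A, B, and rho, but my implementation takes the number of hidden states and assigns random values instead. The function returns the estimated parameters A, B, and rho, together with the log-likelihood logP(x) at each iteration*2.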
function [A, B, rho, logLH] = BaumWelch(x, nstate)
  maxiter = 100;   % maximum number of iterations
  epsilon = 1e-3;  % treat as converged once the log-likelihood increases by less than 1e-3

  % Step 1: initialization
  [A, B, rho] = initialize(nstate, max(x));

  % Step 2: recursion
  a = Forward(A, B, rho, x);
  b = Backward(A, B, rho, x);

  % Step 4: convergence check (compute the log-likelihood under the initial parameters)
  logLH(1) = calcLikelihood(a);

  for i = 2:maxiter
    % Step 3: parameter update
    [A, B, rho] = maximize(A, B, x, a, b);

    % Step 2: recursion
    a = Forward(A, B, rho, x);
    b = Backward(A, B, rho, x);

    % Step 4: convergence check
    logLH(i) = calcLikelihood(a);
    if logLH(i) - logLH(i-1) < epsilon
      break;
    end
  end
end

function [A, B, rho] = initialize(c, m)
  A = normalize(rand(c));
  B = normalize(rand(c, m));
  rho = normalize(rand(1, c));
end

function [A, B, rho] = maximize(A, B, x, a, b)
  g = calcGamma(a, b);
  xi = calcXi(A, B, x, a, b);
  A = normalize(sum(exp(bsxfun(@minus, xi, max(max(xi, [], 3), [], 2))), 3));
  for k = 1:size(B, 2)
    B(:, k) = sum(exp(bsxfun(@minus, g(:, find(x == k)), max(g, [], 2))), 2);
  end
  B = normalize(B);
  rho = normalize(exp(g(:, 1) - max(g(:, 1)))');
end

function g = calcGamma(a, b)
  g = a + b;
end

function xi = calcXi(A, B, x, a, b)
  c = size(A, 1);
  t = length(x) - 1;
  xi = repmat(permute(a(:, 1:end-1)      , [1 3 2]), [1 c 1]) ...
     + repmat(permute(log(A)             , [1 2 3]), [1 1 t]) ...
     + repmat(permute(log(B(:, x(2:end))), [3 1 2]), [c 1 1]) ...
     + repmat(permute(b(:, 2:end)        , [3 1 2]), [c 1 1]);
end

function l = calcLikelihood(a)
  c = max(a(:, end));
  l = log(sum(exp(a(:, end) - c))) + c;
end

function M = normalize(M)
  M = bsxfun(@rdivide, M, sum(M, 2));
end
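In the textbook's description of the algorithm, the parameters are updated in Step 3 and then the log-likelihood is computed for the convergence check in Step 4. Computing the log-likelihood, however, requires α or β, and obtaining those is exactly the recursion of Step 2. For this reason I reordered the steps as in the implementation above.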
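The calcGamma and calcXi functions compute γ from equation (8.20) and ξ from equation (8.43). To avoid underflow, they work in the log domain. They also skip the denominators of those equations. The denominators are just coefficients that make the values sum to 1, so in practice computing the numerators, summing them, and dividing by the total gives the same result. The normalize function does this.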
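To spell out why dropping the denominator is harmless, here is the argument for the transition probabilities in the usual HMM notation. The notation is my assumption and may differ from the textbook's equations (8.20) and (8.43); g and xi denote the arrays returned by calcGamma and calcXi.

```latex
\begin{aligned}
\gamma_t(i) &= \frac{\alpha_t(i)\,\beta_t(i)}{P(x)},
&\qquad g(i,t) &= \log\alpha_t(i) + \log\beta_t(i), \\
\xi_t(i,j) &= \frac{\alpha_t(i)\,a_{ij}\,b_j(x_{t+1})\,\beta_{t+1}(j)}{P(x)},
&\qquad xi(i,j,t) &= \log\alpha_t(i) + \log a_{ij} + \log b_j(x_{t+1}) + \log\beta_{t+1}(j), \\
\hat a_{ij} &= \frac{\sum_{t=1}^{n-1} \xi_t(i,j)}{\sum_{j'}\sum_{t=1}^{n-1} \xi_t(i,j')}
 = \frac{\sum_{t=1}^{n-1} e^{xi(i,j,t)}}{\sum_{j'}\sum_{t=1}^{n-1} e^{xi(i,j',t)}}.
\end{aligned}
```

The factor 1/P(x) appears in both the numerator and the denominator of the last line and cancels. The same holds for any per-row constant, which is why maximize can subtract a row-wise maximum before exponentiating without changing the normalized result.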
Checking the Baum-Welch algorithm
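Let's check parameter estimation with the Baum-Welch algorithm. As a first example, I estimate the parameters for a sequence that repeats 1, 2, 3. With the number of states set to 2, the estimate came out as follows. Looking at the matrix B, the output symbols 1, 2, and 3 are emitted with nearly equal probability in both state 1 and state 2. The estimation does not seem to have worked very well.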
>> x = repmat([1 2 3], 1, 100);
>> [A, B, rho, logLH] = BaumWelch(x, 2)
A =
    0.8570    0.1430
    0.7402    0.2598
B =
    0.3383    0.3305    0.3312
    0.3077    0.3480    0.3443
rho =
    0.9874    0.0126
logLH =
 -365.7825 -332.0131 -330.6177 -330.1137 -329.8861 -329.7684 -329.7018 -329.6614 -329.6358 -329.6190 -329.6076 -329.5998 -329.5943 -329.5903 -329.5875 -329.5854 -329.5839 -329.5828 -329.5820
On the other hand, setting the number of states to 3 for the same example gave the following. Here the estimate says that the current state uniquely determines the output symbol and also uniquely determines the next state. Since the output symbol sequence simply repeats 1, 2, 3, 1, 2, 3, ..., I think this result is reasonable.
>> [A, B, rho, logLH] = BaumWelch(x, 3)
A =
    0.0000    1.0000    0.0000
    0.0000    0.0000    1.0000
    1.0000    0.0000    0.0000
B =
    0.0000    0.0000    1.0000
    1.0000    0.0000    0.0000
    0.0000    1.0000    0.0000
rho =
    0.0000    1.0000    0.0000
logLH =
 -322.1001 -265.6989  -98.4127   -3.6947   -0.0005   -0.0000
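To see that this 3-state estimate really does reproduce the training sequence, the deterministic model can be stepped through by hand. The sketch below rounds the estimated values to exact 0 and 1 (an assumption made for this illustration) and follows the single possible path; the variables s and out are names chosen here.

```matlab
% Estimated parameters from the run above, rounded to exact 0/1 (assumption)
A = [0 1 0; 0 0 1; 1 0 0];
B = [0 0 1; 1 0 0; 0 1 0];
rho = [0 1 0];

s = find(rho == 1);            % the only possible initial state
out = zeros(1, 6);
for t = 1:6
  out(t) = find(B(s, :) == 1); % the only symbol state s can emit
  s = find(A(s, :) == 1);      % the only state reachable from s
end
disp(out)                      % 1 2 3 1 2 3
```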
As the next example, I append the reverse of x to x itself. The resulting output symbol sequence repeats 1, 2, 3, 1, 2, 3, ... and then switches to the reverse order ..., 3, 2, 1, 3, 2, 1, .... Estimating with 3 states as before gave the result below. The model obtained emits 2 in state 1, and emits either 1 or 3 in states 2 and 3.
>> x = [x x(end:-1:1)];
>> [A, B, rho, logLH] = BaumWelch(x, 3)
A =
         0    1.0000    0.0000
    0.0000    0.0000    1.0000
    1.0000    0.0000    0.0000
B =
    0.0000    1.0000    0.0000
    0.5000    0.0000    0.5000
    0.5000    0.0000    0.5000
rho =
         0    0.0000    1.0000
logLH =
 -730.4094 -652.2383 -644.0695 -633.4681 -618.4745 -597.6392 -574.0333 -558.5107 -554.4464 -553.7989 -553.3115 -552.1486 -548.6988 -538.3057 -508.0520 -434.2421 -331.4732 -282.3566 -277.3188 -277.2591 -277.2589
With the number of states set to 6, I obtained the following result. The model starts in state 4 and cycles through the states 4, 1, 2, 4, 1, 2, .... From B, the corresponding output symbol sequence is 1, 2, 3, 1, 2, 3, .... From state 2, the model moves to state 3 with probability 1%. From state 3 it cycles through 3, 5, 6, 3, 5, 6, ..., and the corresponding output symbol sequence is 3, 2, 1, 3, 2, 1, ....
>> [A, B, rho, logLH] = BaumWelch(x, 6)
A =
    0.0000    1.0000    0.0000    0.0000         0    0.0000
    0.0000    0.0000    0.0100    0.9900    0.0000    0.0000
    0.0000    0.0000    0.0000    0.0000    1.0000    0.0000
    1.0000    0.0000    0.0000    0.0000    0.0000    0.0000
    0.0000    0.0000    0.0000    0.0000    0.0000    1.0000
    0.0000    0.0000    1.0000    0.0000    0.0000    0.0000
B =
    0.0000    1.0000    0.0000
    0.0000    0.0000    1.0000
    0.0000    0.0000    1.0000
    1.0000    0.0000    0.0000
    0.0000    1.0000    0.0000
    1.0000    0.0000    0.0000
rho =
         0         0         0    1.0000    0.0000    0.0000
logLH =
 -680.2880 -647.1606 -636.3205 -619.0222 -581.4084 -495.4729 -346.5039 -200.9830 -116.3330  -71.1324  -33.2078   -9.9379   -5.6963   -5.6002   -5.6002
[The following was added on 2015-06-05]
As a final example, I try estimating the parameters under the same settings as Section 8.6 of the textbook. First, the true parameters are set as in (8.116) and (8.144).
>> A = [ 0.1 0.7 0.2
         0.2 0.1 0.7
         0.7 0.2 0.1 ];
>> B = [ 0.9 0.1
         0.6 0.4
         0.1 0.9 ];
>> rho = [1 0 0];
Using the GenerateSample function implemented in the previous post, I generate an output symbol sequence with n = 10000 observations. The state sequence is generated at the same time, but it is not used in this experiment.
>> [s, x] = GenerateSample(A, B, rho, 10000);
The initial values for estimation are set as in (8.145) and (8.146) of the textbook. I named them A1, B1, and rho1.
>> A1 = [ 0.15 0.60 0.25
          0.25 0.15 0.60
          0.60 0.25 0.15 ];
>> B1 = ones(3, 2) .* 0.5;
>> rho1 = [1 0 0];
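The program written above generates the initial values for estimation randomly, so I make a small modification to allow them to be passed in as arguments. The modified code is shown below. Since the textbook states that estimation "converged in about 150 iterations", I also changed the termination conditions.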
% function [A, B, rho, logLH] = BaumWelch(x, nstate)
function [A, B, rho, logLH] = BaumWelch(x, A, B, rho)
  maxiter = 1000;  % changed from 100 to 1000
  epsilon = 1e-4;  % changed from 1e-3 to 1e-4
  % [A, B, rho] = initialize(nstate, max(x));
  a = Forward(A, B, rho, x);
  b = Backward(A, B, rho, x);
  ...
The results of the parameter estimation are shown below. For A, the estimate is close to the true values, but for B the estimate looks as if the rows have been swapped. I could not work out why this happens.
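One possible explanation, offered here as an assumption rather than something established in this experiment: the likelihood of a hidden Markov model does not change when the states are relabeled, so estimation can only recover the parameters up to a permutation of the states. The true A used here is circulant (each row is the previous row shifted by one position), so a cyclic relabeling leaves A unchanged while reordering the rows of B. The sketch below checks this for a hand-picked permutation p (a hypothetical choice made for illustration).

```matlab
% True parameters from the experiment
A = [0.1 0.7 0.2; 0.2 0.1 0.7; 0.7 0.2 0.1];
B = [0.9 0.1; 0.6 0.4; 0.1 0.9];

p = [3 1 2];     % hypothetical relabeling: new state k is old state p(k)
Ap = A(p, p);    % transition matrix after relabeling
Bp = B(p, :);    % output matrix after relabeling

disp(isequal(Ap, A))   % 1: the circulant A is unchanged by the cyclic relabeling
disp(Bp)               % rows: [0.1 0.9], [0.9 0.1], [0.6 0.4]
```

The relabeled B has the same row order as the estimate Be reported below, which would be consistent with this reading. The initial distribution is not relabeled in the same way, since rho1 = [1 0 0] pins the first state, but that presumably matters little for a sequence of 10000 symbols.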
>> [Ae, Be, rhoe, logLH] = BaumWelch(x, A1, B1, rho1);
>> Ae
Ae =
    0.0632    0.7522    0.1846
    0.2198    0.1260    0.6542
    0.7467    0.1518    0.1015
>> Be
Be =
    0.1244    0.8756
    0.8749    0.1251
    0.5959    0.4041
>> rhoe
rhoe =
     1     0     0
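The log-likelihood at each iteration of this estimation experiment is shown in the following graph. In Figure 8.5 of the textbook the log-likelihood starts at around -3000 and converges at around -2930, whereas in this experiment the values are much lower, ranging from about -6900 to about -6750.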
Computing logP(x) under the true parameters and under the estimated parameters gives the following.
>> a = Forward(A, B, rho, x);
>> log(sum(exp(a(:,end) - max(a(:, end))))) + max(a(:,end))
ans =
  -6.7482e+03
>> ae = Forward(Ae, Be, rhoe, x);
>> log(sum(exp(ae(:,end) - max(ae(:, end))))) + max(ae(:,end))
ans =
  -6.7458e+03
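One thing worth checking, purely as a guess on my part and not confirmed against the textbook, is whether the gap to Figure 8.5 comes from the base of the logarithm. The values above use the natural logarithm; the sketch below converts them to base 10 (the variable names are chosen for this illustration).

```matlab
% logP(x) under the true and estimated parameters, as reported above (natural log)
logLH_nat = [-6748.2 -6745.8];

% Change of base: log10(p) = ln(p) / ln(10)
logLH_10 = logLH_nat / log(10);

disp(logLH_10)   % approximately -2930.7 and -2929.7
```

These converted values are close to the figure of about -2930 quoted from the textbook, so a difference in logarithm base is at least a plausible candidate. I have not verified which base the textbook uses.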
In addition, estimating the most likely state sequence under each set of parameters and computing the conditional log-probability logP(x|s) gives the following.
>> s_ml = Viterbi(A, B, rho, x);
>> sum(log(B(sub2ind(size(B), s_ml, x))))
ans =
  -3.5488e+03
>> se_ml = Viterbi(Ae, Be, rhoe, x);
>> sum(log(Be(sub2ind(size(Be), se_ml, x))))
ans =
  -3.9603e+03
% Conditional probability given the state sequence that actually generated the output symbols (for comparison)
>> sum(log(B(sub2ind(size(B), s, x))))
ans =
  -4.3743e+03