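I noticed that the lift value used in association rule analysis (frequent pattern mining?) and PMI, which is used to measure the strength of co-occurrence, are almost the same thing, so here is a quick note (there is nothing more to it than that).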
Association rule analysis
Association rule learning - Wikipedia
相関ルール - 機械学習の「朱鷺の杜Wiki」
Association rule analysis is a technique used for market basket analysis of purchase data, recommendation, and similar tasks.
Each single record (one purchase, for example) is called a transaction, and the goal is to show that transactions containing item \(X\) also tend to contain \(Y\).
The basic approach is to compute the measures below. Pairs with suitably large values can be presumed to be related, which makes the measures usable for recommendation and the like.
Let \(P(X)\) be the probability that a single transaction contains item \(X\).
support is the probability that two items \(X, Y\) are both contained in the same transaction.
$$\mathrm{support}(X\Rightarrow Y) = P(X, Y)$$
confidence is the probability that a transaction containing \(X\) also contains \(Y\).
$$\mathrm{confidence}(X\Rightarrow Y) = P(Y|X)$$
lift expresses how much larger the probability that items \(X, Y\) appear together in one transaction is than it would be if the occurrences of \(X\) and \(Y\) were independent.
$$\mathrm{lift}(X\Rightarrow Y) = \frac{P(X, Y)}{P(X) P(Y)}$$
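As a rough sketch of the three measures above, the probabilities can be estimated as relative frequencies over a list of transactions. The toy transactions and the helper names (`prob`, `support`, `confidence`, `lift`) are made up for illustration and are not from any particular library.

```python
from fractions import Fraction

# Hypothetical toy data: each transaction is the set of items bought together.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"beer"},
]

def prob(items, transactions):
    """Estimate P(items): the fraction of transactions containing every item."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= t)
    return Fraction(hits, len(transactions))

def support(x, y, transactions):
    """support(X => Y) = P(X, Y)"""
    return prob({x, y}, transactions)

def confidence(x, y, transactions):
    """confidence(X => Y) = P(Y | X) = P(X, Y) / P(X)"""
    return prob({x, y}, transactions) / prob({x}, transactions)

def lift(x, y, transactions):
    """lift(X => Y) = P(X, Y) / (P(X) P(Y))"""
    return prob({x, y}, transactions) / (
        prob({x}, transactions) * prob({y}, transactions)
    )

print(support("bread", "milk", transactions))     # 2/5
print(confidence("bread", "milk", transactions))  # 2/3
print(lift("bread", "milk", transactions))        # 10/9
```

Here bread and milk each appear in 3 of 5 transactions and together in 2, so the lift is slightly above 1. This sketch does not guard against items that never occur, in which case the divisions would raise `ZeroDivisionError`.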
PMI(Pointwise Mutual Information)
Pointwise mutual information - Wikipedia
自然言語処理における自己相互情報量 (Pointwise Mutual Information, PMI) | キャンベルとヨセミテ
PMI measures the degree to which two events occur together (a co-occurrence measure), and expresses whether they co-occur more or less often than chance would predict.
In natural language processing, for example, it is used to find pairs of related words.
The formula looks like the following: the probability that the two events \(x, y\) both occur, divided by the product of the probabilities of \(x\) and \(y\) each occurring, with a \(\log\) applied.
$$\mathrm{pmi}(x, y) = \log\frac{p(x,y)}{p(x)p(y)} = \log\frac{p(x|y)}{p(x)} = \log\frac{p(y|x)}{p(y)}$$
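A minimal sketch of the formula above, assuming the probabilities have already been estimated somewhere else. The function name `pmi` and the use of the natural logarithm are choices made here for illustration, since the text does not fix the base of the log.

```python
import math

def pmi(p_xy, p_x, p_y):
    """pmi(x, y) = log( p(x, y) / (p(x) p(y)) ), using the natural log."""
    return math.log(p_xy / (p_x * p_y))

# Independent events: p(x, y) equals p(x) p(y), so the PMI is zero.
independent = pmi(0.25, 0.5, 0.5)

# Events that co-occur more often than chance give a positive PMI,
# and events that co-occur less often than chance give a negative one.
more_than_chance = pmi(0.5, 0.5, 0.5)
less_than_chance = pmi(0.125, 0.5, 0.5)

# The conditional form log( p(x|y) / p(x) ) gives the same value.
p_x_given_y = 0.5 / 0.5
same_value = math.log(p_x_given_y / 0.5)
```

With these numbers `independent` is 0, `more_than_chance` is log 2, and `less_than_chance` is its negative, which matches the reading of PMI as "more or less than chance".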
I used PMI in a couple of my older posts on sucrose.hatenablog.com.
lift and PMI
As anyone who has read this far has probably noticed, comparing the two formulas shows that they are exactly the same except for the log.
$$\mathrm{lift}(X\Rightarrow Y) = \frac{P(X, Y)}{P(X) P(Y)}$$
$$\mathrm{pmi}(x, y) = \log\frac{p(x,y)}{p(x)p(y)} = \log\frac{p(x|y)}{p(x)} = \log\frac{p(y|x)}{p(y)}$$
Since log is a monotonically increasing function, applying it does not change which score is larger. So when the scores are used to recommend items for a given item, it looks like lift and PMI give the same results.
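To make the monotonicity argument concrete, here is a small sketch that ranks candidate items for one fixed item \(X\) by lift and by PMI. The probabilities and the candidate names A, B, C are invented for illustration.

```python
import math

# Hypothetical probabilities: P(X) for the fixed item, then for each
# candidate Y the pair (P(X, Y), P(Y)).
p_x = 0.2
candidates = {
    "A": (0.10, 0.30),
    "B": (0.02, 0.40),
    "C": (0.05, 0.10),
}

def lift(p_xy, p_x, p_y):
    """lift = P(X, Y) / (P(X) P(Y))"""
    return p_xy / (p_x * p_y)

lift_scores = {
    name: lift(p_xy, p_x, p_y) for name, (p_xy, p_y) in candidates.items()
}
# PMI is just the log of lift.
pmi_scores = {name: math.log(score) for name, score in lift_scores.items()}

rank_by_lift = sorted(candidates, key=lift_scores.get, reverse=True)
rank_by_pmi = sorted(candidates, key=pmi_scores.get, reverse=True)

print(rank_by_lift)  # ['C', 'A', 'B']
print(rank_by_pmi)   # ['C', 'A', 'B']
```

Both rankings agree, and a lift above 1 corresponds to a positive PMI (A and C here) while a lift below 1 corresponds to a negative PMI (B). The two measures only differ when the actual magnitudes matter, for example when scores are summed or averaged.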