K-means++: smarter initial value selection for K-means clustering
K-means is a non-hierarchical clustering method: it picks K random items from the input data as the initial cluster centers, then repeats a step that moves each center to the centroid of its cluster. K-means is simple and fast, but its weak point is a strong dependence on the initial values; a poor choice of initial centers makes it converge to a wrong (suboptimal) solution.
The following example comes from Chapter 16 of Introduction to Information Retrieval.
When clustering {d1, d2, ..., d6} with K=2, the global optimum is {{d1, d2, d4, d5}, {d3, d6}}, but if d2 and d5 are given as the initial cluster centers, K-means converges to the wrong solution {{d1, d2, d3}, {d4, d5, d6}}.
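To make the seed dependence concrete, here is a minimal sketch of plain K-means started from explicitly given centers. The coordinates are my own assumption (the book's exact positions are not reproduced here): two rows of three points, with d3 and d6 set apart on the right. The function names `kmeans` and `as_partition` are also just illustrative.

```python
from math import dist  # Python 3.8+

def kmeans(points, centers, max_iter=100):
    """Plain K-means iterations starting from explicitly given centers."""
    centers = [tuple(c) for c in centers]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        new_assignment = [
            min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so we have converged
        assignment = new_assignment
        # Update step: move each center to the centroid of its members.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return assignment

def as_partition(names, assignment):
    """Turn a label list into a set of frozensets so label order does not matter."""
    groups = {}
    for name, a in zip(names, assignment):
        groups.setdefault(a, set()).add(name)
    return {frozenset(g) for g in groups.values()}

# Hypothetical coordinates (an assumption, NOT taken from the book).
docs = {"d1": (0, 1), "d2": (1, 1), "d3": (3, 1),
        "d4": (0, 0), "d5": (1, 0), "d6": (3, 0)}
names, points = list(docs), list(docs.values())

# Seeds d2 and d5: the two rows become the clusters (the suboptimal solution).
bad = as_partition(names, kmeans(points, [docs["d2"], docs["d5"]]))
# Seeds d2 and d3: the left four points and the right two become the clusters.
good = as_partition(names, kmeans(points, [docs["d2"], docs["d3"]]))
print(bad)
print(good)
```

With this layout, the same data and the same K give two different converged results depending only on which two items are used as seeds.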
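I came across a method called K-means++ that mitigates this problem, so I tried it out.
Instead of simply choosing the initial values at random, K-means++ uses the following procedure.
- The first center is chosen at random.
- Each subsequent center is chosen according to a probability distribution based on each item's minimum distance to the centers chosen so far.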
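The two steps above can be sketched roughly as follows. This is only an illustration, not the code I actually used: it assumes the weighting from the original K-means++ paper, where an item is drawn with probability proportional to the squared distance to its nearest chosen center. The function name `kmeanspp_seeds`, the `rng` parameter, and the sample points are all made up for this example.

```python
import random
from math import dist  # Python 3.8+

def kmeanspp_seeds(points, k, rng=random):
    """Pick k initial centers by squared-distance-weighted sampling."""
    centers = [rng.choice(points)]  # the first center is uniformly random
    while len(centers) < k:
        # Squared distance from each point to its nearest already-chosen center.
        weights = [min(dist(p, c) for c in centers) ** 2 for p in points]
        # Draw the next center with probability proportional to that weight.
        # Already-chosen points have weight 0, so they are effectively never
        # drawn again. Assumes at least k distinct points; if every weight is
        # 0, random.choices raises ValueError.
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

# Hypothetical sample points (an assumption for illustration only).
sample_points = [(0, 1), (1, 1), (3, 1), (0, 0), (1, 0), (3, 0)]
seeds = kmeanspp_seeds(sample_points, 2, rng=random.Random(0))
print(seeds)
```

The returned seeds are then handed to ordinary K-means as its initial centers; only the initialization changes, not the iteration itself. Items far from every chosen center get a larger weight, so the seeds tend to be spread out across the data.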
Choosing the initial values this way reportedly improves clustering accuracy and also speeds up convergence. For the details of the algorithm, see the linked PDF.
When I tried K-means++ on the {d1, d2, ..., d6} example above, the global optimum was obtained about 97% of the time (compared with 66.6% for plain K-means).
I have already replaced the K-means previously used in HatenarMaps with K-means++, so I expect clustering accuracy to improve somewhat in future map updates.