The SIGIR 2011 paper "DOM Based Content Extraction via Text Density" showed promising results despite using a simple algorithm, so I modified the authors' code and built Perl and Python bindings for it with SWIG.

Many thanks to Fei Sun, who kindly allowed me to use the code despite my clumsy English email!

cpp-ContentExtractionViaTextDensity - GitHub

So what does it do? As the title says, it extracts the main content of a web page using a metric called Text Density, computed over the DOM tree. It involves no machine learning; it is a simple algorithm that just uses Text Density, which is calculated by a fixed procedure.

Text Density is computed for each DOM node and, simply put, takes the number of text characters …
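To make the idea of a per-node Text Density concrete, here is a minimal sketch. It assumes the ratio used in the paper: the number of text characters in a node's subtree divided by the number of tags in that subtree, with the tag count treated as 1 when the node has no descendant tags. It is not the code from the repository above; `Node`, `TreeBuilder`, `measure`, and `build_tree` are hypothetical names chosen for illustration, and only Python's standard `html.parser` is used.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag (illustrative subset).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    """One element of a simplified DOM tree."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.own_chars = 0   # characters of text directly inside this node
        self.chars = 0       # characters in the whole subtree
        self.tags = 0        # descendant tags in the whole subtree
        self.density = 0.0


class TreeBuilder(HTMLParser):
    """Builds a rough element tree; not a spec-compliant HTML parser."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        # Note: script/style text is counted too; a real extractor would skip it.
        self.current.own_chars += len(data.strip())


def measure(node):
    """Fill in chars, tags, and density for every node; return (chars, tags)."""
    chars, tags = node.own_chars, 0
    for child in node.children:
        child_chars, child_tags = measure(child)
        chars += child_chars
        tags += child_tags + 1   # the child itself counts as one tag
    node.chars, node.tags = chars, tags
    node.density = chars / max(tags, 1)   # assumed definition: chars per tag
    return chars, tags


def build_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    measure(builder.root)
    return builder.root
```

Under this assumed definition, a link-heavy block such as `<div><a>Home</a><a>News</a></div>` gets a low density (8 characters over 2 tags), while a block holding one long paragraph gets a high one, which is the intuition behind using the metric to separate body text from navigation.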