ãã©ã¤ï¼ããã«é åï¼ç°¡æ½ãã¼ã¿æ§é ï¼ã¨ STL ã³ã³ããã®æ¯è¼ï¼ææ°çï¼
[2011/11/30 æ´æ°; std::(unordered_)map でメモリ使用量を見積もる - ny23の日記ã«å¾ãï¼STL ã³ã³ããã®ã¡ã¢ãªä½¿ç¨éãè¨æ¸¬]
[2011/02/21 æ´æ°: marisa-trie 0.1.3; 発表資料(Prefix/Patricia Trie の入れ子による辞書圧縮) - やた@はてな日記 ã«ã¦ãã®è¨äºã®å®é¨çµæãå¼ç¨ããã¦ããã®ã§ï¼ä»¥å¾ååæ´æ°ããªããã¨ã¨ããï¼ãªãï¼marisa-trie 㯠検索時間が短くなりました - やた@はてな日記 ã«ããããã«ï¼marisa-0.2.0-beta3 以éã§ã¯ããã«æ¤ç´¢ãéããªã£ã¦ãã¾ãï¼]
[2011/02/18 æ´æ°: marisa-trie ã®ä»æ§å¤æ´ã«ä¼´ãï¼è¿½è¨ã®è¨è¿°ãæ´åæ§ãåããããå¤æ´; ææ°çã§ã¯æªç¢ºèª]
id:s-yata ããã marisa-trie ãå
¬éãããã®ã§ï¼ä¾ã«ãã£ã¦ãã³ããã¼ã¯ãåã£ãçµæãå
¬éãã¦ããã¾ãï¼ä»¥åï¼ux-trie ã®ãã°çµ¡ã¿ã§ã³ã¼ããèªãã ã¨ãï¼TAIL é¨ã§ãã©ã¤ã®æ®µæ°ããã£ã¨å¢ããã¨ã©ããªãã®ããªã¨æã£ã¦ããã¨ããã ã£ãã®ã§ï¼é常ã«è¯ãã¿ã¤ãã³ã°ã§ããï¼marisa-trie ã§ãã©ã¤ãæ§ç¯ããéã®ãã©ã¡ã¿ã¯ï¼ãã©ã¤ã®æ°ï¼build ã®ç¬¬ä¸å¼æ°ï¼ã®ã¨ããããå¤ãã¦ãã¾ããï¼Build ã¯æ§ç¯æéï¼Search ã¯è¾æ¸é ï¼Search* ã¯ã©ã³ãã é ã®æ¤ç´¢æéï¼Wikipedia-en 㯠20090929 ã®ãã®ï¼
åã
ã®ã©ã¤ãã©ãªã¸ã®ãªã³ã¯ãç´°ãã表è¨ã®æå³ï¼marisa-trie 以å¤ã®çµæã«é¢ããèå¯ã¯ä»¥ä¸ã®è¨äºãåç
§ï¼
å人çã«ã¯ marisa-trie ã¯å§ç¸®çã®é«ããããï¼ãã®æ¤ç´¢é度ã®éãã«ããé
åãæãã¾ãï¼
[追è¨] marisa-trie 㧠patricia trie -> prefix tree ã«ããã¨ï¼Wikipedia-en ã§ã¯ãã©ã¤ã®ãµã¤ãºã1-2å²å¢ãï¼æ¤ç´¢é度ã1-2å²è½ã¡ã¾ãï¼éµä¾¿çªå·è¾æ¸ã®æ¹ã§ã¯ï¼ãã©ã¤ã®ãµã¤ãºã¯ã»ã¼åãã§æ¤ç´¢é度㯠prefix tree ã®æ¹ãéã«å
ãã«é«éï¼ï¼æ¬¡ã«ï¼TAIL é¨ãæåååããªãã§ãã®ã¾ã¾ãã©ã¤ã§ä¿æããã¨ï¼ï¼Wikipedia-en ã§ã¯ï¼ãã©ã¤ã®æ°ãä¸ã¤ã®æã®ã¿ãµã¤ãºã大å¹
ã«å¢ãï¼tx ã¨åããããï¼ï¼æ¤ç´¢é度ãç¹ã«ã©ã³ãã æ¤ç´¢é ã§å¤§ããï¼1/2 ãããã¾ã§ï¼ä½ä¸ãã¾ãï¼ãã®è¾ºãï¼ã»ã¼äºæ³éã*1ã®æåã§ãï¼
====================================================================== Wikipedia-en è¨äºã¿ã¤ãã« | Build | Search | Search* | Size [bytes] #keys=6,890,514 | | | | 136,297,337 ====================================================================== sorted keys Darts 0.32 | 14.283s | 0.527s | 6.694s | 552,547,920 darts-clone 0.32g | 6.575s | 0.625s | 5.946s | 258,735,104 darts-clone 0.32g v:1 | 15.917s | 0.764s | 6.227s | 95,080,448 *darts-clone 0.32e5 | 14.532s | 0.924s | 5.890s | 113,360,140 *darts-clone 0.32e5 v:1 | 13.996s | 0.852s | 5.562s | 87,122,880 darts-clone 0.32e5 huge| 4.128s | 0.554s | 5.920s | 147,491,712 *DASTrie 1.0 -c | NA | NA | NA | NA DASTrie 1.0 | 4.951s | 1.012s | 6.212s | 157,593,742 *doar 0.0.13 static | 8.684s | 0.672s | 5.664s | 123,000,249 -------------------------------------------------------------------- *doar 0.0.13 dynamic | +3.964s | 0.670s | 5.757s | +123,000,249 dda | 3.222s | 0.612s | 7.191s | 517,091,328 --------------------------------------------------------------------- map k:char* | 2.905s | 2.122s | 17.352s | ~515,663,184 map k:string | 10.482s | 6.041s | 25.479s | >515,663,184 unordered_map k:char* | 6.271s | 2.013s | 3.638s | ~460,539,072 unordered_map k:string | 7.303s | 2.182s | 4.540s | >460,539,072 -------------------------------------------------------------------- marisa 0.1.3 #tries=1 | 23.118s | 4.087s | 16.491s | 49,350,092 marisa 0.1.3 #tries=2 | 25.411s | 6.921s | 21.707s | 37,743,122 marisa 0.1.3 #tries=3 | 25.786s | 7.524s | 22.945s | 35,767,947 marisa 0.1.3 #tries=10| 26.206s | 7.841s | 23.633s | 34,802,056 tx 0.18 | 18.611s | 24.969s | 62.604s | 87,520,717 ux 0.1.2 | 46.515s | 45.921s | 67.670s | 49,134,003 ux 0.1.2 isTailUX=false| 16.744s | 30.197s | 49.810s | 105,423,225 ===================================================================== randomized keys doar 0.0.13 dynamic |+24.902s | 1.952s | 5.645s | +123,000,249 dda | 10.017s | 3.647s | 8.589s | 517,091,328 -------------------------------------------------------------------- map k:char* | 18.576s | 3.637s | 21.164s | ~515,663,184 map k:string | 27.897s | 8.417s | 30.713s | >515,663,184 unordered_map k:char* | 6.360s | 3.754s | 3.646s | ~460,539,072 unordered_map k:string | 7.302s | 4.605s | 4.549s | >460,539,072 -------------------------------------------------------------------- marisa 0.1.3 #tries=1 | 32.635s | 5.381s | 16.232s | 49,350,092 marisa 0.1.3 #tries=2 | 34.408s | 8.400s | 21.977s | 37,743,122 marisa 0.1.3 #tries=3 | 34.884s | 9.232s | 22.945s | 35,767,947 marisa 0.1.3 #tries=10| 35.405s | 9.556s | 23.660s | 34,802,056 tx 0.18 | 38.718s | 26.434s | 62.358s | 87,520,717 ux 0.1.2 | 63.228s | 47.239s | 68.075s | 49,134,003 ux 0.1.2 isTailUX=false| 33.159s | 31.547s | 50.461s | 105,423,225 ====================================================================== ====================================================================== éµä¾¿çªå·è¾æ¸ | Build | Search | Search* | Size [bytes] #keys=118,821 | | | | 950,568 ====================================================================== sorted keys Darts 0.32 | 0.039s | 0.003s | 0.012s | 2,156,576 darts-clone 0.32g | 0.038s | 0.003s | 0.010s | 1,069,056 darts-clone 0.32g v:1 | 0.020s | 0.003s | 0.007s | 106,496 *darts-clone 0.32e5 | 0.171s | 0.004s | 0.013s | 1,154,676 *darts-clone 0.32e5 v:1 | 0.165s | 0.004s | 0.011s | 650,116 darts-clone 0.32e5 huge| 0.141s | 0.004s | 0.013s | 1,363,656 *DASTrie 1.0 -c | 2.226s | 0.010s | 0.018s | 1,248,609 DASTrie 1.0 | 2.354s | 0.010s | 0.019s | 1,411,783 *doar 0.0.13 static | 0.064s | 0.004s | 0.013s | 1,204,561 -------------------------------------------------------------------- *doar 0.0.13 dynamic | +0.091s | 0.004s | 0.013s | +1,204,561 dda | 0.023s | 0.004s | 0.014s | 2,129,920 -------------------------------------------------------------------- map k:char* | 0.043s | 0.028s | 0.074s | ~7,604,544 map k:string | 0.137s | 0.072s | 0.147s | >7,604,544 unordered_map k:char* | 0.023s | 0.010s | 0.027s | ~6,653,976 unordered_map k:string | 0.039s | 0.013s | 0.038s | >6,653,976 -------------------------------------------------------------------- marisa 0.1.3 #tries=1 | 0.083s | 0.033s | 0.059s | 232,729 marisa 0.1.3 #tries=2 | 0.083s | 0.034s | 0.060s | 232,111 marisa 0.1.3 #tries=3 | 0.083s | 0.034s | 0.060s | 232,270 marisa 0.1.3 #tries=10| 0.083s | 0.034s | 0.060s | 233,404 tx 0.18 | 0.096s | 0.090s | 0.129s | 222,888 ux 0.1.2 | 0.092s | 0.158s | 0.187s | 220,209 ux 0.1.2 isTailUX=false| 0.092s | 0.157s | 0.187s | 222,165 ===================================================================== randomized keys *doar 0.0.13 dynamic | +0.104s | 0.009s | 0.013s | +1,204,561 dda | 0.036s | 0.010s | 0.013s | 2,129,920 -------------------------------------------------------------------- map k:char* | 0.068s | 0.035s | 0.085s | ~7,604,544 map k:string | 0.143s | 0.092s | 0.168s | >7,604,544 unordered_map k:char* | 0.029s | 0.023s | 0.027s | ~6,653,976 unordered_map k:string | 0.049s | 0.036s | 0.039s | >6,653,976 -------------------------------------------------------------------- marisa 0.1.3 #tries=1 | 0.121s | 0.036s | 0.059s | 232,729 marisa 0.1.3 #tries=2 | 0.121s | 0.037s | 0.060s | 232,111 marisa 0.1.3 #tries=3 | 0.121s | 0.037s | 0.060s | 232,270 marisa 0.1.3 #tries=10| 0.121s | 0.037s | 0.060s | 233,404 tx 0.18 | 0.168s | 0.097s | 0.129s | 222,888 ux 0.1.2 | 0.174s | 0.165s | 0.187s | 220,209 ux 0.1.2 isTailUX=false| 0.174s | 0.165s | 0.187s | 222,165 ===================================================================== (compiled with gcc 4.6.0 20110212, -O2 -march=core2 -m64)
*1:ãã©ã¤ã®æ°ãä¸ã¤ã«ãã¦ï¼TAIL é¨ã®å¦çãï¼é度ã®éãï¼æååã®ç §åã®ã¿ã§æ¸ã¾ããå ´åã«æ¤ç´¢é度ã¯æéã¨ãªãï¼TAIL é¨ã®ãã©ã¤ã®æ°ãå¢ããã°å¢ããã»ã©ï¼TAIL ã§æååç §åããé¨åãæ¸ã£ã¦ããã®ã§ï¼TAIL ãªãã®åç´ãªãã©ã¤ã®æ¤ç´¢é度ã«è¿ã¥ãï¼