fluent-plugin-uniqcount has been released.
Overview
fluent-plugin-uniqcount is a fluentd plugin that, over a specified time range, counts unique event occurrences for each attribute named in the configuration and periodically outputs the top N results. In SQL terms, it corresponds to:
SELECT key1, COUNT(key2) AS key2_count, COUNT(DISTINCT(key2)) AS key2_uniq_count FROM records WHERE time BETWEEN T1 AND T2 GROUP BY key1 ORDER BY key2_count DESC LIMIT 0, N;
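As a rough illustration of what that query computes, here is a minimal batch sketch in Python. It is not the plugin's code. The function name `top_unique_counts` and the sample records are hypothetical, and the field names are borrowed from the configuration example further down. The sketch ranks by the distinct count, which is what the `key2_uniq_count` field in the sample output suggests, whereas the SQL above orders by `key2_count`; treat that choice as an assumption.

```python
from collections import defaultdict

def top_unique_counts(records, t1, t2, n,
                      time_field="at", key1="uri", key2="remote_ip"):
    """Rank key1 values by the number of distinct key2 values seen in [t1, t2]."""
    distinct = defaultdict(set)  # key1 value -> set of key2 values
    for rec in records:
        if t1 <= rec[time_field] <= t2:  # BETWEEN is inclusive on both ends
            distinct[rec[key1]].add(rec[key2])
    # Sort by unique count, descending; ties are broken by key1 for a stable order.
    ranked = sorted(distinct.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [{"key1": k, "rank": i, "key2_uniq_count": len(v)}
            for i, (k, v) in enumerate(ranked[:n])]

# Hypothetical sample data, not taken from a real log.
records = [
    {"at": 100, "uri": "/a", "remote_ip": 1},
    {"at": 101, "uri": "/a", "remote_ip": 1},  # same visitor reloading
    {"at": 102, "uri": "/a", "remote_ip": 2},
    {"at": 103, "uri": "/b", "remote_ip": 3},
    {"at": 999, "uri": "/b", "remote_ip": 4},  # outside the time range
]
result = top_unique_counts(records, 100, 200, 2)
```

The reload at `at=101` does not raise the count for `/a`, because each `remote_ip` is only counted once per `uri`.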
With fluent-plugin-uniqcount you can compute aggregates such as:
- The top 100 URLs with the most unique accesses in the last 60 seconds
- The top 20 users who generated the most unique accesses in the last hour
Features
Let N be the traffic volume per unit time and T the length of the aggregation window (setting: list*_span). fluent-plugin-uniqcount then has the following characteristics.
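- The cost of feeding one event into the plugin is O(log NT), which amounts to O(N log NT) per unit time.
- The plugin's output cost is O(log NT).
- Memory consumption is O(NT).
- Counts are unique, so the ranking does not jump because of F5 reloads and the like.
- Aggregation covers the most recent period (from T seconds ago up to now); the aggregation window slides as time passes.
- One aggregation unit is called a "list". Any number of lists, each with its own independent settings, can be held (up to 10).
- Aggregation can be based on a timestamp inside the record rather than on fluentd's event reception time (the reception time is used when no field is configured).
- To absorb the delay before events from the app servers reach fluentd, the end of the aggregation window can be offset from the current time into the past (list*_offset: choose the value with the flush_interval of the fluentd instances involved in forwarding in mind).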
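The sliding window, the unique counting, and the offset can be sketched together as follows. This is a simplified illustration and not the plugin's implementation: the class name `SlidingUniqueCounter` is hypothetical, events are assumed to arrive in timestamp order, and the ranking is produced by a full sort on each call, so the sketch does not reproduce the O(log NT) costs listed above.

```python
from collections import deque, defaultdict

class SlidingUniqueCounter:
    """Unique counts per key1 over the window [now - offset - span, now - offset]."""

    def __init__(self, span, offset=0):
        self.span = span
        self.offset = offset
        self.pending = deque()             # events newer than the window end
        self.active = deque()              # events currently inside the window
        self.pair_hits = defaultdict(int)  # (key1, key2) -> events in the window
        self.uniq = defaultdict(int)       # key1 -> number of distinct key2 values

    def add(self, ts, key1, key2):
        # Assumption: events arrive in non-decreasing timestamp order.
        self.pending.append((ts, key1, key2))

    def _slide(self, now):
        end = now - self.offset
        start = end - self.span
        # Admit events that have reached the (offset) end of the window.
        while self.pending and self.pending[0][0] <= end:
            ts, k1, k2 = self.pending.popleft()
            self.active.append((ts, k1, k2))
            self.pair_hits[(k1, k2)] += 1
            if self.pair_hits[(k1, k2)] == 1:
                self.uniq[k1] += 1
        # Drop events that have fallen out of the start of the window.
        while self.active and self.active[0][0] < start:
            ts, k1, k2 = self.active.popleft()
            self.pair_hits[(k1, k2)] -= 1
            if self.pair_hits[(k1, k2)] == 0:
                del self.pair_hits[(k1, k2)]
                self.uniq[k1] -= 1
                if self.uniq[k1] == 0:
                    del self.uniq[k1]

    def top(self, now, n):
        self._slide(now)
        ranked = sorted(self.uniq.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:n]

# Hypothetical events: (timestamp, uri, remote_ip).
counter = SlidingUniqueCounter(span=60, offset=3)
for ts, uri, ip in [(100, "/a", 1), (101, "/a", 1), (102, "/a", 2),
                    (130, "/b", 3), (165, "/b", 4)]:
    counter.add(ts, uri, ip)

first = counter.top(now=133, n=5)   # window is [70, 130]
second = counter.top(now=168, n=5)  # window is [105, 165]
```

In the first call the event at 165 is still held back by the offset. By the second call the three `/a` events have slid out of the window, so only `/b` remains.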
Configuration
Parameters
- list*_label : label string attached to output records
- list*_time : name of the event time field (UNIX timestamp format; when omitted, fluentd's event reception time is used)
- list*_key1 : name of the field used for grouping
- list*_key2 : name of the field to be counted
- list*_span : length of the aggregation window (seconds)
- list*_offset : offset of the end of the aggregation window (seconds)
- list*_out_tag : tag attached to output records
- list*_out_num : number of entries to output (top N)
- list*_out_interval : output interval (seconds)
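To make the `list*_` naming concrete, the sketch below groups flat parameters into one settings dictionary per list. It is an illustration only, not how the plugin reads its configuration: the function `group_list_params` is hypothetical, and the assumption that lists are numbered 1 through 10 is inferred from the "up to 10" limit and the `list1_`/`list2_` names in the example that follows.

```python
import re

MAX_LISTS = 10  # assumption: lists are numbered 1..10
PARAM_RE = re.compile(
    r"^list(\d+)_(label|time|key1|key2|span|offset|out_tag|out_num|out_interval)$"
)
INT_PARAMS = {"span", "offset", "out_num", "out_interval"}  # values given in seconds or counts

def group_list_params(conf):
    """Turn {'list1_span': '60', ...} into {1: {'span': 60, ...}, ...}."""
    lists = {}
    for name, value in conf.items():
        m = PARAM_RE.match(name)
        if not m:
            continue  # not a list parameter (for example 'type')
        index, param = int(m.group(1)), m.group(2)
        if not 1 <= index <= MAX_LISTS:
            raise ValueError("list index out of range: %s" % name)
        lists.setdefault(index, {})[param] = int(value) if param in INT_PARAMS else value
    return lists

conf = {
    "type": "uniq_count",
    "list1_label": "min", "list1_time": "at",
    "list1_key1": "uri", "list1_key2": "remote_ip",
    "list1_span": "60", "list1_offset": "3",
    "list1_out_tag": "trends.min", "list1_out_num": "5", "list1_out_interval": "1",
    "list2_label": "day", "list2_span": "86400", "list2_out_tag": "trends.day",
}
lists = group_list_params(conf)
```

Parameters that are left out, such as `list2_offset` here, are simply absent from the result; the sketch does not guess at the plugin's defaults.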
Configuration example
Suppose the input events have the following format:
site.access_log: {"at":1367820029,"uri":"http://www.careerlink.vn/","remote_ip":168364289} ...
A configuration like the following then yields a URL ranking for each span:
<match site.access_log>
  type uniq_count

  list1_label min
  list1_time at
  list1_key1 uri
  list1_key2 remote_ip
  list1_span 60
  list1_offset 3
  list1_out_tag trends.min
  list1_out_num 5
  list1_out_interval 1

  list2_label day
  list2_time at
  list2_key1 uri
  list2_key2 remote_ip
  list2_span 86400
  list2_out_tag trends.day
  list2_out_num 5
  list2_out_interval 10
</match>
The output looks like this:
trends.min: {"label":"min","ranks":[{"key1":"http://www.careerlink.vn/","rank":0,"key2_uniq_count":13},{"key1":"http://www.careerlink.vn/file/71ca7183087cce9f04fc559ce37738e9","rank":1,"key2_uniq_count":12}, { ... }, ...],"at":1367840734}
trends.day: {"label":"day","ranks":[{"key1":"http://www.careerlink.vn/","rank":0,"key2_uniq_count":8034},{"key1":"http://www.careerlink.vn/file/065d88676460f47d99ec59263c650f54","rank":1,"key2_uniq_count":7735},{"key1":"http://www.careerlink.vn/file/71ca7183087cce9f04fc559ce37738e9","rank":2,"key2_uniq_count":7266}, { ... }, ... ],"at":1367840734}
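A downstream consumer might read such a record roughly as follows. This is a sketch with assumptions: the helper `format_ranks` is hypothetical, and the record is the trends.min example trimmed to the two entries shown in full, since the remaining entries are elided in the sample. Note that `rank` is zero-based in the output.

```python
import json

# trends.min record from the sample, trimmed to its two fully shown entries.
line = (
    '{"label":"min","ranks":['
    '{"key1":"http://www.careerlink.vn/","rank":0,"key2_uniq_count":13},'
    '{"key1":"http://www.careerlink.vn/file/71ca7183087cce9f04fc559ce37738e9",'
    '"rank":1,"key2_uniq_count":12}],"at":1367840734}'
)
record = json.loads(line)

def format_ranks(record):
    """Return one human-readable line per entry, in rank order (1-based for display)."""
    rows = sorted(record["ranks"], key=lambda r: r["rank"])
    return ["#{} {} ({} unique)".format(r["rank"] + 1, r["key1"], r["key2_uniq_count"])
            for r in rows]

lines = format_ranks(record)
```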
TODO
- Package as a gem
- Reduce the number of objects created, to lower memory use
Notes
Depending on the incoming traffic and the settings, I expect the plugin to consume considerable resources (both CPU and memory). When using it for real, I recommend running it in a separate fluentd process: forward events from the main fluentd process to it, then send the results back to the main process.
For reference, in my environment the fluentd process runs stably even while consuming more than 2 GB of memory (ruby 1.9.3p194).