The sample code for this article is also available on GitHub. Please refer to it alongside the text.
Introduction
HBase offers strong consistency and supports atomic increment operations, as explained in the previous installments. This time we will use those properties to build a large-scale access counter.
Among the systems that use HBase for large-scale access analytics, Facebook's Page Insights is a well-known example.
The simple access analytics service introduced here is far more modest than Facebook's Page Insights, but it involves a variety of HBase techniques, which we will introduce along the way.
Target readers
- Those who want to try HBase but are not sure how to use it
- Those who want to try a database other than an RDB such as MySQL
Requirements definition
Let's get straight to defining the requirements. We will build a simple access analytics service with the following features.
- Accesses can be counted per URL in real time
- Access counts can be retrieved hourly, daily, and in total
- A time range can be specified when retrieving access counts
- All accesses for URLs that share a specified domain can be retrieved
Logical design
The ER diagram of the simple access analytics service is shown below. The URL, which serves as the primary key, is split into a "domain" and a "path".

Physical design
Next we move on to the physical design. It proceeds in two steps: first "query design", then "schema design", which decides how the data is mapped onto HBase.
Query design
Let's start with the queries. Working from the requirements, implementing the following methods should be sufficient.
    // Count an access
    void count(String domain, String path, int amount) throws IOException;

    // Get the hourly access counts
    List<Access> getHourlyCount(String domain, String path, Calendar startHour, Calendar endHour) throws IOException;

    // Get the daily access counts
    List<Access> getDailyCount(String domain, String path, Calendar startDay, Calendar endDay) throws IOException;

    // Get the total access count
    long getTotalCount(String domain, String path) throws IOException;
getHourlyCount, getDailyCount, and getTotalCount retrieve the hourly, daily, and total access counts respectively. As a query pattern, however, the design assumes that these are never fetched at the same time.
The path argument of getHourlyCount, getDailyCount, and getTotalCount may be null, in which case the access counts of everything that shares the specified domain are returned.
The startXXX and endXXX arguments of getHourlyCount and getDailyCount express the time range of the accesses to retrieve. For getHourlyCount they are specified down to the hour (year, month, day, hour), and for getDailyCount down to the day (year, month, day).
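As a small illustration of the two granularities, the sketch below truncates a Calendar to the hour or to the day before it would be passed as startXXX or endXXX. The class and method names (CalendarTruncationSketch, truncateToHour, truncateToDay) are hypothetical and do not come from the sample code; whether the real implementation normalizes its arguments this way is an assumption.

```java
import java.util.Calendar;
import java.util.TimeZone;

// Hypothetical helper, not part of the article's sample code.
final class CalendarTruncationSketch {

    /** Returns a copy with minutes, seconds, and milliseconds cleared (hour granularity). */
    static Calendar truncateToHour(Calendar source) {
        Calendar copy = (Calendar) source.clone(); // leave the caller's Calendar untouched
        copy.set(Calendar.MINUTE, 0);
        copy.set(Calendar.SECOND, 0);
        copy.set(Calendar.MILLISECOND, 0);
        return copy;
    }

    /** Returns a copy with the time of day cleared as well (day granularity). */
    static Calendar truncateToDay(Calendar source) {
        Calendar copy = truncateToHour(source);
        copy.set(Calendar.HOUR_OF_DAY, 0);
        return copy;
    }

    public static void main(String[] args) {
        // UTC is an assumption made only to keep the example deterministic.
        Calendar now = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        System.out.println(truncateToHour(now).getTime());
        System.out.println(truncateToDay(now).getTime());
    }
}
```

Working on a clone keeps the caller's Calendar unchanged, which matters because Calendar is mutable.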
The Access class is shown below. As with the arguments, its "time" field holds a value down to the hour for results from getHourlyCount, and down to the day for results from getDailyCount.

    public class Access {
        // Time
        private Date date;
        // Domain
        private String domain;
        // Path
        private String path;
        // Access count
        private long count;

        // ... setters and getters omitted
    }
Schema design
Now let's think about how the data is actually stored in HBase. We need to be able to retrieve the access counts of everything that shares a common domain in one go.
This is where a technique called the reverse domain comes in. As the name suggests, it reverses the domain.
For example, the domain "blog.ameba.jp" is reversed to "jp.ameba.blog". With this in place, a Scan that uses jp.ameba as the prefix retrieves the access counts for domains that share it, such as "jp.ameba.blog" and "jp.ameba.pigg".
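The reverse domain idea can be tried out without a cluster. The following is a minimal sketch that reverses a domain and imitates a prefix Scan with a sorted in-memory map; the TreeMap is only a stand-in for HBase's sorted row keys, and the names ReverseDomainSketch, reverseDomain, and prefixScan are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration; a TreeMap stands in for HBase's sorted row keys.
final class ReverseDomainSketch {

    /** Reverses the dot-separated labels: "blog.ameba.jp" becomes "jp.ameba.blog". */
    static String reverseDomain(String domain) {
        List<String> labels = Arrays.asList(domain.split("\\."));
        Collections.reverse(labels);
        return String.join(".", labels);
    }

    /** Returns the entries whose key starts with the prefix, similar in spirit to a prefix Scan. */
    static SortedMap<String, Long> prefixScan(TreeMap<String, Long> rows, String prefix) {
        // Character.MAX_VALUE sorts after ordinary characters, so it serves as an upper bound here.
        return rows.subMap(prefix, true, prefix + Character.MAX_VALUE, false);
    }

    public static void main(String[] args) {
        TreeMap<String, Long> rows = new TreeMap<>();
        rows.put(reverseDomain("blog.ameba.jp"), 3L);
        rows.put(reverseDomain("pigg.ameba.jp"), 5L);
        rows.put(reverseDomain("www.example.jp"), 7L);
        // Only the two ameba.jp entries fall inside the prefix range.
        System.out.println(prefixScan(rows, "jp.ameba").keySet());
    }
}
```

Because the keys are sorted, all domains under the same parent sit next to each other, which is what makes the range lookup cheap. Note that a bare prefix such as jp.ameba would also match an unrelated key like jp.amebax, so appending the trailing dot may be worth considering in practice.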
Here, the RowKey is the reverse domain concatenated with the path. From the query design, the hourly, daily, and total access counts are never retrieved at the same time, so we store each of them in a separate ColumnFamily.
As explained previously, each ColumnFamily is a separate storage unit, so separating data with completely different query patterns makes it possible to isolate disk I/O.
The ColumnFamilies for hourly, daily, and total are named "h", "d", and "t" respectively.
As also explained last time, the ColumnFamily and Column names are stored in every entry, so shorter names are more efficient.
For the Column, we store the year, month, day, and hour in the hourly case, and the year, month, and day in the daily case. The total does not use the Column, so we simply leave it empty.
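The Column values described above could be produced along the following lines. This is only a sketch: the class and method names are hypothetical, and the use of an explicit time zone and Locale.ROOT is an assumption made to keep the output stable across environments.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper for building Column names; not taken from the sample code.
final class QualifierSketch {

    /** Hourly Column: year, month, day, and hour. */
    static String hourlyQualifier(Date time, TimeZone zone) {
        return format("yyyyMMddHH", time, zone);
    }

    /** Daily Column: year, month, and day. */
    static String dailyQualifier(Date time, TimeZone zone) {
        return format("yyyyMMdd", time, zone);
    }

    /** The total does not use the Column, so it is left empty. */
    static String totalQualifier() {
        return "";
    }

    private static String format(String pattern, Date time, TimeZone zone) {
        // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
        SimpleDateFormat formatter = new SimpleDateFormat(pattern, Locale.ROOT);
        formatter.setTimeZone(zone);
        return formatter.format(time);
    }

    public static void main(String[] args) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        Date now = new Date();
        System.out.println(hourlyQualifier(now, utc));
        System.out.println(dailyQualifier(now, utc));
    }
}
```

Fixed-width, zero-padded values like these sort chronologically as strings, which suits range lookups over a period.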
As before, the Timestamp is the time at which the data was added or updated. Putting all of this together gives the following schema.
RowKey | ColumnFamily | Column | Timestamp | Value |
---|---|---|---|---|
(reverse domain)-path | "h" | yyyyMMddHH | timestamp | counter |
(reverse domain)-path | "d" | yyyyMMdd | timestamp | counter |
(reverse domain)-path | "t" | "" | timestamp | counter |
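To make the schema concrete, here is an in-memory sketch of what a single count call would write under it: one increment in each of the "h", "d", and "t" ColumnFamilies of the same row. It does not use the HBase client at all. The map of LongAdder cells merely stands in for HBase's atomic increments, and the class name, the extra now parameter, and the UTC time zone are assumptions made for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Date;
import java.util.List;
import java.util.Locale;
import java.util.TimeZone;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical in-memory model of the schema; not the article's HBase implementation.
final class InMemoryAccessCounterSketch {
    // Key: [row key, column family, column]; value: the counter cell.
    private final ConcurrentMap<List<String>, LongAdder> cells = new ConcurrentHashMap<>();

    /** Adds amount to the hourly, daily, and total cells of one row. */
    void count(String domain, String path, int amount, Date now) {
        String rowKey = rowKey(domain, path);
        increment(rowKey, "h", format("yyyyMMddHH", now), amount);
        increment(rowKey, "d", format("yyyyMMdd", now), amount);
        increment(rowKey, "t", "", amount);
    }

    /** Reads one cell; a cell that was never incremented counts as 0. */
    long get(String rowKey, String family, String column) {
        LongAdder cell = cells.get(Arrays.asList(rowKey, family, column));
        return cell == null ? 0L : cell.sum();
    }

    /** Builds "(reverse domain)-path", following the RowKey column of the table. */
    static String rowKey(String domain, String path) {
        String[] labels = domain.split("\\.");
        StringBuilder reversed = new StringBuilder();
        for (int i = labels.length - 1; i >= 0; i--) {
            reversed.append(labels[i]);
            if (i > 0) {
                reversed.append('.');
            }
        }
        return reversed + "-" + path;
    }

    private void increment(String rowKey, String family, String column, long amount) {
        cells.computeIfAbsent(Arrays.asList(rowKey, family, column), key -> new LongAdder()).add(amount);
    }

    private static String format(String pattern, Date time) {
        SimpleDateFormat formatter = new SimpleDateFormat(pattern, Locale.ROOT);
        formatter.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: counters are bucketed in UTC
        return formatter.format(time);
    }

    public static void main(String[] args) {
        InMemoryAccessCounterSketch counter = new InMemoryAccessCounterSketch();
        counter.count("blog.ameba.jp", "/index.html", 1, new Date());
        counter.count("blog.ameba.jp", "/index.html", 1, new Date());
        System.out.println(counter.get(rowKey("blog.ameba.jp", "/index.html"), "t", ""));
    }
}
```

Passing the time in as a parameter keeps the sketch deterministic; the count method in the query design takes no such argument and would presumably use the current time instead.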