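Hello, I'm Katsuma from the Sagasou ("Let's Search") team. I joined the company in May of this year.
Plenty has been tough since I joined, but recently the "Katsuma-kai" was launched, a gathering where people who love a drink get together and drink hard starting on Monday, so my days are packed both at work and outside it!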
As a member of the Sagasou team, I usually work on all sorts of development around Cookpad's search, with the aim of raising the satisfaction of users who are looking for recipes. At the same time, to understand users' "desire to search" more deeply, I have had more and more occasions to run large-scale data analysis on that desire.
However, Cookpad's logs are enormous, so even a simple piece of data analysis cannot finish in time as an ordinary batch job, and the need for a distributed processing environment has grown. As the well-trodden route to distributed processing that is easy to try, we have recently been setting up a data analysis environment based on Hadoop.
Around then my turn to present at tech lunch came up, so I took the opportunity to talk, with demos, about questions such as "What is Hadoop in the first place?" and "What is each process doing while Map & Reduce runs?" In this post I would like to share the slides from that talk and the Q&A that followed.
These are the slides I used.
[slideshare id=1992982&doc=techlife-2009-09-11-090913193034-phpapp01]
Afterwards, the following questions and answers came up.
Couldn't the HDFS NameNode become a bottleneck?
If the NameNode fails, the entire HDFS cluster becomes unusable, so from a SPOF (single point of failure) point of view it is unavoidably a bottleneck in the architecture. However, as long as the metadata the NameNode manages is protected, the cluster can be recovered even after a NameNode failure, so the following approaches are advocated.
- Put the metadata on RAID so that it is written to multiple disks.
- Alternatively, write the metadata to an NFS-mounted area.
- The SecondaryNameNode periodically takes a backup of the NameNode's metadata, so run the NameNode and the SecondaryNameNode on physically separate nodes.
Monitoring with Nagios or Ganglia is also possible, so failures can be detected as well.
Is the Hadoop MapReduce job code distributed automatically if it is placed only on the master?
The job code can be distributed from the master to the slaves when Map & Reduce is executed. Of course, you can also distribute it in advance with rsync or a similar tool.
How stable is HDFS?
Stability comes from increasing the number of replicas. Allowing for the case where a backup fails, a replication factor of 3 or more is recommended. It is also possible to run a file system check against HDFS.
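As a rough illustration of why more replicas help, the sketch below computes the chance that every replica of a block is unavailable at the same time. It assumes each replica fails independently with the same probability, which a real cluster does not guarantee. The function name `all_replicas_lost` and the 5% figure are hypothetical and were not part of the talk.

```python
def all_replicas_lost(p_replica_down: float, replicas: int) -> float:
    """Probability that every replica is down, assuming independent failures."""
    if not 0.0 <= p_replica_down <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("need at least one replica")
    return p_replica_down ** replicas


if __name__ == "__main__":
    # 0.05 is an illustrative value only, not a measured failure rate.
    for replicas in (1, 2, 3):
        print(replicas, all_replicas_lost(0.05, replicas))
```

Under this toy model each extra replica multiplies the risk by the per-replica failure probability, which is the intuition behind keeping three or more copies.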
Wouldn't resources be used more efficiently if Hadoop MapReduce and HDFS were run separately?
That is one way to think about it, but the Hadoop developers' policy seems to be to recommend using the standard configuration in most cases. The standard configuration is based on large-scale deployments such as those at Yahoo and Facebook, so following it is probably what ends up running most stably.
Can priorities be set when Tasks are assigned?
When multiple Jobs are submitted, priorities can be assigned to them.
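In the sample code for measuring UU (unique users), how can Reduce hold an object like a hash map when there are supposed to be multiple Reduce processes?
To hold an object such as a hash map that does per-key processing, each key must always be handled by the same node on the Reducer side. If the Map output can be partitioned so that records with the same key always belong to the same group, Reduce can be split across processes.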
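The requirement that a given key always reaches the same reducer can be sketched as a hash partitioner. This is a minimal Python illustration, not Hadoop's own code: `partition_for`, `route`, and the user IDs are hypothetical names, and `zlib.crc32` is used as a stand-in hash because it gives the same value in every process.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_reducers: int) -> int:
    """Return the reducer index for a key (hypothetical helper).

    crc32 is deterministic across processes, unlike Python's built-in
    hash() for strings, so every mapper agrees on the destination.
    """
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def route(map_output, num_reducers):
    """Group (key, value) pairs by the reducer that will receive them."""
    buckets = defaultdict(list)
    for key, value in map_output:
        buckets[partition_for(key, num_reducers)].append((key, value))
    return dict(buckets)


if __name__ == "__main__":
    pairs = [("user_a", 1), ("user_b", 1), ("user_a", 1)]
    for reducer, items in sorted(route(pairs, 3).items()):
        print(reducer, items)
```

Both `user_a` records land in the same bucket whichever mapper emitted them, so the reducer that owns that bucket can safely keep per-key state in memory. Hadoop's default partitioner follows the same idea, taking the key's hash modulo the number of reduce tasks.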
The phase in which the Map output is sorted and handed to Reduce is called "Shuffle", and the phase in which the input to Reduce is grouped together by key is called the "Sort" phase. In Hadoop, Shuffle and Sort are completely hidden, so developers never have to write code for them.
In other words, when the Map output is divided by key, as it is in this example, Shuffle and Sort partition and group the Map output so that Reduce can process it, and everything works out.
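To make the flow concrete, here is a small single-process simulation of Map, Shuffle/Sort, and Reduce for a unique-user count. It is a sketch under stated assumptions and not the sample code from the talk: the log format (`date user_id`), the function names, and the sample records are all made up for illustration, and real Hadoop runs these phases across many nodes.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(log_lines):
    """Emit (date, user_id) for each log line of the form 'date user_id'."""
    for line in log_lines:
        date, user_id = line.split()
        yield date, user_id


def shuffle_and_sort(pairs):
    """Sort by key and group the values, standing in for the framework."""
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, [value for _, value in group]


def reduce_phase(key, values):
    """Count unique users for one key; the set is local to this call."""
    return key, len(set(values))


def count_unique_users(log_lines):
    """Run the three phases in order and collect the per-date counts."""
    grouped = shuffle_and_sort(map_phase(log_lines))
    return dict(reduce_phase(key, values) for key, values in grouped)


if __name__ == "__main__":
    logs = [
        "2009-09-10 alice",
        "2009-09-10 bob",
        "2009-09-10 alice",
        "2009-09-11 bob",
    ]
    print(count_unique_users(logs))  # {'2009-09-10': 2, '2009-09-11': 1}
```

Because every value for a date arrives in a single `reduce_phase` call, the set inside it sees all of that date's users, which is why a hash-map-like object works even with several reducers running in parallel.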
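Can HDFS be used for something like an image server?
Responses from HDFS are not particularly fast and transfer speed becomes the bottleneck, so it is not suited to an image server. I think HDFS is best suited to storing things like log data, where the data does not move back and forth very much. If you are considering distributed storage for an image server, another product would probably be a better choice.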
Is Hadoop tied to HDFS?
Using Hadoop does not mean you can only use HDFS. There are other options such as Amazon S3 and CloudStore. In practice HDFS is the most popular and is what is normally used, so that is what we tried this time.
What about Hive?
We have tried it, and it seems quite good. Its close affinity with relational databases and its support for things like joins are appealing, so we are also considering adopting it.
How will Hadoop be used at Cookpad?
In the near term, we plan to use it for log analysis under a variety of conditions and for updating back-end databases.
Summary
With demand for data analysis growing at Cookpad, I have summarized our work with Hadoop so far. I would like to write another post once we roll it out in earnest.
Cookpad is also looking for people who are interested in this kind of data analysis in a distributed environment!