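I went to hear some talks on Hadoop
I attended the study session "Learning Hadoop from both the technology and service sides" and listened to the Hadoop talks. *1
- Event page
Key points for putting Hadoop, a large-scale distributed processing platform, to use (Saruta-san)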
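- What is Hadoop?
  - An OSS framework for large-scale distributed processing
  - A centrally managed cluster architecture
  - Data is split into blocks and distributed across multiple servers
  - Processing is also split up and run across multiple servers
  - Shines at batch processing
    - Jobs that take hours to days finish in a short time
  - High-throughput sequential I/O
    - Large block size, 64 MB by default
    - Data locality (work is assigned to the servers that already hold the data it needs)
  - NameNode/JobTracker (master)
    - Block management and monitoring
    - DataNode health monitoring
    - Metadata management
    - Unlike the DataNodes, it is a single point of failure (SPOF)
  - DataNode/TaskTracker (slave)
    - Each block is stored on multiple DataNodes, so no data is lost if one of them breaks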
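- Common misconceptions
  - Misconception 1
    - × A fast RDBMS or distributed file server
    - ○ A batch system specialized for large volumes of data
      - Not suited to online (low-latency) processing
      - Not suited to small data or random access
  - Misconception 2
    - RDB = data is normalized
      - To avoid duplicated data, from the standpoint of managing the data
    - Hadoop = data is often deliberately left unnormalized
      - Because it prioritizes high throughput, there are areas it is not good at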
- Lots of mechanisms for improving availability... but
  - MapReduce
    - The JobTracker is a SPOF
    - In YARN, the new distributed processing framework in Hadoop 2.0, ZooKeeper is used to improve the availability of the master
  - HDFS
    - The NameNode is a SPOF
    - Hadoop 2.0 introduces NameNode HA, a mechanism for improving availability
  - A solution that leans on mature, proven technology
    - You can combine HA clustering software such as Pacemaker (Heartbeat) with disk mirroring software such as DRBD
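- Operational challenges for Hadoop clusters of several hundred nodes or more
  - Initial setup and configuration changes are a lot of work
  - The more machines you have, the higher the probability that one of them is broken at any given time
  - Keep the number of operation patterns to a minimum
    - Eliminate operator error through a unified operations design
    - Minimize exception handling when failures occur
    - Keep the worst-case response time under control
    - Initial setup, configuration changes, expansion, failure recovery
      - → Automated OS installation
        - Install on many servers at once
      - → Configuration management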
  - Trade-offs made to simplify operations
    - Don't do fine-grained root-cause isolation for failures
    - If it breaks, pull it out and swap in a new one
- Technology for low-latency access to large-scale data
  - HBase
- Latest trends
  - Hadoop 2.0 line (alpha)
    - NameNode HA
      - Active-Standby configuration using ZooKeeper
      - NameNode metadata is partitioned and held separately
        - Lowers the resources required per NameNode
    - YARN
      - Separating resource management from job lifecycle management lets it scale to around 10,000 nodes
        - (With MapReduce, about 4,000 nodes)
  - CDH4
    - A distribution based on Hadoop 2.0
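- Summary
  - Hadoop is a large-scale distributed processing platform made up of HDFS and M/R
  - Hadoop itself is weak at low-latency processing; in exchange, it is good at high-throughput processing
  - That said, low latency can be achieved with the surrounding ecosystem (HBase)
  - Infrastructure design that takes operations into account matters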
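The block splitting, replication, and data locality ideas from this talk can be sketched in a few lines of plain Python. This is only my own toy model, not how HDFS or the JobTracker is actually implemented: the replica count of 3, the round-robin placement, and the node names `dn1` to `dn4` are all assumptions for illustration. Only the 64 MB block size comes from the talk.

```python
import math

BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB default block size mentioned in the talk
REPLICATION = 3                # assumption: replica count, not stated in the talk


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the number of blocks needed to hold file_size bytes."""
    return math.ceil(file_size / block_size)


def place_replicas(num_blocks, datanodes, replication=REPLICATION):
    """Assign each block to `replication` distinct DataNodes (toy round-robin).

    Assumes replication <= len(datanodes), otherwise replicas would repeat.
    """
    placement = {}
    for block in range(num_blocks):
        placement[block] = [
            datanodes[(block + i) % len(datanodes)] for i in range(replication)
        ]
    return placement


def assign_tasks(placement, failed=frozenset()):
    """Data locality: run each block's task on a live node that already holds it."""
    assignment = {}
    for block, holders in placement.items():
        live = [node for node in holders if node not in failed]
        if not live:
            raise RuntimeError(f"block {block} lost: all replicas are down")
        assignment[block] = live[0]
    return assignment


nodes = ["dn1", "dn2", "dn3", "dn4"]            # hypothetical DataNode names
blocks = split_into_blocks(200 * 1024 * 1024)   # a 200 MB file
placement = place_replicas(blocks, nodes)
print(blocks)                                   # number of 64 MB blocks
print(assign_tasks(placement, failed={"dn1"}))  # tasks still run with dn1 down
```

Even with `dn1` marked as failed, every block still has a live replica, which is the "no data is lost if one of them breaks" property from the notes.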
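Hadoop use cases in social games (Gloops, Izawa-san and Taki-san)
- Flow of the analysis
  - Hadoop is the place where the data is accumulated
  - To accumulate data, write data to a log whenever a user takes an action
  - Shape and aggregate the accumulated data
  - Periodically load it into a BI tool for analysis
- How analysis differs between Hadoop and an RDBMS
  - With an RDBMS, processing gets hard when the data volume is large
    - With Hadoop, the work can be distributed with M/R
  - An RDBMS mixes past, present, and future data in one table
    - With Hadoop, depending on how you log, splitting data by date is easy → you can separate out the data that is (or was) active
  - An RDBMS allows updates → past data may not be kept
    - Hadoop is insert-only → past data can be kept
    - For screen-transition analysis, for example, you can't keep every URL transition in an RDBMS
  - With an RDBMS, analyzing only active users can be hard
    - Past and present data are mixed, so you end up joining and re-aggregating until it kills you
- Since huge amounts of data can be stored and processed, it's worth putting in data like this
  - Fine-grained data you wouldn't store in an RDB, such as URL transitions
  - Data that disappears when updated, such as level and owned items
  - But if log writes slow things down you're dead, so think about when to write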
- Aggregating Hadoop logs
  - Using Pig is easier than writing M/R directly
    - That said, the concepts differ from SQL, so you can't carry SQL over as-is
  - Pig
    - Schema is optional and can be defined at run time (schemaless)
    - No transactions or indexes
    - Slow (M/R runs behind the scenes, so low-latency processing is out)
    - You can write UDFs (user-defined functions), so it is highly customizable
  - Hive
    - Similar to SQL, so the learning cost is low
    - Suited to simple data extraction
- What was hard about adopting Hadoop
  - Embedding the logging is a chore
    - "This screen doesn't even have that data..."
    - If fetching the fields to log requires extra DB access, a social game dies
    - Think about whether it really needs to be logged at that point
  - Pig is hard to get into
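- Data mining in social games
  - Mining means extracting ore and the like from a mine
  - Finding the vein of gold in the mine that is your data and digging out the treasure
  - Data speaks with the customer's voice
  - Data analysis is the same as customer support
  - The answers to all sorts of problems are in the customer's voice
- Why adopt Hadoop
  - The answers are in the data
  - We want to keep as much data as possible
  - With Hadoop, analysis can be done effectively and efficiently
- Hadoop use case at gloops (using マジゲ as the example)
  - The pillars of the game
    - Wanting to build your own original, strongest deck
    - Desire to acquire cards
      - Gacha and so on
      - Trading
    - Desire to power up
      - Deck setup
      - Card enhancement
    - Testing your strength: matches and events
      - User battles
      - Monster hunts
  - Analysis using decision trees
    - Decision tree = building a predictive model in machine learning
      - Extracts the parameters most strongly related to the decision
      - Used to analyze retention versus churn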
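  - Wanting to fix the causes of early user churn before a big PR push
    - Among users who were active in a given period and at Lv 5-10, randomly sample 1,000 active and 1,000 dormant ("sleep") users
    - Compare their parameters and action history over a given period
      - Level, number of friends, number of gacha pulls, and so on
  - "Couldn't you do decision tree analysis with SQL too?"
    - You could, but with Hadoop all the data can be accumulated (rather than updated)
      - You can extract the state as it was at each point in time
      - That makes the analysis more accurate
  - If large-scale data can be analyzed easily, a blazing-fast PDCA cycle becomes possible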
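- What to keep in mind when doing data mining work
  - What
    - What is the problem that should be solved?
  - Where
    - Where is the problem occurring?
    - This is as far as the data alone can pinpoint
  - Why
    - Why is the problem occurring?
  - How
    - What should be done to solve it?
    - It is important not to fixate on the How (the means of solving) right away and fire off guesswork countermeasures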
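The churn analysis from the gloops talk (sample 1,000 active and 1,000 dormant users, then find the parameters most strongly related to the outcome) can be illustrated with the smallest possible decision tree: a single split per feature, scored by Gini impurity. This is my own rough sketch, not what gloops actually ran. The feature names (`level`, `friends`, `gacha_count`) loosely follow the examples in the notes, and the data is synthetic, generated so that only `friends` depends on the label.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, labels, feature):
    """Best `value <= threshold` split on one feature: (weighted impurity, threshold)."""
    best = (float("inf"), None)
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[0]:
            best = (score, threshold)
    return best


def rank_features(rows, labels):
    """Sort features so the one whose single split separates the labels best comes first."""
    return sorted(best_split(rows, labels, f) + (f,) for f in rows[0])


def fake_user(active, rng):
    """Hypothetical synthetic user; only `friends` is made to depend on the label."""
    return {
        "level": rng.randint(5, 10),
        "friends": rng.randint(3, 20) if active else rng.randint(0, 5),
        "gacha_count": rng.randint(0, 10),
    }


rng = random.Random(0)
labels = [1] * 1000 + [0] * 1000  # 1 = active, 0 = dormant
rows = [fake_user(label == 1, rng) for label in labels]

for score, threshold, feature in rank_features(rows, labels):
    print(f"{feature}: split at <= {threshold}, weighted Gini {score:.3f}")
```

A real decision tree would apply this split search recursively to each side. The point here is only that the feature with the lowest impurity is the "parameter most related to the decision" that the talk describes.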
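Wrap-up and thoughts
On its own, Hadoop is strictly a high-throughput distributed processing platform for log data... It is a batch system through and through and is not suited to low-latency processing. On top of that, MapReduce itself is not a flexible algorithm, so my impression is that it can't be applied to just anything. *2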
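To make the "not a flexible algorithm" point a bit more concrete: every M/R job has to be phrased as a map step that emits key/value pairs, a shuffle that groups values by key, and a reduce step that collapses each group. Below is a minimal single-process imitation in plain Python. It is not the Hadoop API, and the tab-separated log format (date, user id, action) and the sample lines are made up for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit ((date, action), 1) for one tab-separated log line."""
    date, _user_id, action = line.split("\t")
    yield (date, action), 1


def shuffle(pairs):
    """Shuffle: group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


# Hypothetical action log: date, user id, action
log_lines = [
    "2012-06-01\tu1\tgacha",
    "2012-06-01\tu2\tgacha",
    "2012-06-01\tu1\tbattle",
    "2012-06-02\tu2\tgacha",
]

mapped = [pair for line in log_lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

Counting actions per day fits this shape naturally. Anything that needs several passes or iteration has to be expressed as a chain of such jobs, which is where the model starts to feel rigid.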
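In log analysis, though, I do think "being able to keep anything and everything without worrying about capacity" is a real strength.
That said, operations design and infrastructure design really matter. You're running a clustered environment of dozens of machines, so getting that part wrong means operations will claim casualties.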
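A quick back-of-the-envelope calculation shows why node count matters so much for operations. If each node is independently broken with probability p at a given moment, the chance that at least one of n nodes is broken is 1 - (1 - p)^n. The independence assumption and the 1% figure below are mine, purely for illustration.

```python
def prob_any_failed(num_nodes, p_node_failed):
    """Probability that at least one of num_nodes independent nodes is failed."""
    return 1 - (1 - p_node_failed) ** num_nodes


# Assumed per-node failure probability of 1% (illustrative, not a measured value)
for n in (1, 10, 50, 100, 500):
    print(n, round(prob_any_failed(n, 0.01), 3))
```

At 1% per node, a 100-node cluster has roughly a 63% chance of having something broken at any given moment, so "a node is down" has to be treated as the normal state, not as an exception.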