- Presentation slides: Don't Get Left Behind! Building a Real-Time Analytics Platform by Combining Kafka and Spark
Awareness is growing, but real-world uses of Spark are still few
"This session is strictly about how to build a real-time platform using Spark and Kafka, and about what advantages it offers over a conventional Hadoop platform. The content is therefore aimed at people who work on architecture rather than on programming."
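With that opening declaration, Mr. Tanaka's session began. Tanaka currently works at IBM, building analytics platforms with Hadoop and Spark. In his previous job at a web company he was responsible for large-scale architecture design, implementation, server-side programming, and front-end programming, which makes him what is commonly called a full-stack engineer.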
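Until now, Hadoop has been the platform of choice for processing big data. Awareness of Hadoop is very high: at the venue, a considerable number of attendees raised their hands to indicate that they already use a Hadoop platform at work. Spark, in turn, has been attracting attention for roughly the past two years.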
"Spark initially drew attention mainly from innovators and early adopters, and it is now spreading to the early majority and the late majority. Companies that use Spark in production are still few, but case studies have been appearing since the end of last year," Tanaka said.
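Behind the growing awareness of Hadoop and Spark is the spread of big data. "Software, of course, but also IT consulting, finance, hardware, education, healthcare, telecom carriers, advertising, e-commerce, and so on: big data is not confined to any particular field and is being rolled out across a wide range of industries," Tanaka said. The volume of data naturally varies by industry. For example, a web system hosting content that draws tens of millions to hundreds of millions of page views produces several GB to tens of GB of plain server logs per hour, which works out to nearly 100 GB of logs per day. User logs, application logs, and other logs can be collected on top of that.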
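The hourly-to-daily conversion mentioned above can be checked with a short calculation. This is only an illustrative sketch: the 4 GB/hour figure is an assumed value picked from the low end of the "several GB" range, not a number given in the session, and the function name is hypothetical.

```python
HOURS_PER_DAY = 24


def daily_log_volume_gb(hourly_gb: float) -> float:
    """Convert an hourly log volume (GB) into a daily volume (GB)."""
    return hourly_gb * HOURS_PER_DAY


# Assumed hourly rate (hypothetical, low end of "several GB per hour").
assumed_hourly_gb = 4.0
print(daily_log_volume_gb(assumed_hourly_gb))  # 96.0, i.e. close to 100 GB per day
```

At the upper end of the quoted range (tens of GB per hour), the same conversion gives a daily volume several times larger, so the "nearly 100 GB" figure corresponds to the lower end.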
Big data is not only about log data. It means collecting website data, sensor data, customer data, office data, operations data, social data, and more, then combining these data sets for analysis or advanced machine learning in order to create new value. The foundational technology adopted for analyzing data on the scale of tens to hundreds of TB is "the Hadoop ecosystem, with Hadoop at its center," Tanaka said, before moving on to the platform architecture.
What is wrong with the orthodox Hadoop analytics platform?
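The orthodox Hadoop analytics platform consists of Hadoop (HDFS + MapReduce), Hive for running SQL on Hadoop, and Mahout for machine learning. "With this setup, the data collected per unit of time can be analyzed with Hive or Mahout on Hadoop several times a day, and the results can be pushed to storage or BI tools. This is the kind of platform we used to propose," Tanaka said. The reasoning was twofold: large data sets that cannot be stored in a single RDB can be stored by using HDFS, and analysis and machine learning over the stored data can be run in a realistic amount of time.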
However, "this orthodox platform has problems," Tanaka pointed out. The first is the timing at which data is received and stored in Hadoop. With this platform, data is in most cases imported once every few hours. "This delay is a barrier to building a real-time analytics platform," Tanaka said.
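The effect of a periodic import on data freshness can be sketched with simple arithmetic. The following is a minimal illustration, assuming events arrive evenly over time and ignoring the time the import and the subsequent analysis take; the function names and the 3-hour interval are hypothetical and were not given in the session.

```python
def worst_case_delay_hours(import_interval_hours: float) -> float:
    """An event arriving right after an import waits one full interval."""
    return import_interval_hours


def average_delay_hours(import_interval_hours: float) -> float:
    """With evenly spread arrivals, the mean wait is half the interval."""
    return import_interval_hours / 2


# Hypothetical import schedule: one batch import every 3 hours.
interval = 3.0
print(worst_case_delay_hours(interval))  # 3.0
print(average_delay_hours(interval))     # 1.5
```

Under these assumptions, an import every few hours means that data is already hours old on average before any processing starts, which is the gap a streaming ingestion path is meant to close.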
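That is not all. According to Tanaka, there are also problems in the part that processes the data and in the step that writes the results out. "On the processing side, handling files on HDFS with MapReduce involves a lot of disk access, so latency is high and it is poorly suited to interactive or real-time processing. On the output side, if the results are written to an RDB, longer windows during which writes are blocked can hold back other systems. That adds more things to take into account and hurts productivity," Tanaka said.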