I need to finish implementing my experiments this month. Otherwise I will not make next month's submission deadline, so since this week I have been installing Hadoop on the lab servers.
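The lab has just under 20 servers, and I decided to use a little over 10 of them for this setup. At this scale it may be embarrassing to call it "large-scale" (compared with Yahoo! or Google, that is), but I suppose "medium-scale" is fair. This is probably about the number of machines that many universities and companies have available. Research that can only be done inside a big company is very valuable, but I think it is also important to do research that other people can reproduce if they feel like it (which is tough, because you then compete on ideas rather than on data or infrastructure).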
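For example, I knew from the Hadoop analysis report published by PFI that even a handful of machines is enough to benefit from a distributed environment, and that was a useful reference when I first set things up. I am glad that this kind of information is reasonably available to the public. There are also the slides that @ohkura published a while ago (see the comments; thanks for the pointer!). Reading them, I was surprised that so many natural language processing and machine learning algorithms can be implemented with MapReduce, so I would like to spread the word in the same way for the parts I can write about myself. Speaking of which, Data Intensive Text Processing with MapReduce is now available for pre-order.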
Data-Intensive Text Processing With MapReduce (Synthesis Lectures on Human Language Technologies)
- Authors: Jimmy Lin, Chris Dyer
- Publisher: Morgan and Claypool Publishers
- Release date: 2010/08/15
- Format: Paperback
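So, wondering how much faster my existing processing would get with Hadoop, I tried a comparison: in his Introduction to Text Mining with Wikipedia (Wikipediaによるテキストマイニング入門), @nokuno works on counting word frequencies in the Japanese edition of Wikipedia. In his setup he starts from text that has already been segmented into words, and he reports that the count took 858 minutes 21 seconds. Our lab's server environment, on the other hand, consists of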
- 5 servers with Quad-core Opteron (2.3GHz) x 8 CPUs = 32 CPU cores and 256GB of memory each (28 CPU cores are used; 4 CPU cores are left free for other people)
- 4 servers with Quad-core Xeon (2.66GHz, 3.0GHz) x 2 CPUs = 8 CPU cores and 32GB of memory each (all 8 CPU cores are used on these)
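which makes 9 slave nodes in total. Starting from the point where the data was already on the distributed file system, the job finished in 59 seconds. The underlying machines differ in speed, so this is not a straightforward comparison, but it is roughly 900 times faster. A factor of 900 means an experiment that would take several years finishes in a few days, which makes experimenting much easier. (For reference, counting word frequencies on the English edition of Wikipedia took 103 seconds for 5.8GB of data.) Note that the speed of a simple job like this is proportional to the number of CPU cores, so a comparison on the same machines would come out at only about 200 times faster. Well, 200 times is still plenty fast... (Thanks to @nokuno for the additional information.)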
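For readers who have not seen one, a word-frequency job has roughly the following shape in MapReduce terms. This is a minimal single-process sketch in Python, not the Hadoop job that was run on the cluster: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are made up for illustration, and it assumes the text is already segmented into space-separated words.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every whitespace-separated token."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reducer: sum the partial counts for a single word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce sequentially over an iterable of lines."""
    emitted = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(w, c) for w, c in shuffle(emitted).items())


if __name__ == "__main__":
    # Hypothetical pre-segmented input (tokens separated by spaces).
    sample = ["自然 言語 処理", "言語 処理 と 機械 学習"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

On a real cluster the mapper and reducer run on many cores at once and the framework does the shuffle, which is why a job of this shape scales with the number of CPU cores.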
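The lab servers have many CPU cores, but each core is not particularly fast. The lab also has several machines with less memory that are faster on a single thread, so a job only gets faster here to the extent that it can be parallelized. For tasks that can be parallelized, though, I think it is worth doing. (Strictly speaking, a single-machine run does not need the data on a distributed file system, so the transfer time should be counted as well, but the put of 2.4GB took only a few tens of seconds.)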
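The figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the post (858 minutes 21 seconds, 59 seconds, and the core counts of the 9 slaves); the 30-second upload time is a hypothetical stand-in for "a few tens of seconds", and the helper names are made up for illustration.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair into seconds."""
    return minutes * 60 + seconds


def speedup(baseline_seconds, parallel_seconds, overhead_seconds=0.0):
    """Ratio of the baseline time to the parallel time plus any fixed overhead."""
    return baseline_seconds / (parallel_seconds + overhead_seconds)


# Numbers taken from the post.
BASELINE_SECONDS = to_seconds(858, 21)  # single-machine run: 51501 seconds
HADOOP_SECONDS = 59                     # run on the 9-slave cluster
CORES = 5 * 28 + 4 * 8                  # cores in use across the slaves: 172

# Assumption: "a few tens of seconds" for the 2.4GB put, taken here as 30.
ASSUMED_PUT_SECONDS = 30

if __name__ == "__main__":
    raw = speedup(BASELINE_SECONDS, HADOOP_SECONDS)
    with_put = speedup(BASELINE_SECONDS, HADOOP_SECONDS, ASSUMED_PUT_SECONDS)
    print(f"cores in use:           {CORES}")
    print(f"speedup, job only:      {raw:.0f}x")
    print(f"speedup, including put: {with_put:.0f}x")
```

The job-only ratio comes out a little under 900, and the number of cores in use (172) is in line with the "about 200 times" estimate for a same-machine comparison. Adding the upload time lowers the ratio noticeably, but it stays in the hundreds under this assumption.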
In any case, until this year's summer break, when the M1 students' research starts in earnest, I plan to spend my time at the university mostly on infrastructure, and otherwise to keep my own research going at a modest pace.
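Speaking of PFI, the Department of Information Science alumni round-table 「僕らは技術の王道を駆けて行く」 ("We race down the royal road of technology") is a fun read, with lots of stories about the hardships of PFI's founding days, where the company is heading, and how useful university classes turned out to be. Personally, the part about technology ten years from now struck a chord with me.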
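- Okanohara
- Ten years ago, search engines were a technology nobody paid attention to. Many companies had given up on the research, and only a handful of companies and researchers who believed in the potential of search kept developing it steadily.
- Nishikawa
- Twenty years ago we might have been working on databases. What would it have been ten years ago? And ten years from now...?
- Okanohara
- I sometimes think that the technology that will succeed ten years from now may already exist today and look boring at first glance. Speech recognition, for example. Google has put it to practical use, but I think it will get much better from here. Then there is machine translation. Japanese is not there yet, but accuracy between other languages overseas keeps going up. Image recognition, too, has plenty of impressive techniques that have not been put to practical use. I have a feeling that these areas will advance explosively within ten years and change people's lives.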
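I agree. When I entered NAIST in 2005, machine translation was a tough enough research topic that I was told I should think twice before choosing it, but the picture has changed a great deal in the five years since. Large-scale data extracted from the web became usable, computers became faster, and parallel and distributed computing techniques became practical. It makes me think that breakthroughs happen when you bring the best of computer science to bear and tackle a problem head-on.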
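With speech recognition and image recognition in mind, I sometimes wonder what "recognition" would mean for natural language, and I think what you recognize in language is, after all, meaning. If machine translation is turning a sentence written in one language into another language with the same meaning, then machine translation is also tackling the problem of meaning. In the current statistical machine translation framework, the mainstream approach is not to estimate meaning case by case but to throw in large-scale data (parallel sentences and the like) and translate by brute force. I suspect the next ten years will be that kind of era for semantic analysis as well (though there may not be many people working on it with that mindset...).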
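Working with large-scale data involves a lot of unglamorous work like this: steadily setting up infrastructure and tools, patiently collecting data, or tagging data by hand. Students, however, tend to see only the glamorous side (quite a few say they came to natural language processing because they dreamed of building a search engine like Google). The tedious parts only need someone to make up their mind and do them once, so I would like to take care of whatever parts I can. When I was busily making Gentoo Linux packages, I did it thinking "if someone in the world does this, nobody else has to do the same thing, so I might as well be the one," so maybe I simply like this kind of work.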
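By the way, it seems PFI has started accepting applications for this year's summer internship, so if you are a student who is confident in your skills, or one who wants to do work with real-world impact in a cutting-edge R&D environment, please do apply!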