Materials for PyConJP2017. Python Spark PySpark PyConJP 2017 Apache Spark
Apache Spark: the mechanism that achieves both throughput and low latency, and its latest developments, as told by NTT DATA's Saruta, now a Spark committer (Part 1). Apache Spark has recently been drawing rapid attention as a big data processing platform. Spark is often compared with Hadoop and is said to be a faster, more capable distributed processing platform than Hadoop. What kind of software is Spark? We asked Kousuke Saruta of NTT DATA, who became a Spark committer in June of this year. What follows summarizes Saruta's introduction to Spark. In Part 2, we interview him about how he became a committer, among other things. Hadoop takes a long time on complex processing. Before getting to what Spark is, let me start with Hadoop. Roughly speaking, Hadoop is a distributed processing framework,
Asakusa on Spark. Asakusa now runs on Spark. Asakusa on Spark (Developer Preview) - Asakusa Framework Developer Preview 0.2.2 documentation. It is already in actual production use. Nautilus Technologies deployed a high-speed processing platform for large-scale data, developed with Asakusa Framework, at Sakura Internet, enabling highly accurate per-customer cost accounting; the high-speed processing platform is built on Apache Spark™ | NAUTILUS. Since it has now been released as OSS, I will summarize what it is and where it fits. As usual, there are naturally all sorts of opinions inside Nautilus, but this time they are broadly in agreement. Performance: broadly, from the standpoint of business batch processing, in every case HadoopMapR
I have uploaded to SlideShare the results of a preliminary evaluation carried out ahead of putting Apache Spark to use. I still need to look more closely at the internal implementation and behavior, but I am getting tired of word count, so I am thinking of trying some data analysis next.
1. Copyright © 2014 NTT DATA Corporation. NTT DATA, System Platforms Sector, OSS Professional Services. Kousuke Saruta. December 17, 2014, JJUG Night Seminar. Scalable Machine Learning with Spark/MLlib. 2. Self-introduction. Affiliation and name: NTT DATA, System Platforms Sector, OSS Professional Services; Kousuke Saruta. What do I do? I work on R&D, system development, and technical support using OSS. For about six years I have been involved in R&D and system development around the large-scale distributed processing platform Hadoop. In recent years, branching out from Hadoop, I have also been working on the in-memory distributed processing platform Spark. Publications (co-authored):
2. Self-introduction. Kenichiro Hamano. As a member of the Japan Hadoop User Group, responsible for planning and running the Hadoop Conference Japan event and the Hadoop Source Code Reading study group. Supervising editor of "Hadoop徹底入門" (Shoeisha). Belongs to NTT DATA, System Platforms Sector, OSS Professional Services. PM of the demonstration project behind the "METI report" that was much discussed among Hadoop practitioners: FY Heisei 21 Industry-Academia Collaborative Software Engineering Practice Project report, Software Development for Realizing Highly Reliable Clouds (demonstration project for improving data center reliability, related to distributed control processing technologies and the like) http://www.meti.go.jp/policy/mono_info_service/joho/downloadfiles/2010software_research/clou_dist
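Out of the blue I received an email in English asking whether I would review a book; it sounded interesting, so I accepted and read it. I checked whether it was all right that I would only write the review in Japanese, and they said it was fine and gave me the e-book files. Thank you. That said, I do not know the company well; it appears to be a UK publisher (specializing in e-books?). The book can apparently be downloaded in any of pdf, epub, or mobi. Why is it not like this in Japan? So, I read it. It is a short book of 76 pages. Roughly speaking: it is in English, but written in very simple English and extremely easy to read. Anyone who routinely reads man pages in English should find it a breeze. Reading it takes you from installation through to issuing a variety of queries. It cannot be used as a reference*1, but for that you can just look at the wiki. Each topic is covered quite briefly, but always with the queries for preparing the prerequisite tables*2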
Azkaban Azkaban is a batch workflow job scheduler created at LinkedIn to run their Hadoop Jobs. Often times there is a need to run a set of jobs and processes in a particular order within a workflow. Azkaban will resolve the ordering through job dependencies and provide an easy to use web user interface to maintain and track your workflows. Here are a few features: Compatible with any version of H
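The dependency-driven ordering described in the Azkaban entry can be pictured as a topological sort. What follows is only a sketch under assumptions: the job names and the `execution_order` helper are hypothetical, it relies on Python's standard `graphlib` module (Python 3.9+), and it is not Azkaban's implementation or its job-file format.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs it depends on.
DEPENDENCIES = {
    "extract": set(),
    "clean": {"extract"},
    "aggregate": {"clean"},
    "report": {"aggregate", "clean"},
}


def execution_order(deps):
    """Return a run order in which every job comes after all of its dependencies."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    print(execution_order(DEPENDENCIES))
```

A job appears in the result only after everything it depends on; a cyclic dependency makes `static_order()` raise `graphlib.CycleError`, which is one reason a scheduler can reject such a workflow up front.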
Trust your data, trust your AI. To deliver trusted data, trusted models, and trusted AI, no other platform can manage so many different data types across so many clouds while supporting open data innovation and deployment at scale.
A new tool for processing "big data" smartly: that is the open source software Apache Hadoop. A growing number of companies are using Hadoop to capture new profits. This series explains Hadoop from the basics. To go back over the Hadoop fundamentals that people feel it is too late to ask about, it explains the architecture, walks through distributed file system operations and MapReduce processing in code, and also touches on use cases and cluster management. It also takes in the latest topics, such as information on the next major release, 0.23, planned for 2012.
There are times when you want to try Hadoop casually in a local environment such as a MacBook Air. This post summarizes how to set up an environment for using Hadoop on a single OS X machine. Reference - official site: Single Node Setup. Environment: OSX 10.8.4, Apache Hadoop 1.1.2, Java 1.6. Installation steps: Install Hadoop with Homebrew: brew install hadoop Create a key for ssh authentication: ssh-keygen -t rsa -P "" cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys In System Preferences, under Sharing, turn on Remote Login. Confirm that you can log in to localhost over ssh without a password: ssh localhost Edit the configuration files.
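I suddenly felt like writing something about log analysis environments, so here goes. There seems to be a data scientist boom, but when people say "data scientist" I think there are really two kinds: data engineers (the latest name for the role), who use tools like Hadoop/Hive to tidy up data and shape it so that it is easy to analyze, and analysts, who do the analysis using that prepared data. Being able to do both is of course ideal, but that is hard, so a division of labor is needed. I do not think analysts have much need to be able to operate Hadoop, though it is better if they can write SQL. For what it is worth, I am in charge of log analysis at work, and at present I am a data engineer rather than an analyst. We already have a mechanism that reports KPIs, or rather statistics, every day, but requests for new statistics come in fairly often. Lately our own services have been running stably, so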
This is Miyamoto, a well-trained Apple devotee. Are you using Hadoop? When you decide to give Hadoop a try, the main obstacles are the following three. Preparing several physical machines to form a Hadoop cluster is a hassle, and assembling them into a cluster is a hassle. Writing the application that runs on Hadoop in MapReduce is a hassle. Preparing data big enough to be worth processing with Hadoop is a hassle. The first is solved neatly by using Amazon Elastic MapReduce (EMR). For the second, we use an open source MapReduce application. One field I am strongly interested in is machine learning. Machine learning is the attempt to have computers analyze data and make predictions about unknown information, or to realize functions close to human intelligence. This time, the various machine learning algo
Using partitions. This time let's define a slightly more elaborate table. Postal code data is updated every month, so we want to be able to specify a version when referring to the table. In cases like this, Hive uses partitions. Below we define a table "zip" that stores postal codes, and give it a partition ver of the date type DATE. hive> CREATE TABLE zip (zip STRING, pref INT, city STRING, town STRING) > PARTITIONED BY (ver DATE) > ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' > LINES TERMINATED BY '\n'; OK Time taken: 0.128 seconds
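To make the partitioning idea in the entry above concrete, here is a small plain-Python sketch of how a table partitioned by ver is laid out: Hive keeps each partition in its own ver=<value> subdirectory of the table and derives the partition column from that path rather than from the row data. The sample rows, the relative zip/ path, and the helper names are hypothetical, and nothing here calls Hive itself.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows: (zip, pref, city, town) plus the version date they belong to.
SAMPLE_ROWS = [
    ("1000001", 13, "Chiyoda-ku", "Chiyoda", date(2013, 7, 1)),
    ("1000002", 13, "Chiyoda-ku", "Kokyogaien", date(2013, 7, 1)),
    ("1000001", 13, "Chiyoda-ku", "Chiyoda", date(2013, 8, 1)),
]


def partition_dir(table, ver):
    """Build the key=value directory name used for one partition of the table."""
    return f"{table}/ver={ver.isoformat()}"


def layout(table, rows):
    """Group rows into partition directories; ver lives in the path, not in the row."""
    parts = defaultdict(list)
    for zip_code, pref, city, town, ver in rows:
        line = ",".join([zip_code, str(pref), city, town])  # fields terminated by ','
        parts[partition_dir(table, ver)].append(line)
    return dict(parts)


def read_version(parts, table, ver):
    """Mimic a query filtered on ver: only the matching directory is read."""
    return parts.get(partition_dir(table, ver), [])


if __name__ == "__main__":
    parts = layout("zip", SAMPLE_ROWS)
    for directory in sorted(parts):
        print(directory, parts[directory])
    print(read_version(parts, "zip", date(2013, 8, 1)))
```

Because each monthly version sits in its own directory, a query that filters on ver only has to touch that one directory, which is what makes a partition a natural fit for versioned data like this.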
Building a Hadoop + Hive test environment: Hive, a Hadoop guide for RDB users (Part 1) (page 1 of 3). Hadoop Hive is a product aimed at data warehousing that allows SQL-like query operations on Hadoop. Because it can be operated in a way close to SQL, those who are used to databases may find it easier to use than HBase. This article goes through how to use Hive and reviews it.
It was quite an amazing event. For an event organized by a user group, booking two halls for a whole day (or rather, the entire facility) with a capacity of 1,400 people is something else. Attendance was free, and yet lunch boxes and drinks were provided, which makes no sense. I guess places with money really are different? With all that, thank you to the organizers for their hard work! If you want a summary of the content, do not read this entry; other people have written summaries, so go there instead. I gave a talk. Let me get this out of the way first. I was given a lightning talk slot, so I talked about how livedoor uses Hadoop for this kind of thing, and how we wanted this kind of tool for it and so built and use it. Hadoop and subsystems in livedoor #Hcj11f (presentation by tago)
IBM has announced an appliance-type server for Hadoop. - IBM PureData System for Hadoop H1001 helps enterprises simplify Hadoop (product announcement letter) - IBM PureData System for Hadoop. IBM Pure Data System for Hadoop is the newest product in the IBM PureSystems family. With this product, users can simplify their systems more smartly than before, create business value quickly, and improve the economics of IT. Designed solely for a specific workload, PureData System for Hadoop is a standards-based product that integrates expert knowledge, with Hadoop-based software from IBM InfoSphere BigInsights, servers, and