Running a program from a JAR file on Spark
Last time I set up a pseudo-distributed Spark environment and ran a program from the interactive shell, so this time I will run a program from a JAR file.
Environment
- CentOS6.6
- CDH 5
- IntelliJ IDEA 14
References
- http://qiita.com/imaifactory/items/823caa33639196f5459a
- http://kubotti.hatenablog.com/entry/2015/10/02/160104
Setting up a development environment for Spark applications
IntelliJ seems to be the recommended IDE for Spark development, so I use IntelliJ here. Installing IntelliJ and its SBT plugin is not covered.
Creating an SBT project
- File->New->Project
- Select SBT
Installing sbt-assembly
When running a JAR file on Spark, it is recommended to package it as a single self-contained JAR, so I use sbt-assembly to build it.
With sbt 0.13.6 or later, adding the following single line to plugin.sbt is all that is needed. For earlier versions, see https://github.com/sbt/sbt-assembly#setup.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.0")
Declaring dependencies in build.sbt
Declare the Spark dependencies in build.sbt. One caveat: if multiple versions of the same library end up being referenced, running sbt assembly fails with an error, so a merge strategy has to be specified. I got it working in my environment by adding one to build.sbt.
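The build.sbt listing itself is not reproduced here. As a rough sketch of what such a file might look like, assuming Spark 1.5.1 on Scala 2.10.4: the project name, both version numbers, and the choice of merge rules are all assumptions for illustration, not the settings actually used in this post. Adjust the Spark version to whatever your CDH 5 cluster ships.

```scala
// build.sbt (sketch; name and versions are assumed, not taken from the post)
name := "spark-sample"

version := "1.0"

scalaVersion := "2.10.4"

// "provided" keeps Spark itself out of the assembled JAR,
// since spark-submit supplies it on the cluster at run time.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.5.1" % "provided"
)

// Resolve duplicate files when several dependencies ship the same path.
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard // drop manifests and signatures
  case _                             => MergeStrategy.first   // otherwise keep the first copy found
}
```

Discarding everything under META-INF is a blunt rule. It is usually enough for a small job like this one, but libraries that rely on files under META-INF/services may need a more selective strategy.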
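Writing the program and building the JAR file
Writing the program
This time I wrote a simple program that fetches a file from S3 and runs a WordCount on it.
It is convenient that gz files are decompressed automatically. The access key also works if it is exported as an environment variable.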
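The program listing is not shown, so here is a minimal sketch of a job along these lines. The object name sample.Sample1 and the shape of the printed counts follow the spark-submit command and the results shown later in this post. The S3 path, the environment variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, and the containsWord helper are hypothetical and only serve the illustration.

```scala
package sample

import org.apache.spark.{SparkConf, SparkContext}

object Sample1 {
  // Hypothetical helper: case-sensitive substring match on one line.
  def containsWord(line: String, word: String): Boolean = line.contains(word)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Sample1")
    val sc = new SparkContext(conf)

    // Assumed variable names; if both are exported, pass them to the s3n filesystem.
    for {
      accessKey <- sys.env.get("AWS_ACCESS_KEY_ID")
      secretKey <- sys.env.get("AWS_SECRET_ACCESS_KEY")
    } {
      sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", accessKey)
      sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", secretKey)
    }

    // Placeholder path; a .gz input is decompressed transparently by textFile.
    val lines = sc.textFile("s3n://your-bucket/path/to/input.gz").cache()

    val total = lines.count()
    val dogs = lines.filter(containsWord(_, "Dog")).count()
    val cats = lines.filter(containsWord(_, "Cat")).count()

    println(s"total lines: $total")
    println(s"Lines with Dog: $dogs, Lines with Cat: $cats")

    sc.stop()
  }
}
```

The master is deliberately not set in code, so the same JAR can be submitted with --master local or --master yarn-client without rebuilding.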
Building the JAR file
Start the SBT Console in IntelliJ and run the assembly command. If it reports success, you are done.
> assembly
. . .
[success] Total time: 14 s, completed 2015/11/10 18:56:04
Running the JAR file on Spark
Place the JAR file anywhere you like and run the following command.
sudo -u hdfs spark-submit --master "local" --class sample.Sample1 /path/to/YourProgram.jar
If the result is printed as follows, it worked.
. . .
total lines: 4646
Lines with Dog: 3446, Lines with Cat: 567
For reference, running in yarn-client mode uses a command like the following.
sudo -u hdfs spark-submit \
  --class sample.Sample1 \
  --master yarn-client \
  --driver-memory 1g \
  --executor-memory 1g \
  --executor-cores 1 \
  /path/to/YourProgram.jar
Wrap-up
That is all for this time. Next time I plan to try running in yarn-cluster mode and running Spark Streaming.