Hello, this is Nishizuka, who by some twist of fate works as an evangelist at NTT Communications. This article is the Day 22 entry of the NTT Communications Advent Calendar 2021. Trino in 5 minutes: "Trino" is a high-performance distributed SQL engine that enables fast, interactive analysis even across different data sources. With the following characteristics, it is one of the important pieces of OSS (open source software) supporting big data analysis. SQL-on-Anything: provides one-stop access compliant with standard SQL (ANSI SQL), not only to Hadoop but also to conventional RDBMSs (relational databases) and NoSQL stores. Scales up easily for big data through parallel processing. And it is fast (tens of times faster than Hive). Netflix, LinkedIn, Salesforce, Shopif
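This is Day 23 of the Distributed computing (Apache Spark, Hadoop, Kafka, …) Advent Calendar 2021. I had been thinking of writing about Hue again this year, but since I have recently had more opportunities to use SQL (Presto), Embulk, and Digdag, I will try something different. If you are interested in Hue, please see the [official blog (Japanese)](http://https://jp.gethue.com//posts/ “official blog (Japanese)”). I have been slacking on the translations... ---- What is the SQL recipe book? One of the best books in the big data field, the "SQL Recipe Book for Big Data Analysis", has not faded even four years after publication. The book covers a wide range, from how to write SQL to analysis methods, and you will gain a lot of insight just by typing out its examples. On the other hand, what it covers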
Before upgrading Trino/Presto, we verify behavioral compatibility in advance. The verification itself is automated with an in-house tool called query-simulator, which we also presented at Presto Conference Tokyo 2020, but investigating the cause after an incompatible behavior is actually found (identifying the responsible commit and judging whether it is a bug) still required tedious manual work. Trino/Presto development is very active, and a single release contains hundreds of commits. Skipping upgrades for even about a year makes the changes so large that identifying the cause from the code history becomes very difficult. So we first identify the version in which the change was introduced by comparing query results across multiple versions of Trino/Presto, and then, from among the commits in that version
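The excerpt above is cut off before it says how the version is narrowed down, so the following is only a minimal sketch of one way to do the "compare results across versions" step, not the author's tool. It assumes a binary search, versions ordered oldest to newest, and a change that persists once introduced. `first_changed_version`, `run_query`, and the `v1`..`v5` labels are hypothetical names for illustration.

```python
from typing import Callable, Optional, Sequence


def first_changed_version(
    versions: Sequence[str],
    run_query: Callable[[str], object],
) -> Optional[str]:
    """Return the earliest version whose result differs from the oldest one.

    Assumes versions are ordered oldest to newest and that the change,
    once introduced, persists in every later version.
    """
    baseline = run_query(versions[0])
    if run_query(versions[-1]) == baseline:
        return None  # no difference between the oldest and newest version

    # Invariant: versions[lo] matches the baseline, versions[hi] differs.
    lo, hi = 0, len(versions) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if run_query(versions[mid]) == baseline:
            lo = mid
        else:
            hi = mid
    return versions[hi]


# Hypothetical stand-in for running one query against a given version.
fake_results = {"v1": [1, 2], "v2": [1, 2], "v3": [1, 2], "v4": [2, 1], "v5": [2, 1]}
changed = first_changed_version(list(fake_results), fake_results.__getitem__)
print(changed)
```

With a search like this, the number of versions that must actually be started and queried grows logarithmically with the length of the version list, which matters when each run means deploying a cluster.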
Naoki Takezoe from Treasure Data discussed testing their distributed query engine Presto as a service. They developed a tool called presto-query-simulator to test using production data and queries in a safe manner. The tool reduces testing time by grouping similar queries and narrowing data scans. It also helps analyze results and find problematic queries. Future work includes running tests more f
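I also started using Presto because I found HiveQL too slow. Here I collect tips on the pitfalls I hit when rewriting queries I had been using with MySQL and Hive. Since I think more people are using Presto through AWS Athena, I will also keep adding examples written with Presto's standard functions. What is Presto? Presto is a distributed SQL engine that runs in memory, and its evolution has been remarkable. When it was first announced I hesitated to use it because of various rough edges, but since around 2015 there has been no reason not to use it. It is a very fast SQL engine that can be used ad hoc, so unlike batch-oriented Hive there is almost no time spent waiting for results. With Hive, each execution takes time, which was hard on beginners who were not yet used to writing queries. With Presto, however, queries can be run interactively, so trial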
I have been working in the big data arena for more than ten years. If you ask me what is the most popular use case in this area I have seen so far, my answer is definitely SQL for big data. Everyone likes SQL. There are so many SQL for big data solutions, including Apache Hive, SparkSQL, Impala and Presto, just to name a few. Among these solutions, Presto is becoming my favorite, not only for its
Recently, the number of query engines that handle large data has grown considerably. Not long ago, the common pattern was to process data with a full scan, but the processing cost of that approach becomes a problem. To reduce the cost of reading data, more and more query engines for big data now have pushdown features that, like a database, read only the required columns or skip unnecessary rows and pages. Today I would like to look at pushdown in Hive, Presto, and Drill. Types of pushdown: With formats that convert data as text or rows, all fields end up being read. With columnar formats such as ORC and Parquet, data is stored per column, so specific columns and per-column statistics (Max, Min) and the like are kept
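The excerpt above breaks off before describing how the Max/Min statistics are used, so here is a generic sketch of statistics-based skipping for a `value > threshold` filter. It is not the actual behavior of Hive, Presto, or Drill. `RowGroup` and `scan_greater_than` are hypothetical names, and the data is made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RowGroup:
    """A chunk of one column plus its precomputed min/max statistics."""
    min_value: int
    max_value: int
    rows: List[int]


def scan_greater_than(groups: List[RowGroup], threshold: int) -> Tuple[List[int], int]:
    """Return the values > threshold and the number of groups actually read."""
    matched: List[int] = []
    groups_read = 0
    for group in groups:
        if group.max_value <= threshold:
            continue  # the statistics prove no row in this group can match
        groups_read += 1
        matched.extend(v for v in group.rows if v > threshold)
    return matched, groups_read


# Made-up column data split into three groups.
groups = [
    RowGroup(1, 10, [1, 5, 10]),
    RowGroup(11, 20, [11, 15, 20]),
    RowGroup(21, 30, [21, 25, 30]),
]
values, groups_read = scan_greater_than(groups, 18)
print(values, groups_read)
```

In this sketch the first group is skipped without its rows being touched, which is the saving the pushdown is after. The benefit depends on how well the data is clustered on the filtered column.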
Getting Started with Databricks and Spark: A Practical Introduction to [Big Data ETL Processing / Data Visualization] / Databricks and Spark with ETL and Visualization
Treasure Data provides a data management service in the cloud. Hadoop Conference Japan 2014: At Hadoop Conference Japan 2014, which we announced earlier, Furuhashi, a Software Architect at our company, gave a talk. The topic was Presto, the new distributed processing platform released by Facebook. It was in fact developed by Facebook so that results could be returned interactively against their extremely large datasets. Less than two years have passed since development began, but it has already grown into an active project in which many hackers, starting with Treasure Data, take part as committers. Like Hive and Impala, Presto is a "SQL Query Engine", and especially for large-scale data exceeding several hundred GB, interactive responses (a fraction of a second or less, and at the slowest