Introduction
Hello, I'm @shase, a Data Engineer.
We use Cloud Composer (Airflow) for several use cases at our company. This post introduces one of them: a BigQuery SQL execution platform for analysts, developed by the data team and known internally as Saved Query Workflow.
The system has been running since this spring.
System overview
Here is an overview of the system.
- Analysts commit SQL and YAML to GitHub and open a PR.
- Engineers review it.
- Cloud Composer runs the SQL on a schedule and writes the results to Google Sheets and other destinations.
Background
Separately from organization-wide KPI aggregation and reporting, we built this BigQuery SQL execution platform for cases where an individual analyst or a specific team needs to produce their own tables and reports on a regular basis.
If you use BigQuery regularly, you may wonder whether Scheduled queries would be enough. We chose to build a SQL execution platform on Cloud Composer because of requirements like the following, which Scheduled queries cannot meet.
(That said, a big reason for choosing Cloud Composer itself is that it had already been adopted in the project that migrated our DWH to BigQuery.)
For example:
- Managing SQL files on GitHub and reviewing them through Pull Requests
- Automatically saving query results to any Google Sheets spreadsheet
  - This cannot be done with Scheduled queries alone.
Usage
In Airflow, DAGs are normally defined in Python files.
Because analysts are not necessarily proficient in Python, we aimed for a setup where writing and committing only a YAML file and a SQL file is enough to have the SQL run against BigQuery on a schedule.
job:
  schedule: "0 1 * * *"
  tasks:
    - name: shase_daily_foo1_bq_check
      bq_sql: shase_daily_foo1_bq_check.sql
      bq_destination_table: "foo-sandbox.shase.foo1"
      bq_mode: replace
      spreadsheet_id: "xxx"
      spreadsheet_range: "Sheet1!A1"
    - name: shase_daily_foo2_bq_check
      bq_sql: shase_daily_foo2_bq_check.sql
      bq_destination_table: "foo-sandbox.shase.foo2"
      bq_mode: replace
    - name: shase_daily_bar_bq_check
      bq_sql: shase_daily_bar_bq_check.sql
      bq_destination_table: "foo-sandbox.shase.bar"
      bq_mode: append
      spreadsheet_id: "xxx"
      spreadsheet_mode: append
The YAML file specifies the execution schedule, the SQL file to run, the BigQuery destination table, the ID of the destination Google Sheet, and so on.
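Since analysts write these files by hand, a validation step is one natural thing to run on them before they reach Airflow. The following is only a minimal sketch under stated assumptions, not the actual implementation: the field names come from the sample file, but `TaskSpec`, `parse_job`, the split between required and optional fields, the nesting of `tasks` under `job`, and the set of allowed modes are all hypothetical. The input is a plain dict standing in for the result of a YAML parser, so the sketch runs with the standard library alone.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Modes that appear in the sample YAML; other values may exist in the real system.
ALLOWED_MODES = {"replace", "append"}


@dataclass(frozen=True)
class TaskSpec:
    """One entry under `tasks`; the required/optional split is an assumption."""
    name: str
    bq_sql: str
    bq_destination_table: str
    bq_mode: str
    spreadsheet_id: Optional[str] = None
    spreadsheet_range: Optional[str] = None
    spreadsheet_mode: Optional[str] = None


def parse_job(config: dict) -> Tuple[str, List[TaskSpec]]:
    """Validate an already-parsed YAML mapping and return (schedule, tasks)."""
    job = config["job"]
    schedule = job["schedule"]
    if len(schedule.split()) != 5:
        raise ValueError(f"expected a 5-field cron expression, got {schedule!r}")

    specs: List[TaskSpec] = []
    seen = set()
    for raw in job["tasks"]:
        spec = TaskSpec(**raw)  # TypeError on missing or unknown keys
        if spec.bq_mode not in ALLOWED_MODES:
            raise ValueError(f"{spec.name}: unsupported bq_mode {spec.bq_mode!r}")
        if spec.name in seen:
            raise ValueError(f"duplicate task name: {spec.name}")
        seen.add(spec.name)
        specs.append(spec)
    return schedule, specs


# The structure a YAML parser would return for a shortened version of the sample.
example = {
    "job": {
        "schedule": "0 1 * * *",
        "tasks": [
            {
                "name": "shase_daily_foo1_bq_check",
                "bq_sql": "shase_daily_foo1_bq_check.sql",
                "bq_destination_table": "foo-sandbox.shase.foo1",
                "bq_mode": "replace",
                "spreadsheet_id": "xxx",
                "spreadsheet_range": "Sheet1!A1",
            },
            {
                "name": "shase_daily_foo2_bq_check",
                "bq_sql": "shase_daily_foo2_bq_check.sql",
                "bq_destination_table": "foo-sandbox.shase.foo2",
                "bq_mode": "replace",
            },
        ],
    }
}

schedule, tasks = parse_job(example)
print(schedule, [t.name for t in tasks])
```

Failing early on an unknown key or a misspelled mode would surface mistakes at review time rather than at the scheduled run, which seems to fit the PR-based flow described here.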
Generating Airflow DAGs from YAML
Airflow can generate DAGs dynamically. We parse the committed YAML described above and turn it into an Airflow SubDAG.
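The post does not show the generator code, so here is a framework-free sketch of the general idea only. It assumes each YAML task expands into a BigQuery step followed, when `spreadsheet_id` is set, by a Google Sheets export step. The names `expand_task`, `build_workflows`, the `__bq` / `__sheets` suffixes, and the workflow id `shase_daily` are hypothetical, and plain tuples stand in for Airflow operators so that the sketch runs without Airflow installed.

```python
from typing import Dict, List, Tuple

# (kind, task_id); a stand-in for an Airflow operator in this sketch.
Step = Tuple[str, str]


def expand_task(task: dict) -> List[Step]:
    """Expand one YAML task into ordered steps: query first, then optional export."""
    steps: List[Step] = [("bq_query", f"{task['name']}__bq")]
    if task.get("spreadsheet_id"):
        steps.append(("sheets_export", f"{task['name']}__sheets"))
    return steps


def build_workflows(configs: Dict[str, dict]) -> Dict[str, dict]:
    """Map each parsed YAML file, keyed by a workflow id, to its schedule and steps."""
    workflows = {}
    for workflow_id, config in configs.items():
        job = config["job"]
        workflows[workflow_id] = {
            "schedule": job["schedule"],
            "tasks": {t["name"]: expand_task(t) for t in job["tasks"]},
        }
    return workflows


# Parsed form of a shortened sample file; the key plays the role of a file name.
configs = {
    "shase_daily": {
        "job": {
            "schedule": "0 1 * * *",
            "tasks": [
                {"name": "shase_daily_foo1_bq_check", "bq_mode": "replace",
                 "spreadsheet_id": "xxx"},
                {"name": "shase_daily_foo2_bq_check", "bq_mode": "replace"},
            ],
        }
    }
}

workflows = build_workflows(configs)
for name, steps in workflows["shase_daily"]["tasks"].items():
    print(name, steps)
```

In a real Airflow deployment the loop body would build operator objects instead of tuples; how the actual system wires them into a SubDAG is not described in this post.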
In the Airflow UI, it looks like the following.
Google Sheets integration
For the Google Sheets integration, we wrote our own Embulk output plugin. (We searched a little, but could not find a suitable existing plugin.)
Some parts are probably tailored a bit too closely to our internal use case, but we would be happy if you gave it a try ^^
For what it's worth, it is published at the link above ^.
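Future work
We are currently in a phase where the number of internal users is gradually growing.
CI and CD still have plenty of room for improvement, so we would like to build those out.
The data team keeps working on continuous improvement of our analytics environment 💪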