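Hello.
It has been a while since my last blog post. Today I want to write about "the time I tried to write unit tests for queries, found the wall thicker than expected, and wondered what to do."
Let me say up front: this article does not offer any particular solution. When I tried to write unit tests for queries I ran into the problems below, and I have some ideas about how to handle them, but really I wrote this because I wanted to ask whether anyone has a great solution lol*1
Motivation for this article
Lately I have been developing things like a data analytics platform on BigQuery, and I ended up in a "You don't write tests? Could you say that in front of @t_wada?" situation. So I decided to face the question "what actually happens if I do this seriously?" head-on*2.
By "seriously" I mean a state where the aggregation has been checked against every pattern of data that could possibly come in.
I have written unit tests for APIs as a matter of course, but I have done very little testing for data aggregation or machine learning work. Or rather, sorry, I have been skipping it for lack of time*3. When I did write tests, they were half-hearted, at the level of "at least cover this much."
Still, I ran into a situation where I really wanted to protect the parts "whose quality must not be allowed to drop"*4, so I decided to do it.
So this time, after looking into various things, I will write about the parts where I ended up thinking "this is incredibly hard, so what do you actually do?"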
Prerequisites
Environment
I will talk mainly about BigQuery. The details will differ elsewhere, but I think the way of thinking carries over.
- Data warehouse: BigQuery
- Workflow: Airflow (Cloud Composer)
BigQuery has no emulator, so tests cannot be run locally. You have to prepare test data in BigQuery itself and run the queries there.
What a query unit test involves
Because BigQuery has no emulator, writing a unit test for a query comes down to roughly the following steps.
- Prepare the test data the unit test needs
- Temporarily load the test data into a table on BigQuery
- Run the query under test on BigQuery against the table holding the test data
- Check whether the result of that job matches the expected data
- Once processing is done, delete the tables that were created
I will not go into the details of how to implement this here, but I think the overall flow looks like the above.
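
Purely as an illustration of the flow above, here is a minimal Python sketch. The `Warehouse` interface, its method names, and `run_query_test` are all hypothetical names invented for this example; they are not BigQuery APIs. In practice each method would presumably wrap the BigQuery client library, which is not shown here.

```python
from typing import Any, Protocol

Row = dict[str, Any]


class Warehouse(Protocol):
    """Hypothetical wrapper around a warehouse client (not a real BigQuery API)."""

    def create_table(self, table: str, rows: list[Row]) -> None: ...
    def run_query(self, sql: str) -> list[Row]: ...
    def drop_table(self, table: str) -> None: ...


def normalize(rows: list[Row]) -> list[tuple]:
    """Sort rows so the comparison does not depend on result order."""
    return sorted((tuple(sorted(r.items())) for r in rows), key=repr)


def run_query_test(
    wh: Warehouse, table: str, test_rows: list[Row], sql: str, expected: list[Row]
) -> bool:
    """Load test data, run the query, compare with expected rows, clean up."""
    wh.create_table(table, test_rows)  # prepare the data and load it into a table
    try:
        actual = wh.run_query(sql)  # run the query under test
        return normalize(actual) == normalize(expected)  # compare with expected rows
    finally:
        wh.drop_table(table)  # delete the temporary table even if the query fails
```

The `try`/`finally` is there so that the temporary table is removed even when the query raises, which matters when every test run creates real tables.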
Three walls standing in the way
With the flow above in mind I set out to write tests right away, and ran into three walls.
- Preparing exhaustive data is extremely tedious
- The queries are not written to be easy to test
- The test processing has to be reasonably robust to query changes
I will go through each of them.
Wall ①: Preparing exhaustive data is extremely tedious
A query does not run on its own, so you need to prepare data. I expected this to be painful, but it was worse than I imagined.
The range of values is too wide
Obviously, BigQuery has types too. Things like the following.
- STRING
- TIMESTAMP
- INT64
- FLOAT64
At first I thought, "I can just generate dummy data from these types with something like Faker.js!"
But this is where the trouble starts. Take a query like the following.
    SELECT
      SAFE_CAST(revenue AS FLOAT64) AS revenue -- revenue is a STRING
    FROM hogehoge
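Well, would you look at that*5. Yes, this is the "just load it as a STRING for now and sort it out later" strategy. It is not entirely wrong, either, because sometimes a genuine STRING really does arrive.
However, thinking "it's a STRING" and generating dummy strings is not enough. INT and FLOAT values (from negative through zero to positive) come in wearing the face of a STRING. And since it is a STRING, you sometimes have to account for blank strings as well.
So you cannot just generate strings because the column is a string, and you also cannot assume that only numbers will ever arrive. Painful.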
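
To make that concrete, here is a small Python sketch that enumerates the kinds of values discussed above for a STRING column that is later passed through `SAFE_CAST`. The function names and the specific values are illustrative assumptions, not an exhaustive list, and the sketch deliberately says nothing about what `SAFE_CAST` returns for each of them.

```python
def string_faced_cases() -> list[str]:
    """Hand-picked values that a STRING column holding 'numbers' might contain."""
    return [
        "-1", "0", "1",          # integers: negative, zero, positive
        "-0.5", "0.0", "12.34",  # floats: negative, zero, positive
        "",                      # empty string
        " ",                     # whitespace only
        "abc",                   # a genuine, non-numeric string
    ]


def build_rows(column: str, values: list[str]) -> list[dict[str, str]]:
    """Turn the candidate values into one-column test rows."""
    return [{column: value} for value in values]


rows = build_rows("revenue", string_faced_cases())
```

Even for a single column the list grows quickly, which is exactly why preparing this by hand gets tiring.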
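There is no library for preparing the data
Sure, "then build one" is a fair response, but it is surprisingly tedious. I looked around, and the result was "the thing I want does not exist." It left me wishing I could write unit tests with a little more confidence.
There were tools that generate data according to types, but what I wanted was something like:
- STRING columns that only ever contain values from a fixed set of categories
- Integers that are zero or greater only
In other words, something that takes the BigQuery schema plus a little extra specification and generates sensible data from it.
When there are only a few columns you can manage by hand, but beyond that it gets pretty tedious.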
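
A rough sketch of the "schema plus a little extra" idea, in Python. Everything here (`ColumnSpec`, `generate_rows`, the field names) is hypothetical and written for this example only; it is not an existing library, and it only handles STRING, INT64, and FLOAT64.

```python
import random
from dataclasses import dataclass
from typing import Any, Optional, Sequence


@dataclass(frozen=True)
class ColumnSpec:
    name: str
    bq_type: str                             # "STRING", "INT64" or "FLOAT64" in this sketch
    choices: Optional[Sequence[Any]] = None  # restrict values to fixed categories
    min_value: float = -1_000_000            # lower bound for numeric types
    max_value: float = 1_000_000             # upper bound for numeric types


def generate_value(spec: ColumnSpec, rng: random.Random) -> Any:
    """Generate one value that respects the type and the extra constraints."""
    if spec.choices is not None:
        return rng.choice(list(spec.choices))
    if spec.bq_type == "INT64":
        return rng.randint(int(spec.min_value), int(spec.max_value))
    if spec.bq_type == "FLOAT64":
        return rng.uniform(spec.min_value, spec.max_value)
    if spec.bq_type == "STRING":
        return "".join(rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(8))
    raise ValueError(f"unsupported type in this sketch: {spec.bq_type}")


def generate_rows(specs: Sequence[ColumnSpec], n: int, seed: int = 0) -> list[dict[str, Any]]:
    """Generate n rows; a fixed seed keeps the test data reproducible."""
    rng = random.Random(seed)
    return [{s.name: generate_value(s, rng) for s in specs} for _ in range(n)]


specs = [
    ColumnSpec("plan", "STRING", choices=("free", "pro")),  # fixed categories only
    ColumnSpec("count", "INT64", min_value=0, max_value=100),  # zero or greater
]
sample = generate_rows(specs, 5, seed=42)
```

Seeding the generator is a deliberate choice: a failing test should be reproducible with the same data.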
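Wall ②: The queries are not written to be easy to test
I will set execution speed aside for this discussion. When the query's logic is not cleanly separated (for example with WITH clauses), or the tables are not split into datalake/dwh/datamart layers, the data you have to prepare for a test becomes needlessly complex and hard to follow.
As with the SAFE_CAST example earlier, if you do as much type conversion as possible in the very first layer that reads the data, it becomes clear not only whether the type conversion works but also what kind of data you expect to arrive. It also makes it easier to separate data tests from query tests, which I come back to under the countermeasures.
Real queries usually do not look like that, so the first step tends to be reworking them so that the processing flow is easier to follow.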
Wall ③: The test processing has to be reasonably robust to query changes
This ties into Wall ②. For the sake of testing, the tables a test touches are different from the ones the real pipeline (run from Composer and so on) actually hits.
So the table references have to be swapped programmatically. But some tables are partitioned and some are not, and some values change per environment. If the queries had been written from the start to cope with all of that, it would be fine, but often they were not, so reality is tough (sweat).
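
One way to swap table references programmatically is to keep the table name as a placeholder in the query text. The sketch below uses Python's standard `string.Template`; the query, the placeholder name `source_table`, and the table IDs are made-up examples, and partitioned tables or other per-environment values would need more than this.

```python
from string import Template

# The table reference is a placeholder so tests can point it at a temporary table.
# Note: a literal "$" in the SQL would have to be written as "$$".
QUERY_TEMPLATE = Template(
    "SELECT user_id, SUM(revenue) AS total_revenue\n"
    "FROM `${source_table}`\n"
    "GROUP BY user_id"
)


def render_query(template: Template, tables: dict[str, str]) -> str:
    """Fill in table names; substitute() raises KeyError if one is missing."""
    return template.substitute(tables)


# Hypothetical table IDs, for illustration only.
production_sql = render_query(QUERY_TEMPLATE, {"source_table": "my-project.prod.events"})
test_sql = render_query(QUERY_TEMPLATE, {"source_table": "my-project.test_tmp.events_ab12"})
```

Using `substitute()` rather than `safe_substitute()` means a forgotten table mapping fails loudly instead of sending a half-rendered query to BigQuery.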
Countermeasures
So those are the walls you hit when writing unit tests for queries. Next I thought about what kinds of countermeasures seem worthwhile.
①: Get the entry point right in the first place
This may sound like it defeats the purpose, but given the countermeasures described below, the cleaner the data in the "datalake" layer, the better.
I suspect that is the rarer case, so I will leave it there, but I really do feel that this is where it pays off.
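②: Separate the value-filtering step from the query's aggregation logic
Anticipating that abnormal values may come in is a good thing, but when that handling is mixed together with the aggregation logic, you can no longer isolate the two cleanly when testing.
So separate "the processing that filters values" from "the aggregation logic of the query" as clearly as you can. They can be separate tables, or clearly separated parts of the same query. That lets you say "tests for the aggregation logic only use data that is more or less as expected"*6.
It also makes the split from the valid-range tests described below easier to see, and the tests easier to move forward.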
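
As a sketch of what "clearly separated parts of the same query" could look like, the Python snippet below keeps the filter and the aggregation as two named SQL fragments. In production they are joined through a WITH clause; in a test, the aggregation fragment reads an already-clean test table instead. The column names, table IDs, and function names are assumptions made for this example.

```python
# Filtering step: cast revenue and drop rows where the cast fails.
FILTER_SQL = """\
SELECT user_id, SAFE_CAST(revenue AS FLOAT64) AS revenue
FROM `{raw_table}`
WHERE SAFE_CAST(revenue AS FLOAT64) IS NOT NULL"""

# Aggregation step: only ever sees cleaned data.
AGGREGATE_SQL = """\
SELECT user_id, SUM(revenue) AS total_revenue
FROM {cleaned}
GROUP BY user_id"""


def production_query(raw_table: str) -> str:
    """Filter and aggregation in one query, kept as separate named parts."""
    cte = FILTER_SQL.format(raw_table=raw_table)
    return f"WITH cleaned AS (\n{cte}\n)\n" + AGGREGATE_SQL.format(cleaned="cleaned")


def aggregation_test_query(clean_test_table: str) -> str:
    """Aggregation only, reading a test table that already holds clean data."""
    return AGGREGATE_SQL.format(cleaned=f"`{clean_test_table}`")


prod_sql = production_query("my-project.datalake.raw_events")   # hypothetical table ID
agg_test_sql = aggregation_test_query("my-project.test_tmp.clean_events")  # hypothetical table ID
```

With this split, the test data for the aggregation needs only well-typed `revenue` values, and the odd STRING inputs are exercised separately against the filter.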
③: Separate tests of valid value ranges from tests of the query's aggregation
This too may feel obvious. There are values you cannot say will never arrive (the application's spec changed, someone sent data by mistake, and so on). Granting that, separating "tests that notice such a value has arrived" from "tests that the query's aggregation is correct" makes things simpler in some cases.
For example, the tests can be split as follows.
- Valid value range tests: uniqueness constraints, Enum-type checks, and so on
- Query aggregation tests: checks that focus only on the logic
With this split, the aggregation tests no longer need to be "tests that anticipate all sorts of values arriving," which saves the effort of preparing that data.
Aggregation tests alone are still hard work, but because they can be reduced more or less to the normal cases, I think this is a practical and effective approach.
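
For illustration, the valid-range side of that split might look like the Python sketch below, run over rows fetched from a table. The helper names, column names, and allowed values are hypothetical; the same checks could just as well be expressed as SQL.

```python
from collections import Counter
from typing import Any


def find_duplicates(rows: list[dict[str, Any]], key: str) -> list[Any]:
    """Uniqueness check: return key values that appear more than once."""
    counts = Counter(row[key] for row in rows)
    return [value for value, count in counts.items() if count > 1]


def find_invalid_enum(rows: list[dict[str, Any]], column: str, allowed: set) -> list[dict[str, Any]]:
    """Enum check: return rows whose value is outside the allowed set."""
    return [row for row in rows if row[column] not in allowed]


incoming = [
    {"id": 1, "status": "paid"},
    {"id": 2, "status": "refunded"},
    {"id": 2, "status": "unknown"},  # duplicate id and unexpected status
]
duplicate_ids = find_duplicates(incoming, "id")
bad_status_rows = find_invalid_enum(incoming, "status", {"paid", "refunded"})
```

These checks exist to notice that an unexpected value has arrived; the aggregation tests can then be written against normal-case data only.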
Summary
I still do not really know what the best practices are. If anyone out there does, I would love to hear from you!
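*1: Somebody, please lend me your wisdom lol
*2: About time
*3: Honest to a fault
*4: Well, none of it should be allowed to drop, but it is a matter of priorities
*5: Before and after
*6: Of course, there is the question of "should that really be done there?", but as a practical matter I think it is true that the cost is not great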