Hello, I'm Toriyama (@to_lz1) from the M3 Engineering Group.
As a software engineer, I work on both the platform team for pharmaceutical companies and the electronic medical records team.
Although my title is software engineer, on the pharmaceutical platform team I have long worked on building and improving our data platform, the kind of work usually done by a "data engineer".
Today I would like to share what I think about, and have thought about, when designing it, in the form of data platform design patterns. It has been a while since many companies recognized the need for a "data platform", but I think it is still an area with little established knowledge. I hope this is useful, even a little, to others doing data engineering.
- Overview of the data platform
- "Data platform design" patterns and how to proceed
- 1. Build idempotent workflows
- 2. Design tables together with data mart users
- 3. Update denormalized tables efficiently
- 4. Anti-patterns to watch out for
- Future outlook
Overview of the data platform
First, here is an overview of the data platform of the pharmaceutical platform team.

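The orange arrows in the figure show the flow of data. Roughly speaking, the platform splits into a "collection" function that gathers data into BigQuery, and a "utilization" function that processes the collected data and publishes or provides it to the appropriate audience.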
Structure of the collection part
Even though we simply say "collecting data", there are broadly two kinds:
- RDB data
- Log data
Each needs a collection flow suited to its characteristics.
RDB data
This is data stored in the RDBs that our systems own.
In most cases you cannot put load on a database that is serving production traffic, so I think the common approach is to extract the data from a read replica with batch processing.
For my team's products, we mostly transfer data with an ECS service running Digdag + Embulk, built on AWS. Its VPC is connected to our on-premises environment via Direct Connect, so systems running on-premises can also send data over a private network (although we need to be careful that the transfer volume per unit time does not grow too large)*1.

Log data
I suspect many companies collect access logs such as page views with access analytics tools like Google Analytics or Adobe Analytics. We also use Google Analytics in some cases, but there is an in-house tracking tool maintained by the infrastructure team, which is a cross-functional team, and m3.com uses that.
The mechanism is relatively simple: requests sent from the front end are delivered to BigQuery via Cloud Dataflow.
Depending on the product, there are also cases where log files are used for analysis. Some of my team's products embed Fluentd to forward application logs to the data platform in real time. We recently migrated a product that includes this mechanism to AWS, and for the post-migration architecture we adopted AWS FireLens and placed it as a sidecar in the ECS task.

In this way, page views and some conversions can be extracted and analyzed from log data. With good use of stream processing, data can be transferred to the platform in near real time. Once such a mechanism is in place it needs almost no maintenance effort and is very convenient, so I think the earlier you build it, the bigger the benefit.
Structure of the utilization part
Once the data has been collected successfully, the mechanisms for using it need to be put in place as well.
Having raw data piled up is not enough to call it "a data platform that business departments can use for decision making". To a greater or lesser degree, you need to process the data and build what are known as data warehouses and data marts.
There is no single correct way to classify such processed tables, but I personally find the three-way classification into "data lake", "data warehouse", and "data mart" a useful reference.
(What is a "data lake"?)
The original data aggregated into one system as-is, without processing. It is called a lake because it is where data flowing from the data sources (the water sources) is stored unchanged.
(What is a "data warehouse"?)
Multiple data sets integrated, accumulated, and organized so they can be used for decision making. It is called a warehouse because it manages large volumes of data in a meaningful form.
(What is a "data mart"?)
Data processed and organized for specific users and purposes. It is called a mart (market) because it stocks finished products that are ready to use.
At M3, however, the structure is closer to "two layers". The destination where data is inserted by the streaming and batch processing described earlier is the "data lake", and the set of tables optimized for each user department, built from the data lake, is the "data mart (which doubles as the data warehouse)". Each is operated as a separate BigQuery dataset.

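A real example of data marts
There are several reasons and circumstances behind the two-layer structure, but in one sentence they come down to this: the data and domain knowledge that need to be integrated and accumulated differ from user to user.
For example, suppose we build data marts (= BigQuery datasets) with the following purposes.
Data mart for product managers (PdM):
- They want to analyze the current state of the whole service across the multiple products offered on the platform site
- They want to see it linked to each product's sales results
Data mart for the staff in charge of a specific product:
- They want to analyze detailed data specific to the client companies they are responsible for and use it for improvement
- Conversely, data on clients they are not responsible for should not be visible to them
The "ideal form" derived from these requirements differs quite a bit between the two.
For the PdM case, data from multiple products has to be integrated. It is also convenient to see sales data in the same dataset, so data from Salesforce and similar sources may come into scope. Fields that are unnecessary for analysis are better excluded deliberately.
The data mart used by staff in charge of an individual product, on the other hand, rarely needs to span products, but it does need data fields detailed enough to be used directly in communication with clients.
At the same time, there is the requirement to prevent viewing of data outside one's responsibility. This can be achieved, for example, by managing the staff-to-client assignment table in Google Spread Sheet and reading it into BigQuery as an external table.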

When requirements differ as much as in the examples above, extracting a "common part" is surprisingly hard. For that reason, my team currently does not actively maintain a layer corresponding to a so-called "DWH".
A small amount of query logic ends up duplicated across the data marts, but
- delivery lead time does not slip
- the design can be refined with feedback from the users who actually use the data and benefit from it
so for now I think these benefits outweigh the cost*2.
"Data platform design" patterns and how to proceed
Below, I go into a bit more detail and introduce the patterns and approaches we often use when designing workflows and tables.
1. Build idempotent workflows
Building a high-quality system is always desirable, but achieving 100% availability is difficult for us today. That is why "easy error recovery" is one of the most important properties a system can have.
For batch jobs that perform ETL, what makes recovery easy is the concept of idempotency. It is increasingly needed outside data platforms too, so many readers may already be familiar with it.
The referenced article introduces approaches such as saving rows that caused errors to a separate store as error rows, but processing can be brought close to idempotent even without such a mechanism. For example:
- In jobs that sync a table, do a full replace (not an append)
- Make it possible to retry dependent steps together as a unit
The former we implement with, for example, Embulk's mode: replace. An Embulk job implemented this way is idempotent: even if it is executed multiple times during recovery after a failure, the final result does not change.
The latter can be achieved by, for example, writing dependent steps in a single .dig file. Even if individual tasks depend on each other, as long as the workflow as a whole stays idempotent, you can recover from a failure easily by retrying the entire workflow.
Workflow definition - Digdag 0.10.2 documentation
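To make the difference between replacing and appending concrete, here is a minimal Python sketch. It is not Embulk itself: `load_replace` and `load_append` are hypothetical helpers that model the two load styles with in-memory lists, and the sample rows are made up for illustration.

```python
from typing import Dict, List

Row = Dict[str, object]


def load_replace(table: List[Row], extracted: List[Row]) -> List[Row]:
    """Replace-style load: the destination becomes exactly the extracted rows."""
    return list(extracted)


def load_append(table: List[Row], extracted: List[Row]) -> List[Row]:
    """Append-style load: extracted rows are added to whatever is already there."""
    return table + list(extracted)


# Hypothetical source rows extracted from an RDB.
source = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

# Simulate a failed job being re-run: the same load executes twice.
replaced = load_replace(load_replace([], source), source)
appended = load_append(load_append([], source), source)

print(len(replaced))  # 2: re-running leaves the result unchanged
print(len(appended))  # 4: re-running duplicates every row
```

With the replace-style load, an operator can simply re-run the job after a failure without first checking how far the previous attempt got.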
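Advanced: cases that need compromise or adjustment
That said, we often run into problems that cannot be solved with the policies above alone.
a. The data volume is large
If you try to run extraction and transformation such as a full replace every day against a table with an extremely large amount of data, you run into drawbacks such as:
- The ETL/ELT does not finish in the first place
- Cost efficiency is very poor
When the source table is huge, M3 takes the approach of syncing only the data created or updated on a given day into a per-date table.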

If you transfer only the RDB data updated on a given day, performance problems can be avoided with reasonably high probability.
However, with this policy the same record can appear in more than one daily table, so a downstream layer has to deduplicate and keep only the latest version of each record. That is the drawback.
Also, if data for a past date later needs to be re-synced, idempotency in the strict sense breaks down.

That said, with this approach a record that has become stale is, in principle, stored again in its newer form in one of the later tables, so in practice this is rarely a problem.
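The downstream deduplication described above can be sketched in Python as follows. This is only an illustration under assumptions: the `daily_tables` structure, the `id` key, and the `updated_at` column are hypothetical, and in practice this step would more likely be a SQL query over the per-date tables.

```python
from typing import Dict, List

# Hypothetical daily tables: each holds only the rows created or updated that day.
daily_tables: Dict[str, List[dict]] = {
    "2021-04-01": [
        {"id": 1, "status": "new", "updated_at": "2021-04-01T09:00:00"},
        {"id": 2, "status": "new", "updated_at": "2021-04-01T10:00:00"},
    ],
    "2021-04-02": [
        {"id": 1, "status": "paid", "updated_at": "2021-04-02T08:30:00"},
    ],
}


def latest_records(tables: Dict[str, List[dict]]) -> List[dict]:
    """Keep only the most recently updated row for each id."""
    latest: Dict[int, dict] = {}
    for rows in tables.values():
        for row in rows:
            current = latest.get(row["id"])
            # ISO 8601 timestamps in the same format compare correctly as strings.
            if current is None or row["updated_at"] > current["updated_at"]:
                latest[row["id"]] = row
    return sorted(latest.values(), key=lambda r: r["id"])


result = latest_records(daily_tables)
print([(r["id"], r["status"]) for r in result])  # [(1, 'paid'), (2, 'new')]
```

Because the result depends only on the contents of the daily tables and not on the order in which they were loaded, re-running the deduplication is safe.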
b. Dependencies extend beyond your own team
Sometimes you think you have built an idempotent workflow, only to find that reality is not so simple.
For example, will the data platform keep running stably if updates to the source database are delayed? If you build a data mart in cooperation with another team, what happens when that partner's batch fails?
This is painful to write about, but both are real cases we had to think through. It is unglamorous, but strictly following the policy of "do not start processing until the previous stage has finished (= insert a wait step)" is effective.
Digdag has several mechanisms for implementing wait steps (require and others), but things are not straightforward when the table you should wait for is managed by a different batch platform in the first place.
Our team solves this problem by adopting a Digdag plugin that DeNA has published as OSS.
With this plugin, a wait step can be implemented just by writing the table name, so it is easy to describe waits for parts that are not managed in our .dig files.
In addition, when a wait occurs, we need to be notified so that we notice it. To address this, at the risk of blowing our own horn, we built our own plugin that integrates Digdag with Sentry and introduced it into production as well.
This way, we avoid the pain of managing everything centrally and can operate the data platform while balancing each team's autonomy with the stability of the system as a whole.
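The "wait for the previous stage, and get notified when waiting" policy can be sketched in Python as below. This is not the API of Digdag, the DeNA plugin, or Sentry: `wait_for_upstream`, `is_ready`, and `notify` are hypothetical names, and the readiness check is a stand-in for something like "does today's upstream table exist yet?".

```python
import time
from typing import Callable


def wait_for_upstream(
    is_ready: Callable[[], bool],
    notify: Callable[[str], None],
    interval_sec: float = 60.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the upstream output is ready; notify once if we have to wait."""
    notified = False
    for _ in range(max_attempts):
        if is_ready():
            return True
        if not notified:
            notify("upstream not ready yet; waiting")
            notified = True
        sleep(interval_sec)
    notify("gave up waiting for upstream")
    return False


# Example with a fake upstream that becomes ready on the third check.
checks = iter([False, False, True])
messages = []
ok = wait_for_upstream(lambda: next(checks), messages.append, sleep=lambda _: None)

# Example where the upstream never becomes ready.
timeout_messages = []
timed_out = wait_for_upstream(
    lambda: False, timeout_messages.append, max_attempts=2, sleep=lambda _: None
)
```

Returning `False` on timeout lets the caller fail the workflow explicitly instead of processing incomplete upstream data.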
2. Design tables together with data mart users
Behind a request to "build a data mart" there is usually a requester who is struggling because the data is not organized, so I think it is best to design in close collaboration with that user.
Fact and dimension are terms that come up frequently in this field.
facts are measurable data about the event.
A fact is "measurable data about an event". It is close to the concept of a conversion in web services.
In many cases, a fact also carries foreign keys that connect it to the dimensions described below. For example, a "purchase" fact may come with a product ID, the customer ID of the purchaser, and the ID of the store where the purchase was made. An "event" also always has a time of occurrence attached, so that needs to be collected reliably too.
Dimensions are the actors or attributes
A dimension is a collection of actors or attribute information.
In the example above, the product master would yield the product category and its supplier, and the customer master would yield the customer's address, age band, and so on. All of these "fields that can serve as analysis axes" are dimensions.
The idea itself is not particularly new, but I personally consider it a very powerful concept.
That is because explaining the structure of systems or databases tends not to click with non-engineers, whereas questions like
- What monitoring metrics do you want?
- What analysis axes do you want?
are phrased in a way that data users understand.
As you may have noticed, the monitoring metrics they want map directly to "the fact tables to prioritize", and the analysis axes they want map directly to "the dimension tables to prioritize". Design and implementation may need some extra ingenuity, but I believe that sorting out these two items "together with the users" is essential work for building a data platform that actually gets used.
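As a small illustration of how the two questions map to facts and dimensions, here is a Python sketch. The `Purchase` and `Customer` classes and all sample values are hypothetical; they only mirror the purchase example in the text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:  # dimension: attributes usable as analysis axes
    customer_id: int
    age_band: str


@dataclass(frozen=True)
class Purchase:  # fact: a measurable event with foreign keys and a timestamp
    purchased_at: str
    customer_id: int
    product_id: int
    amount: int


customers = {
    1: Customer(1, "20s"),
    2: Customer(2, "30s"),
}
purchases = [
    Purchase("2021-04-01T10:00:00", 1, 100, 1200),
    Purchase("2021-04-01T11:00:00", 2, 101, 800),
    Purchase("2021-04-02T09:00:00", 1, 101, 500),
]

# Monitoring metric: total purchase amount. Analysis axis: customer age band.
amount_by_age_band = defaultdict(int)
for p in purchases:
    amount_by_age_band[customers[p.customer_id].age_band] += p.amount

print(dict(amount_by_age_band))  # {'20s': 1700, '30s': 800}
```

The metric ("total amount") comes from the fact rows, while the axis ("age band") comes from the dimension reached through the foreign key.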
3. Update denormalized tables efficiently
As a data mart matures, tables that are expensive to compute inevitably appear.
With BigQuery this rarely becomes a serious performance problem, but fully rebuilding such tables every day is, as you would expect, inefficient in terms of cost. On the other hand, if you naively "just append every day", the processing is no longer idempotent.
A nearly identical problem and a solution to it are introduced on the ZOZO Technologies blog, and we use that technique as a reference for our own data marts.
Our team adds a few further tweaks:
- Reduce the amount of data scanned
- Complete the whole process in SQL alone by using a create or replace statement
I will introduce these here.
As a premise, consider a denormalized table like the following.

The sample data assumes a service like a video site.
Users can subscribe to channels and watch the videos on those channels. In this example, on 2021-04-01,
- the "number of channel subscription actions" is 4
- the "number of video watch actions" is 3
By aggregating this table with a BI tool or similar, you can analyze daily user actions with a high degree of freedom.
The possible analysis axes are the user, the channel, and the video. The aggregated value of each fact, that is, the "measure", can be analyzed along axes such as "premium member or not" and "channel category".
When updating a table like this, our team usually takes the following four steps.
i. Take a backup of the update target
We do this with the bq command.
    bq --project_id "${BIGQUERY_PROJECT}" \
      cp -f \
      "${BIGQUERY_PROJECT}:${dataset_name}.${table_name}" \
      "${BIGQUERY_PROJECT}:TEMP.${dataset_name}_${table_name}"
Here, the TEMP dataset is a dataset for storing backups.
ii. Collect the facts to be added

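Actions such as channel subscriptions and video watches are collected as facts. The select statement is omitted here.
iii. Merge the additions with the backup
This is the heart of the batch process: the data extracted in step ii is unioned with the backup taken in step i.
The query looks like this.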
    select
      *
      , row_number() over (
          partition by event_time, user_id, channel_id, movie_id
          order by priority
        ) as rownum
    from (
      -- facts: the newly collected facts from step ii (select omitted)
      select
        '1:new' as priority
        , fact.event_time
        , fact.user_id
        , fact.channel_id
        , fact.register_count
        , fact.watch_count
        , fact.movie_id
      from facts as fact
      union all
      select
        '2:old' as priority
        , event_time
        , user_id
        , channel_id
        , register_count
        , watch_count
        , movie_id
      from `BQ_PROJECT.DATASET.backup`
    ) merged
At the end, row_number is assigned so that duplicates can be eliminated: within each key, the new row ('1:new') sorts before the old one ('2:old').

Here, only the columns needed for fact collection are selected from the backup, which avoids scanning the gray area in the figure above.
In a data mart, it is quite common for the fields used as analysis axes to keep growing over time. If you scan those fields from the denormalized table, you get drawbacks such as:
- A much larger scan is needed compared with the original master tables
- Updates to the master cannot be reflected (there may be cases where you do not want them reflected, but either way you cannot "correct" them)
So we usually avoid queries like select *, narrow the scan, and join with the master tables afterwards, as described next.
iv. Join the unioned result with the master tables
Finally, we join with the master tables, extract the latest record for each key, and replace the table*3.

The final query looks like this.
    create or replace table datamart_table as
    select
      merged.* except (rownum, priority)
      , u.address as user_address
      , u.is_premium as user_is_premium
      , c.category as channel_category
      , m.title as movie_title
    from (
      select
        *
        , row_number() over (
            partition by event_time, user_id, channel_id, movie_id
            order by priority
          ) as rownum
      from (
        select
          '1:new' as priority
          , fact.event_time
          , fact.user_id
          , fact.channel_id
          , fact.register_count
          , fact.watch_count
          , fact.movie_id
        from facts as fact
        union all
        select
          '2:old' as priority
          , event_time
          , user_id
          , channel_id
          , register_count
          , watch_count
          , movie_id
        from `BQ_PROJECT.BACKUP.datamart_foo`
      )
    ) merged
    join user_master u on merged.user_id = u.id
    join channel_master c on merged.channel_id = c.id
    left join movie_master m on merged.movie_id = m.id
    -- keep only the highest-priority row per key (new beats old)
    where rownum = 1
With this approach, we get benefits such as:
- Even as the data to process grows, the increase in cost stays within realistic bounds
- Rebuilding the table going back over past data is relatively easy
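The merge, deduplicate, then join flow of steps iii and iv can also be traced in plain Python. This is a sketch for intuition only, not how the team runs it (the actual processing is the BigQuery SQL above). `merge_facts` and `denormalize` are hypothetical helpers, and only the user master is modeled.

```python
from typing import Dict, List, Optional, Tuple

# (event_time, user_id, channel_id, movie_id)
Key = Tuple[str, int, int, Optional[int]]


def fact_key(row: dict) -> Key:
    return (row["event_time"], row["user_id"], row["channel_id"], row["movie_id"])


def merge_facts(new_facts: List[dict], backup: List[dict]) -> List[dict]:
    """Union new and backed-up facts; on a key collision the new row wins."""
    merged: Dict[Key, dict] = {fact_key(r): r for r in backup}
    merged.update({fact_key(r): r for r in new_facts})  # '1:new' beats '2:old'
    return list(merged.values())


def denormalize(facts: List[dict], users: Dict[int, dict]) -> List[dict]:
    """Attach the current master attributes after deduplication."""
    return [
        {**f, "user_is_premium": users[f["user_id"]]["is_premium"]}
        for f in facts
        if f["user_id"] in users  # mirrors the inner join on user_master
    ]


users = {1: {"is_premium": True}, 2: {"is_premium": False}}
backup = [
    {"event_time": "2021-04-01T10:00:00", "user_id": 1, "channel_id": 10,
     "movie_id": None, "register_count": 1, "watch_count": 0},
]
new_facts = [
    # Same key as the backup row (a re-collected fact) plus one new event.
    {"event_time": "2021-04-01T10:00:00", "user_id": 1, "channel_id": 10,
     "movie_id": None, "register_count": 1, "watch_count": 0},
    {"event_time": "2021-04-02T12:00:00", "user_id": 2, "channel_id": 10,
     "movie_id": 500, "register_count": 0, "watch_count": 1},
]

table = denormalize(merge_facts(new_facts, backup), users)
# Re-running with the previous result as the backup yields the same table.
rerun = denormalize(merge_facts(new_facts, merge_facts(new_facts, backup)), users)
print(len(table), table == rerun)  # 2 True
```

Because master attributes are attached only after deduplication, the facts kept in the backup stay narrow and a change in the master is picked up on the next run.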
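4. Anti-patterns to watch out for
Troublesome cases come up often when maintaining data. Here are two that I personally find both easy to fall into and avoidable with a little ingenuity.
a. Transformation layered on already-transformed tables
Data analysis and processing existed as a job long before BigQuery appeared, of course, and all sorts of engineers must have built all sorts of processed tables.
"We want to transfer those processed tables to BigQuery and turn them into a further data mart."
...I hear ideas and requests like this in quite a few places.
Personally, though, I try to avoid "secondary processing of processed tables" (a data mart of a data mart) whenever possible. There are two reasons:
- Logic and background information are easily lost
- Performance problems are likely
It is hard to carry the logic of "processed tables built in the past" forward into the future, and if you keep writing transformation queries on top of them, then when something is wrong with the data you have to start by investigating "which layer is the defect in". That is often quite a painful investigation.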

Also, existing transformation logic is almost always running in some batch, so in many cases its execution time becomes the bottleneck, making it hard to respond to requests such as delivering the data mart earlier or raising the SLO.
As far as possible, I recommend building pipelines from data close to the primary source, such as a read replica of the database.
b. "Lost facts"
Another problem I think we run into often is a state where, even though a field is important for analysis, its "time" is not recorded, or it can easily be overwritten by an update.

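- We care about the moment a member's status changed
- We want to follow the progress of a deal step by step
Analysis requirements like these are very common, but if the database is designed like
- When the status changes, everything is updated in place; there is an updated-at timestamp but no history
then you end up in the accident of "we cannot analyze it when we want to look back later". I called this an anti-pattern, but situations like "it was not considered an important metric at the start of development, and then circumstances changed" do happen, so it is hard to avoid completely.
When you face this situation, I think it is good to take measures such as:
- Transfer a daily snapshot of the table to the platform as a separate table
- Modify the source system so that it keeps history
- Use a trigger to insert into a history table
- Change to a design based on event sourcing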
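The trigger-based option can be tried out locally with Python's built-in sqlite3 module. This is a minimal sketch under assumptions: the `member` and `member_status_history` tables are hypothetical, and SQLite stands in for whatever RDB the source system actually uses (trigger syntax differs between databases).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table member (
        id integer primary key,
        status text not null,
        updated_at text not null
    );
    -- Hypothetical history table: one row per status change, never updated.
    create table member_status_history (
        member_id integer not null,
        status text not null,
        changed_at text not null
    );
    create trigger member_status_changed
    after update of status on member
    begin
        insert into member_status_history (member_id, status, changed_at)
        values (new.id, new.status, new.updated_at);
    end;
    """
)

conn.execute("insert into member values (1, 'free', '2021-04-01T00:00:00')")
conn.execute(
    "update member set status = 'premium', updated_at = '2021-04-10T00:00:00' where id = 1"
)
conn.execute(
    "update member set status = 'cancelled', updated_at = '2021-05-01T00:00:00' where id = 1"
)

current = conn.execute("select status from member where id = 1").fetchone()
history = conn.execute(
    "select status, changed_at from member_status_history order by changed_at"
).fetchall()
print(current)  # only the latest status survives in the main table
print(history)  # every status change is kept with its time
```

Note that this sketch records only updates, so the initial 'free' status is not in the history table; capturing it would need an additional insert trigger.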
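Future outlook
I have written at length, but M3's data platform is of course not perfect, and there is still plenty of room for improvement and expansion.
For example, data marts sometimes need to be modified to meet the analysis requirements that product managers face every day. Frankly, we do not have enough people to do the work of updating pipelines quickly while keeping code and query quality high.
For example, part of our log data collection still relies on legacy mechanisms. The operations for handling failures are complicated, so we would like to move to an implementation that takes advantage of stream processing, including communicating the change to users.
For example, data that spans services might be put to use in new products and initiatives. Precisely because platform operations have become reasonably stable, ideas like this are starting to become realistic.
We are Hiring!!
M3 is looking for colleagues who will work hands-on with us on these further challenges and gain new knowledge along the way.
If you are interested, we would be happy to hear from you via the link below for a casual interview or an application!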
*1: The story of moving the Digdag + Embulk servers, which were originally on-premises, to the cloud can be read here.
*2: Depending on how you read the blog article mentioned earlier, it is also possible to interpret the "Tableau dashboards" in this architecture as the data marts. On this point, we keep the logic inside Tableau workbooks as "thin" as possible through measures such as "preparing denormalized tables" and "preparing computable columns inside the tables". This lets analysts focus on the analysis itself, and because calculated field definitions inside a workbook are stored as binary data, they are very hard to version-control.
*3: It is not always best to take the "latest" values; in some cases it is better to take "the attribute information as of the day or time the fact occurred", and our team does so in some cases. Management of master data ("Reference & Master Data") is a deep topic, enough to be one of the areas of DMBOK, so it is omitted from this article.