
Hello everyone, this is Takapai (@takapy0210).
This is the second part of "The Data Platform Supporting Connehito," and in it I'd like to introduce our Dataform development environment.
If you develop in a different kind of environment, I'd be very happy to hear about it in the comments!
The previous article is here
Table of contents
- Introduction
- Development environment issues and solutions
- Making use of the Dataform tools extension
- Closing
- We Are Hiring
Introduction
The previous article covered the issues Connehito was facing and how we build our data platform with Dataform to resolve them.
This article focuses on the following topics.
- A development environment based on Visual Studio Code (VS Code) with Dev Containers
- Making use of the Dataform tools extension
Dataform tools covers almost everything that can be done on Google Cloud and also supports formatting with SQLFluff, so I feel the development experience has improved considerably!
Development environment issues and solutions
Dataform can be developed directly in the Google Cloud console. By integrating with GitHub, you can create a workspace (roughly the equivalent of a branch), develop in it, and open a PR.
The advantage is that anyone who knows SQL can start developing without setting up a development environment.
However, as we kept operating it, the following issues came up.
- The editor is hard to use
  - Tabs cannot be split side by side
  - The formatter's automatic formatting is not great
  - The suggestion feature is weak, and so on
- During code review, the reviewer has to access the console every time, open the target workspace, and check that the SQL compiles without problems
- We cannot benefit from Coding Agents such as Claude Code
In particular, we judged that being unable to benefit from Coding Agents would be a serious issue going forward, so we built an environment that allows local development.
A development environment based on VS Code with Dev Containers
To solve the issues above, we set up a development environment using VS Code with Dev Containers. Roughly speaking, the advantages are as follows.
- Every team member can develop in the same environment
- Tools are preinstalled in the Docker image, so setup is easy
- VS Code extensions can be used for compile checks, data lineage, and more
Dockerfile structure
The Dockerfile used for the Dev Container looks roughly like this.
FROM mcr.microsoft.com/devcontainers/javascript-node:20-bookworm
RUN apt-get update && apt-get install -y --no-install-recommends \
...
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Install the Google Cloud CLI
RUN curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg \
| gpg --dearmor -o /usr/share/keyrings/cloud.google.gpg \
&& echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" \
> /etc/apt/sources.list.d/google-cloud-sdk.list \
&& apt-get update \
&& apt-get install -y --no-install-recommends \
google-cloud-cli=548.0.0-0 \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Install the Dataform CLI
RUN npm install -g @dataform/cli@[VERSION]
# Install SQLFluff
RUN python3 -m pip install sqlfluff==4.0.4
devcontainer.json settings
The devcontainer.json is shown below.
It installs Dataform tools, a VS Code extension, so that it is available in the container.
{
  "name": "Dataform Dev Environment",
  "build": {
    "dockerfile": "Dockerfile"
  },
  "mounts": [
    ...
  ],
  // VS Code settings
  "customizations": {
    "vscode": {
      ...
      "extensions": [
        "ashishalex.dataform-lsp-vscode"
        ...
      ]
    }
  },
  ...
}
After the Dev Container starts, authenticate with Google Cloud using the steps below and the development environment is ready.
# Authenticate with Google Cloud
gcloud auth application-default login
gcloud auth login
# Set the project
gcloud config set project [PROJECT NAME]
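Before starting work, it can help to confirm that the CLIs the Dockerfile installs are actually on the PATH inside the container. The following is a minimal sketch in Python and is not part of the original setup: the helper name `missing_tools` and the list of executable names are assumptions for illustration.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executable names assumed from what the Dockerfile installs.
EXPECTED_TOOLS = ("gcloud", "dataform", "sqlfluff")


def missing_tools(
    tools: Iterable[str] = EXPECTED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH.

    `which` is injectable so the check can be exercised without the tools installed.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All expected tools are available.")
```

Running this once after the container is built gives a quick signal that the image matches what the team expects, without touching any Google Cloud resources.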
Making use of the Dataform tools extension
Dataform tools is a Dataform extension for VS Code. It provides the following features.
- Displaying the compiled query and dry run statistics
- Displaying the dependency graph (Data Lineage)
- Previewing query results
- Running jobs via the CLI or the Dataform API
- Formatting .sqlx files with SQLFluff
See the following page for details. Among these, being able to quickly check compilation and dry run statistics, and being able to format with SQLFluff, have been especially helpful.
Displaying the compiled query and dry run statistics
The screenshots are heavily blurred, but in VS Code the information is displayed as shown below.
In addition to the compiled query, it estimates how much it would cost to run the sqlx in Incremental / Non incremental mode.


It is also possible to check whether every sqlx file compiles.
# Compile-check all schemas
$ dataform compile
Example output
Compiled xxx action(s).
xxx dataset(s):
  hoge_table [table]
  fuga_table.bq_daily_scan_cost [incremental]
  ...
xxx assertion(s):
  hoge
  ...
xxx operation(s):
  fuga
  ...
Dry runs can also be executed in bulk.
# Dry-run all schemas
dataform run --dry-run
# Dry-run with a tag specified
dataform run --tags [tag_name] --dry-run
# Dry-run a specific schema together with its dependencies
dataform run --actions "hoge_table" --include-deps --dry-run
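If these dry runs are triggered from a script (for example during review), the three command variants can be assembled from a few options. This is only a sketch: the function names `build_dry_run_command` and `run_dry_run` are hypothetical, and it uses only the flags shown in the commands here (`--tags`, `--actions`, `--include-deps`, `--dry-run`).

```python
import subprocess
from typing import List, Optional


def build_dry_run_command(
    tag: Optional[str] = None,
    action: Optional[str] = None,
    include_deps: bool = False,
) -> List[str]:
    """Build the argument list for a `dataform run ... --dry-run` invocation."""
    command = ["dataform", "run"]
    if tag:
        command += ["--tags", tag]
    if action:
        command += ["--actions", action]
    if include_deps:
        command.append("--include-deps")
    command.append("--dry-run")
    return command


def run_dry_run(command: List[str], project_dir: str = ".") -> int:
    """Run the command in the Dataform project directory and return its exit code.

    Treating a non-zero exit code as a failure is an assumption of this sketch.
    """
    return subprocess.run(command, cwd=project_dir, check=False).returncode
```

For example, `build_dry_run_command(action="hoge_table", include_deps=True)` produces the same arguments as the last command in the list. Passing the arguments as a list avoids shell quoting issues with action names.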
Displaying the dependency graph (Data Lineage)
The dependency graph (Data Lineage) can be viewed as shown below.
There is also a button that jumps to the table URL so that the target table can be checked quickly in the BigQuery console.

Previewing query results
Pressing the Preview Data button shows the data that would be returned when the sqlx is executed.
Filters and similar features are also available, which makes checking data much smoother.

Running jobs via the CLI or the Dataform API
Jobs can be run from VS Code without going to the Google Cloud console.
There are two methods, "Run (CLI)" and "Run (API)", and they behave differently, so some care is needed.

Run (CLI) executes the files you are currently editing locally as they are, whereas Run (API) compiles the current branch on Dataform and executes that. As a result, if you press Run (API) before pushing your local edits, the job runs without your latest changes.
Run (CLI)
- Executes the files currently being edited locally as they are.
- The executed jobs can be checked in the BigQuery job explorer.
- However, jobs are sometimes split into several executions, so finding them can be a bit of a chore.
- The behavior is the same as running a command such as /usr/local/share/npm-global/bin/dataform run "/workspaces/gcp-dataform" --timeout=5m --actions "connehito-dwh.dataset.table".
Run (API)
- Compiles the current branch on Dataform and executes it.
- Execution logs can be viewed from Workflow Execution Logs in Dataform on Google Cloud.
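Because Run (API) only sees what has been pushed, a small local check before using it can catch the stale-code case. The sketch below is an illustration rather than anything the extension provides: the function names are hypothetical, and it assumes the current branch has an upstream configured (otherwise the `git rev-list` call fails).

```python
import subprocess
from typing import List


def _git(args: List[str], repo_dir: str) -> str:
    """Run a git command in repo_dir and return its standard output."""
    result = subprocess.run(
        ["git", *args], cwd=repo_dir, capture_output=True, text=True, check=True
    )
    return result.stdout


def is_safe_for_api_run(porcelain_status: str, commits_ahead: int) -> bool:
    """True when nothing exists locally that the remote branch lacks.

    porcelain_status: output of `git status --porcelain` (empty when clean).
    commits_ahead: number of local commits not yet on the upstream branch.
    """
    return porcelain_status.strip() == "" and commits_ahead == 0


def check_repo(repo_dir: str = ".") -> bool:
    """Collect the git state of repo_dir and apply the check above."""
    status = _git(["status", "--porcelain"], repo_dir)
    ahead = int(_git(["rev-list", "--count", "@{u}..HEAD"], repo_dir).strip())
    return is_safe_for_api_run(status, ahead)
```

The check is deliberately conservative: untracked files also appear in the porcelain output, so they count as "not pushed" here. When `check_repo()` returns False, Run (CLI) is the mode that reflects the local edits.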
Formatting .sqlx files with SQLFluff
This was implemented before I noticed, but you can now format with SQLFluff! (I was thrilled.)
Until now, I suspect many people were using plugins like the following.
In VS Code, formatting can be run from the following three places.

To enable formatting with SQLFluff, a .vscode-dataform-tools/.sqlfluff file is required. Note that unless you add the following placeholder settings to it, formatting fails with an error.
(As for the cause of the error: when the Dataform tools extension creates a temporary file, it appears to replace expressions such as ${ref()} with numbers (1, 2, 3, and so on), which leaves the result invalid as SQL.)
...
[sqlfluff:templater:placeholder]
# Treat Dataform's ${...} syntax as a placeholder (supports nested {} up to 2 levels)
param_regex = \$\{(?:[^{}]|\{(?:[^{}]|\{[^{}]*\})*\})*\}
# Define the numeric placeholders used by the Dataform tools extension
# The extension replaces ${ref()} and similar with numbers, so make them recognized as table names
1 = placeholder_table_1
2 = placeholder_table_2
3 = placeholder_table_3
4 = placeholder_table_4
5 = placeholder_table_5
6 = placeholder_table_6
7 = placeholder_table_7
8 = placeholder_table_8
9 = placeholder_table_9
10 = placeholder_table_10
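To see what this `param_regex` matches, the pattern can be tried on its own with Python's `re` module. The sketch below reuses the regex exactly as written in the config. The `substitute` helper is only an illustration and rests on an assumption: that the n-th match is replaced by the value configured under key n, which would explain the numbered entries. It does not claim to be how SQLFluff or the extension is implemented.

```python
import re
from typing import Dict, List

# Same pattern as param_regex in the .sqlfluff snippet.
PARAM_REGEX = re.compile(r"\$\{(?:[^{}]|\{(?:[^{}]|\{[^{}]*\})*\})*\}")

# Mirrors the numbered entries 1..10 from the config.
VALUES: Dict[str, str] = {str(n): f"placeholder_table_{n}" for n in range(1, 11)}


def find_placeholders(sqlx: str) -> List[str]:
    """Return every ${...} expression the pattern matches, in order."""
    return [match.group(0) for match in PARAM_REGEX.finditer(sqlx)]


def substitute(sqlx: str, values: Dict[str, str]) -> str:
    """Replace the n-th match with values[str(n)] (assumed positional numbering).

    Falling back to the bare number for unknown keys is a choice of this sketch.
    """
    counter = 0

    def replace(match: "re.Match[str]") -> str:
        nonlocal counter
        counter += 1
        key = str(counter)
        return values.get(key, key)

    return PARAM_REGEX.sub(replace, sqlx)


sample = 'SELECT * FROM ${ref("orders")} AS o JOIN ${ref("users")} AS u ON o.user_id = u.id'
print(find_placeholders(sample))
print(substitute(sample, VALUES))
```

With the sample query, the two `${ref(...)}` expressions become `placeholder_table_1` and `placeholder_table_2`, leaving SQL that a linter can parse. The pattern accepts braces nested two levels deep inside `${...}` (for example `${a{b{c}}}`) but not three, and under the positional assumption a file with more than ten `${...}` expressions would need additional numbered entries.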
Closing
This article described how we moved our Dataform development environment from the management console on Google Cloud to a local VS Code with Dev Containers environment, and how that made development comfortable.
There are still few published Dataform case studies, so I hope to keep sharing regularly!
We Are Hiring
Connehito is hiring machine learning engineers who use data to grow our products and company!
If you are interested, please get in touch via the following!
Work related to machine learning and data at Connehito is introduced in the following articles, so please take a look at those as well!