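This article is the day 5 entry of Timee Advent Calendar 2024, Series 1. Introduction: Hello, I'm chanyou from the DRE team at Timee. I joined the DRE team in March 2024 and have been building and operating the company's internal data platform. We have started using DuckDB to guarantee the quality of the data handled in that platform, so this article introduces what we did. Data quality and completeness. Data quality that Timee's data platform emphasizes: at Timee, we use DMBOK as a reference and design and run day-to-day operations with emphasis on the following data quality characteristics. Characteristic and meaning: completeness (is any data missing?); timeliness (can the data be referenced immediately when needed?); uniqueness (is any data duplicated?); consistency (are the format and meaning of values unified, including types, time zones, and spelling variations?). This article focuses on completeness. When completeness can be compromised: As noted above, ...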
Introduction: pandas is a commonly used library for data analysis in Python. It is very convenient, but because data is usually held wholesale in a pd.DataFrame, a glance at the source code often does not reveal which columns the frame has, what type each column is, or which values each column may contain. As a result, processing turns into a black box, which can raise debugging costs and reduce code readability. As one solution to this problem, this article introduces pandera, a library that provides validation for data frames. What is pandera: to improve the readability and robustness of data processing pipelines, it provides data validation for dataframes ...
About this article: it shows how to use the Python validator Pydantic as a parser for Python command-line arguments. What are the benefits? No OSS library other than Pydantic is required. Arguments passed to a Python file can be validated and type-converted. You only write a Pydantic definition, so it is easier than using ArgumentParser directly. IDE completion starts working. Method: add the following function to a class that inherits from BaseModel. @classmethod def parse_args(cls): parser = ArgumentParser() for k in cls.schema()["properties"].keys(): parser.add_argument(f"-{k[0:1]}", f"--{k}") return cls.parse_obj(parser
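The snippet in the excerpt is cut off, so the following is only a sketch of how such a class method might be completed. The model `TrainArgs`, its fields, the `argv` parameter, and the step that drops unset options are assumptions added for illustration. It keeps the `schema()` and `parse_obj()` calls from the excerpt; Pydantic 2 deprecates them in favour of `model_json_schema()` and `model_validate()`.

```python
from argparse import ArgumentParser
from typing import List, Optional

from pydantic import BaseModel


class TrainArgs(BaseModel):
    """Hypothetical argument model; field names are illustrative."""

    epochs: int = 10
    rate: float = 0.01
    name: str = "run"

    @classmethod
    def parse_args(cls, argv: Optional[List[str]] = None) -> "TrainArgs":
        parser = ArgumentParser()
        # One short and one long option per field, as in the excerpt.
        # Fields must start with distinct letters or argparse reports a conflict.
        for k in cls.schema()["properties"].keys():
            parser.add_argument(f"-{k[0:1]}", f"--{k}")
        namespace = parser.parse_args(argv)
        # Assumption: drop options that were not given so model defaults apply.
        given = {k: v for k, v in vars(namespace).items() if v is not None}
        # Pydantic validates the raw strings and converts them to the field types.
        return cls.parse_obj(given)


args = TrainArgs.parse_args(["-e", "3", "--rate", "0.5"])
print(args.epochs, args.rate, args.name)
```

Passing an explicit list keeps the example independent of the real command line; calling `TrainArgs.parse_args()` with no argument reads `sys.argv` instead. A value that cannot be converted, such as `--epochs abc`, raises Pydantic's `ValidationError`.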
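Challenges of schema-based validation that are specific to machine learning: of the problems with schema-based validation, this section covers three that are specific to machine learning. First, the features of a machine learning model can become enormous in number, so writing them all out by hand is not realistic. Second, failing to satisfy the schema is not necessarily a bad thing. Third, structured data is not the only input to a machine learning model. The features of a machine learning model can become enormous in number, so writing them all out by hand is not realistic: unlike an ordinary web service, in machine learning a data scientist may write complex SQL that produces a huge number of features. For example, when predicting purchases from the past month of behavior history, it is common to aggregate the number of past actions (such as logins) per day and use the counts as features. Concretely, the user's login count and purchase count are aggregated per day and fed to the machine learning ...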
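One way to avoid writing every daily feature by hand is to generate the schema from the same parameters that define the features. The sketch below uses only the standard library and is not taken from the linked article; the column naming (`login_count_YYYYMMDD`), the `user_id` key, and both helper functions are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, List


def daily_feature_schema(actions: List[str], end: date, days: int = 30) -> Dict[str, type]:
    """Build column -> expected type for per-day action counts (hypothetical naming)."""
    schema: Dict[str, type] = {"user_id": str}
    for action in actions:
        for offset in range(days):
            day = end - timedelta(days=offset)
            schema[f"{action}_count_{day:%Y%m%d}"] = int
    return schema


def violations(row: dict, schema: Dict[str, type]) -> List[str]:
    """Return a message for each missing or wrongly typed column in one row."""
    problems = []
    for column, expected in schema.items():
        if column not in row:
            problems.append(f"missing column: {column}")
        elif not isinstance(row[column], expected):
            problems.append(f"wrong type for {column}: {type(row[column]).__name__}")
    return problems


schema = daily_feature_schema(["login", "purchase"], end=date(2024, 1, 31))
row = {name: ("u1" if kind is str else 0) for name, kind in schema.items()}
print(len(schema), violations(row, schema))
```

Two actions over 30 days already yield 60 feature columns plus the key, which illustrates why a hand-written schema becomes hard to maintain as features are added.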
CloudDQ is a cloud-native, declarative, and scalable Data Quality validation Command-Line Interface (CLI) application for Google BigQuery. CloudDQ allows users to define and schedule custom Data Quality checks across their BigQuery tables. Data Quality validation results will be available in another BigQuery table of their choice. Users can then build dashboards or consume data quality outputs pro
Built for today's data complexity. Modern data systems are powerful, but fragile. Data breaks, pipelines drift, and nobody wants to be the last to know. GX gives your team tools to: validate critical data across your pipelines; share a common language for data quality; build trust across technical and business teams. What you need, when you need it. Set up in minutes. Scale with confidence. GX is a plat
Hello, I'm Endo, a data engineer in the EC Platform Department. I currently belong to the recommendation platform team, where I work on operating the data aggregation platform and on data engineering around DMP and advertising. Previously, our team introduced Looker for query management and built a data aggregation platform with effective data governance. For details of that data aggregation platform, see the earlier article below. techblog.zozo.com This article describes how we improved the data aggregation platform by adding a "data validation" feature so that data aggregation is always accurate. Contents: What is data validation; The data aggregation platform after introducing validation; Building the job net; Efficient DAG creation with templates; How to set dependencies between DAGs; Task structure of the validation DAG; Summary. What is data validation: Data validation is ...