This article is the day 4 entry of the Akatsuki Advent Calendar 2019 on Adventar.
Hello! This is suwahime.
These days, whatever industry you look at, data such as active user counts and sign-up counts is tracked daily as a matter of course. On my team, we visualize our KPIs mainly with Redash.
Today I would like to talk about using the scripting and stored procedure features, which were released in beta for BigQuery last month, to write Redash queries in a simpler and easier-to-manage form.
Definitions tend to get scattered across Redash queries
As an example, suppose we had created the following query in Redash to find out which devices are used by "users who visited the service within a given period".
    SELECT
      B.device_name,
      COUNT(DISTINCT A.user_id)
    FROM (
      SELECT user_id
      FROM dataset.login_user_ids
      WHERE timestamp >= TIMESTAMP("{{start_time}}")
        AND timestamp <= TIMESTAMP("{{end_time}}")
      GROUP BY user_id
    ) AS A
    LEFT JOIN (
      SELECT user_id, device_name
      FROM dataset.user_devices
    ) AS B
      ON A.user_id = B.user_id
    GROUP BY B.device_name
I will skip the details of the tables, but the idea is that a user enters start_time and end_time through the Redash GUI and then sees the numbers.
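To make the placeholder mechanism concrete, here is a minimal Python sketch of how a `{{name}}` template could be filled in before the SQL is sent to BigQuery. It is an illustration only: `render_query` is a hypothetical helper, not Redash's actual implementation, and it performs no escaping or validation of the values.

```python
import re

# Matches {{name}}, allowing optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_query(template, params):
    """Replace every {{name}} placeholder with the matching value from params.

    Hypothetical helper for illustration; raises KeyError if a value is missing.
    """
    def substitute(match):
        return str(params[match.group(1)])

    return PLACEHOLDER.sub(substitute, template)


template = (
    'WHERE timestamp >= TIMESTAMP("{{start_time}}") '
    'AND timestamp <= TIMESTAMP("{{end_time}}")'
)
rendered = render_query(
    template,
    {"start_time": "2019-12-01 00:00:00", "end_time": "2019-12-04 00:00:00"},
)
print(rendered)
```

Every query that embeds its own copy of these placeholders has to be edited separately when the definition changes, which is the problem the rest of this article deals with.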
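Redash is an almost magical tool: you write straightforward SQL like this and can easily take it all the way to a chart. But because it is so easy, people end up creating all sorts of queries without much thought, and changing a definition later can become a lot of work.
For example, what happens when requests like the following come in?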
- We are now able to detect fraudulent user IDs for each period, so we want to exclude them from the KPIs.
- Instead of users who logged in, please count users who actually played the service.
There are various ways to handle this, such as creating an intermediate table, but here let's first simply modify the query as follows.
    SELECT
      B.device_name,
      COUNT(DISTINCT A.user_id)
    FROM (
      SELECT user_id
      FROM dataset.play_user_ids -- replaced the referenced table
      WHERE timestamp >= TIMESTAMP("{{start_time}}")
        AND timestamp <= TIMESTAMP("{{end_time}}")
        AND user_id NOT IN ( -- exclude fraudulent user IDs
          SELECT user_id
          FROM dataset.wrong_user_ids
          WHERE timestamp >= TIMESTAMP("{{start_time}}")
            AND timestamp <= TIMESTAMP("{{end_time}}")
        )
      GROUP BY user_id
    ) AS A
    LEFT JOIN (
      SELECT user_id, device_name
      FROM dataset.user_devices
    ) AS B
      ON A.user_id = B.user_id
    GROUP BY B.device_name
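There are now more {{}} placeholders (Redash's embedding syntax), which makes the query a little harder to read, but we got it done fairly easily. This agility is one of the nice things about Redash.
Now, what if the following request comes in on top of that?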
- Please make the same replacement in every other query that is based on "users who visited the service within a given period".
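…Now this is a bit of a problem. In a project with only a few queries, the replacement may not cost much. But in a project where anyone can create queries as they please, the only option is to search for where this metric is used and go through the results one by one.
If the definition of "users who visited the service within a given period" were managed in one place, wouldn't all this work go away?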
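The manual search described above can be sketched in Python. This is only an illustration under assumptions: `saved_queries` is a hypothetical in-memory list of dicts with `name` and `query` keys standing in for whatever export of saved queries is available, and the table names other than `dataset.login_user_ids` are made up for the example.

```python
import re


def queries_referencing(queries, table_name):
    """Return the names of saved queries whose SQL mentions table_name.

    The lookarounds reject longer identifiers such as
    'dataset.login_user_ids_archive' or 'other_dataset.login_user_ids'.
    """
    pattern = re.compile(r"(?<![\w.])" + re.escape(table_name) + r"(?!\w)")
    return [q["name"] for q in queries if pattern.search(q["query"])]


# Hypothetical saved queries, for illustration only.
saved_queries = [
    {"name": "Devices of visiting users",
     "query": "SELECT user_id FROM dataset.login_user_ids"},
    {"name": "Daily sign-ups",
     "query": "SELECT COUNT(*) FROM dataset.signups"},
    {"name": "Archived logins",
     "query": "SELECT user_id FROM dataset.login_user_ids_archive"},
    {"name": "Visits by hour",
     "query": "SELECT * FROM `dataset.login_user_ids` AS l"},
]

to_update = queries_referencing(saved_queries, "dataset.login_user_ids")
print(to_update)
```

A text search like this only finds direct references. It cannot tell whether a query re-implements the same definition in a different way, which is another reason to keep the definition in one place.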
Using Redash together with BigQuery scripting
So, let's try a solution based on BigQuery scripting.
First, we register a procedure that extracts "users who visited the service within a given period" into a TEMP TABLE named access_user_ids, by running the following script in BigQuery.
    CREATE PROCEDURE dataset.create_access_user_ids (start_date TIMESTAMP, end_date TIMESTAMP)
    BEGIN
      -- Users who played within the period, excluding fraudulent user IDs.
      CREATE TEMP TABLE access_user_ids AS
      SELECT user_id
      FROM dataset.play_user_ids
      WHERE timestamp >= start_date
        AND timestamp <= end_date
        AND user_id NOT IN (
          SELECT user_id
          FROM dataset.wrong_user_ids
          WHERE timestamp >= start_date
            AND timestamp <= end_date
        )
      GROUP BY user_id;
    END;
By invoking this with a CALL statement, the Redash query can now be written as follows.
    CALL dataset.create_access_user_ids(TIMESTAMP("{{start_time}}"), TIMESTAMP("{{end_time}}"));

    SELECT
      B.device_name,
      COUNT(DISTINCT access_user_ids.user_id)
    FROM access_user_ids
    LEFT JOIN (
      SELECT user_id, device_name
      FROM dataset.user_devices
    ) AS B
      ON access_user_ids.user_id = B.user_id
    GROUP BY B.device_name;
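The query also looks much cleaner. On top of that, if the definition changes in the future, editing the procedure is enough: every query that uses access_user_ids switches to the same updated metric.
There are two things to watch out for. First, this is now a script rather than a single query, so the statements must be separated with the delimiter ";". Second, the name of the TEMP TABLE created inside the procedure is hidden from the Redash side, so you need to handle that with a naming convention or a similar rule.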
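One possible naming convention, which the example above happens to follow, is to name each procedure `create_<temp table name>` so the table name can be derived from the procedure name. The Python sketch below assumes that convention and also shows both statements being terminated with ";". The helpers `temp_table_for` and `build_script` are hypothetical names introduced for illustration, not part of Redash or BigQuery.

```python
def temp_table_for(procedure):
    """Derive the TEMP TABLE name from a procedure named '<dataset>.create_<table>'."""
    name = procedure.split(".")[-1]
    prefix = "create_"
    if not name.startswith(prefix) or len(name) == len(prefix):
        raise ValueError(f"{procedure!r} does not follow the create_<table> convention")
    return name[len(prefix):]


def build_script(procedure, select_sql):
    """Join the CALL statement and the main query into one script.

    Each statement is terminated with ';' because the text is a script,
    not a single query. The {{...}} placeholders are left for Redash to fill in.
    """
    call = (
        f"CALL {procedure}("
        'TIMESTAMP("{{start_time}}"), TIMESTAMP("{{end_time}}"))'
    )
    statements = [call, select_sql.strip().rstrip(";").strip()]
    return "\n".join(statement + ";" for statement in statements)


procedure = "dataset.create_access_user_ids"
table = temp_table_for(procedure)
script = build_script(procedure, f"SELECT COUNT(DISTINCT user_id) FROM {table}")
print(script)
```

With a rule like this, someone reading only the Redash query can tell which TEMP TABLE a CALL will create without opening the procedure definition.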
Pros and cons of using BigQuery scripting from Redash
The biggest advantage of using procedures is that metrics can be managed in one place instead of diverging from one Redash query to the next. It also saves the chore of copying SQL from BigQuery into Redash and then replacing only the time range with {{start_time}} and {{end_time}}. Ultimately, if you build your queries entirely out of procedures and intermediate tables and only CALL them from Redash, you may end up with a setup that is not tied to any particular dashboard tool.
The disadvantage is that, for SQL written inside a BigQuery script, you cannot see the estimated number of bytes to be processed before running it, so a query may turn out to cost more than expected. It would be nice to be able to run a dry run first, calculate the estimated bytes to be processed, and skip execution if the cost would exceed a certain amount. (At the moment, it does not seem to be possible to do a dry run of a script from the console.)
There still seems to be a lot more that BigQuery scripting makes possible, so I plan to keep following it.
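The guard described above could look roughly like the following Python sketch. It assumes that a single standalone query (for example, the SELECT inside the procedure with literal timestamps) is dry-run through the google-cloud-bigquery client instead of the script itself. Whether that estimate matches what the script actually processes is an assumption, and `run_if_affordable` is a hypothetical helper. The demonstration at the end uses stand-in functions so that it runs without BigQuery access.

```python
def estimate_bytes_with_dry_run(sql):
    """Ask BigQuery for the estimated bytes of a single query via a dry run.

    Needs the google-cloud-bigquery package and credentials; the import is
    done lazily so the rest of this sketch runs without them.
    """
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=job_config)
    return job.total_bytes_processed


def run_if_affordable(sql, max_bytes, estimate=estimate_bytes_with_dry_run, run=None):
    """Estimate first, and refuse to run when the estimate exceeds max_bytes."""
    estimated = estimate(sql)
    if estimated > max_bytes:
        raise RuntimeError(
            f"estimated {estimated} bytes exceeds the limit of {max_bytes} bytes"
        )
    return run(sql) if run is not None else None


# Offline demonstration with stand-in functions instead of real BigQuery calls.
ONE_GIB = 1024 ** 3
outcome = run_if_affordable(
    "SELECT user_id FROM dataset.play_user_ids",
    max_bytes=10 * ONE_GIB,
    estimate=lambda sql: 2 * ONE_GIB,  # pretend the dry run reported 2 GiB
    run=lambda sql: "executed",
)
print(outcome)
```

Expressing the limit in bytes rather than in currency keeps the sketch independent of pricing, which varies by plan and over time.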
References