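While using BigQuery's Standard SQL, I found that I could pull an element out of an array by specifying its position, but when extracting each element with UNNEST() I didn't know how to also get the element's position (which number it is in the array), so I worked out a way to do it. (There may well be a simpler way to get it.)
As an aside, having played with BigQuery lately, I'm starting to think you can write surprisingly many things in SQL. (It's a shame BigQuery can't do recursive WITH, though.)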
Addendum
If all you need is the index, it turned out to be easy to write:
sucrose.hatenablog.com
Using CROSS JOIN
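Since the length of the array isn't known in advance, build an array running from 1 up to a sufficiently large number, such as GENERATE_ARRAY(1, 100), CROSS JOIN it with the data, and pull elements out of the array with SAFE_ORDINAL(), which returns the i-th element (use SAFE_OFFSET() if you want 0-indexed positions). Both return NULL when the position is out of range, so filtering out the NULL rows leaves only the positions that actually exist.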
#StandardSQL
WITH data AS (
  SELECT SPLIT('a,b,c', ',') AS split
)
SELECT
  i
  , split[SAFE_ORDINAL(i)] AS token
FROM data, UNNEST(GENERATE_ARRAY(1, 100)) AS i
WHERE split[SAFE_ORDINAL(i)] IS NOT NULL
ORDER BY i
In the same way, if you look up values from two arrays by the same index, you can use this like zip.
#StandardSQL
WITH data1 AS (
  SELECT SPLIT('a,b,c', ',') AS split
)
, data2 AS (
  SELECT SPLIT('あ,い,う,え,お', ',') AS split
)
SELECT
  i
  , data1.split[SAFE_ORDINAL(i)] AS token
  , data2.split[SAFE_ORDINAL(i)] AS token2
FROM data1, data2, UNNEST(GENERATE_ARRAY(1, 100)) AS i
WHERE data1.split[SAFE_ORDINAL(i)] IS NOT NULL
  AND data2.split[SAFE_ORDINAL(i)] IS NOT NULL
ORDER BY i
Using a UDF (user-defined function)
BigQuery SQL can run UDFs written in JavaScript.
As in the code example below, just defining a JavaScript function is enough to make the logic easy to reuse.
#StandardSQL
CREATE TEMPORARY FUNCTION array_with_index(x ARRAY<STRING>)
RETURNS ARRAY<STRUCT<index INT64, value STRING>>
LANGUAGE js AS """
  return x.map(
    (value, index) => ({value, 'index': index + 1})
  );
""";

WITH data AS (
  SELECT SPLIT('a,b,c', ',') AS split
)
SELECT array_with_index(split) AS result
FROM data
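The JavaScript inside the UDF is ordinary JavaScript, so the body can be tried out locally before pasting it into a query. The following is only a rough sketch for Node.js: the wrapper name arrayWithIndex is made up for this local check, and it only shows the shape of the value the function body returns, not how BigQuery converts it into an ARRAY<STRUCT<...>>.

```javascript
// Local sketch of the UDF body above, wrapped in a plain function.
// arrayWithIndex is a hypothetical name used only for this check.
function arrayWithIndex(x) {
  // map() passes a 0-based index; add 1 so it matches SAFE_ORDINAL's 1-based positions.
  return x.map((value, index) => ({ value, index: index + 1 }));
}

const result = arrayWithIndex('a,b,c'.split(','));
console.log(JSON.stringify(result));
// [{"value":"a","index":1},{"value":"b","index":2},{"value":"c","index":3}]
```

The keys of each returned object (value, index) are what need to line up with the field names declared in the RETURNS clause.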
To zip arrays, you can define and use a UDF in the same way (although, since the type names and column names have to be spelled out, it doesn't end up being very general-purpose).
#StandardSQL
CREATE TEMPORARY FUNCTION array_with_index(x ARRAY<STRING>, y ARRAY<STRING>)
RETURNS ARRAY<STRUCT<index INT64, value1 STRING, value2 STRING>>
LANGUAGE js AS """
  return x.map(
    (value, index) => ({'value1': value, 'value2': y[index], 'index': index + 1})
  );
""";

WITH data1 AS (
  SELECT SPLIT('a,b,c', ',') AS split
)
, data2 AS (
  SELECT SPLIT('あ,い,う,え,お', ',') AS split
)
SELECT array_with_index(data1.split, data2.split) AS result
FROM data1, data2
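One thing worth noticing is that this UDF does not behave quite like the CROSS JOIN version when the two arrays differ in length: the CROSS JOIN query keeps only positions where both arrays have a value, while x.map() always emits one entry per element of the first array. Here is a small local sketch of the same body for Node.js; the name zipWithIndex and the sample arrays are made up for illustration.

```javascript
// Local sketch of the zip UDF body above; zipWithIndex is a hypothetical name.
function zipWithIndex(x, y) {
  // One output entry per element of x; y[index] is undefined when y is shorter.
  return x.map((value, index) => ({ value1: value, value2: y[index], index: index + 1 }));
}

// Second array longer than the first: the extra elements of y are ignored.
const zipped = zipWithIndex(['a', 'b', 'c'], ['A', 'B', 'C', 'D', 'E']);
console.log(zipped.length); // 3

// Second array shorter than the first: value2 is missing for the later entries.
const shortY = zipWithIndex(['a', 'b', 'c'], ['A']);
console.log(shortY[1].value2); // undefined
```

How BigQuery represents that undefined field in the result is something I haven't checked here, so if the arrays can differ in length it is probably safer to handle that case explicitly inside the function.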