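BigQuery has gained a machine learning feature called BigQuery ML. For now it appears to support classification with logistic regression and regression with linear regression.
I was curious what preprocessing is applied to the variables during training, so I looked into it.
If you spot any mistakes, I would appreciate it if you pointed them out in the comments.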
Preprocessing
The CREATE MODEL Statement | BigQuery | Google Cloud
According to the document above, numeric columns (INT64, FLOAT64, NUMERIC) are standardized as preprocessing (transformed to have mean 0 and variance 1) before they are used.
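To make the standardization concrete, here is a minimal Python sketch of it. It assumes the population standard deviation is used; I have not confirmed which variant BigQuery ML actually uses, and the function name `standardize` is just mine for illustration.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [(x - mu) / sigma for x in values]


# A column holding the values 1, 0, 1, 0 has mean 0.5 and population std 0.5.
a_column = [1, 0, 1, 0]
z_scores = standardize(a_column)
print(z_scores)  # [1.0, -1.0, 1.0, -1.0]
```

After the transformation the column has mean 0 and variance 1, which is the property the documentation describes.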
Categorical columns (BOOL, STRING, BYTES, DATE, DATETIME, TIME) appear to be one hot encoded: there is one dimension per distinct value, and only the dimension matching the row's value is 1 while the others are 0.
TIMESTAMP columns cannot be used for training as they are; they need to be converted to a string or a number.
For now, ARRAY and STRUCT columns apparently cannot be used for training.
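The one hot encoding mentioned above for categorical columns can be sketched as follows. This only illustrates the general technique, not BigQuery ML's internal implementation; the function name `one_hot` and the sorted ordering of the dimensions are my own choices.

```python
def one_hot(values):
    """Encode each value as a 0/1 vector with one dimension per distinct value."""
    categories = sorted(set(values))  # ordering is an arbitrary choice here
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


categories, encoded = one_hot(["x", "y", "x"])
print(categories)  # ['x', 'y']
print(encoded)     # [[1, 0], [0, 1], [1, 0]]
```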
Are the learned weights relative to the standardized values?
The ML.WEIGHTS function shows the weights of a trained model.
I wanted to know whether those values apply to the variables after standardization, so I checked.
I trained a very simple linear regression.
The model learns the relationship \(y = a + 2b\).
#standardSQL
CREATE OR REPLACE MODEL `for_blog_article.coef_model1`
OPTIONS(model_type='linear_reg') AS
SELECT * FROM (
  SELECT 1 AS a, 0 AS b, 1 AS label UNION ALL
  SELECT 0 AS a, 1 AS b, 2 AS label UNION ALL
  SELECT 1 AS a, 1 AS b, 3 AS label UNION ALL
  SELECT 0 AS a, 0 AS b, 0 AS label
)
For the model trained by this query, I checked the weights with the following query.
#standardSQL
SELECT * FROM UNNEST(
  ARRAY(
    SELECT AS STRUCT *
    FROM ML.WEIGHTS(MODEL `for_blog_article.coef_model1`)
  )
)
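Judging from the result, for numeric columns the returned weights are the weights for the standardized values.
So if you export weights learned by BigQuery ML and compute predictions yourself, you will probably need to apply the same preprocessing that was applied during training.
The ML.FEATURE_INFO function shows statistics from training such as the mean and standard deviation.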
#standardSQL
SELECT *
FROM ML.FEATURE_INFO(MODEL `for_blog_article.coef_model1`)
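As a rough sketch of what computing a prediction outside BigQuery could look like, the Python below applies weights for standardized features, and also converts them into weights for the raw values. The numbers are illustrative, not the actual query output: they are what an exact fit of \(y = a + 2b\) would give if each column had mean 0.5 and standard deviation 0.5. The function names are hypothetical.

```python
def predict_standardized(row, weights, intercept, stats):
    """Standardize each feature with its training mean/std, then apply the weights."""
    total = intercept
    for name, weight in weights.items():
        mu, sigma = stats[name]
        total += weight * (row[name] - mu) / sigma
    return total


def to_raw_scale(weights, intercept, stats):
    """Convert weights on standardized features into weights on the raw features."""
    raw_weights = {n: w / stats[n][1] for n, w in weights.items()}
    raw_intercept = intercept - sum(
        w * stats[n][0] / stats[n][1] for n, w in weights.items()
    )
    return raw_weights, raw_intercept


# Illustrative values only (assumed, not taken from ML.WEIGHTS or ML.FEATURE_INFO).
stats = {"a": (0.5, 0.5), "b": (0.5, 0.5)}  # feature -> (mean, stddev)
weights = {"a": 0.5, "b": 1.0}
intercept = 1.5

prediction = predict_standardized({"a": 1, "b": 1}, weights, intercept, stats)
raw_weights, raw_intercept = to_raw_scale(weights, intercept, stats)
print(prediction)                  # 3.0
print(raw_weights, raw_intercept)  # {'a': 1.0, 'b': 2.0} 0.0
```

The raw-scale weights recover \(y = a + 2b\), which is why the standardized weights alone look different from the coefficients you might expect.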
How is null handled?
As an experiment, I had the model trained above predict with every variable set to null.
SELECT *
FROM ML.PREDICT(
  MODEL `for_blog_article.coef_model1`,
  (SELECT null AS a, null AS b)
)
It looks like the training mean, 0.5, is substituted for the missing value (null) when predicting.
Training likewise seems to substitute the mean for missing values.
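The behaviour I am inferring here, replacing null with the training mean before predicting, would look roughly like this in Python. This is a sketch of my guess, not confirmed behaviour; `fill_nulls` is a hypothetical helper, and the final line uses the exact relationship \(y = a + 2b\) rather than the trained model.

```python
def fill_nulls(row, training_means):
    """Replace missing values (None) with the mean observed during training."""
    return {
        name: training_means[name] if value is None else value
        for name, value in row.items()
    }


training_means = {"a": 0.5, "b": 0.5}
filled = fill_nulls({"a": None, "b": None}, training_means)
print(filled)  # {'a': 0.5, 'b': 0.5}

# Under the exact relationship y = a + 2b, the all-null row would then give:
print(filled["a"] + 2 * filled["b"])  # 1.5
```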
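A quick experiment showed two things. First, adding a training row in which some column is null changes the training result, so such rows are not ignored. Second, when a row has a null column and its label equals the value you would get by substituting the mean for the null, as in the query below, the training result is no different from training on a row that actually holds the mean. So the mean seems to be used in place of null during training as well.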
CREATE OR REPLACE MODEL `for_blog_article.coef_model2`
OPTIONS(model_type='linear_reg') AS
SELECT * FROM (
  SELECT 1 AS a, 0 AS b, 1 AS label UNION ALL
  SELECT 0 AS a, 1 AS b, 2 AS label UNION ALL
  SELECT 1 AS a, 1 AS b, 3 AS label UNION ALL
  SELECT 0 AS a, 0 AS b, 0 AS label UNION ALL
  SELECT null AS a, 0 AS b, 0.5 AS label
)
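The reasoning behind the label 0.5 in the last row can be checked with a little arithmetic. The sketch below assumes the mean of `a` is taken over the non-null rows only, which is my assumption. If null becomes 0.5, the extra row still lies exactly on \(y = a + 2b\), so a least-squares fit has no reason to move.

```python
# Training rows from the query above as (a, b, label); None stands for null.
rows = [(1, 0, 1), (0, 1, 2), (1, 1, 3), (0, 0, 0), (None, 0, 0.5)]

# Assumption: the mean is computed over the non-null values of a.
observed_a = [a for a, _, _ in rows if a is not None]
mean_a = sum(observed_a) / len(observed_a)

# Residual of each row against y = a + 2b after substituting the mean for null.
residuals = [
    label - ((mean_a if a is None else a) + 2 * b)
    for a, b, label in rows
]
print(mean_a)     # 0.5
print(residuals)  # [0, 0, 0, 0, 0.0]
```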
How is null handled with one hot encoding?
Passing a row with a null column at prediction time resulted in an error.
The error message, shown below, complains about a type mismatch, so I may be hitting some unrelated error.
Invalid table-valued function ML.PREDICT
Model (table name) cannot accept column a due to type mismatch. Data type INT64 does not match what model expects.
Incidentally, for a string column that is one hot encoded, passing a value at prediction time that did not appear in training works fine: the prediction is computed without using that column's value.
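One way to picture "computed without using that column's value" is that an unseen category maps to an all-zero one hot vector, so it adds nothing to the prediction. The Python below sketches that interpretation, which I have not confirmed; the weights are made up for illustration and `predict_categorical` is a hypothetical helper.

```python
def predict_categorical(value, intercept, category_weights):
    """Add the weight of the matching category; unseen values contribute 0."""
    return intercept + category_weights.get(value, 0.0)


# Made-up weights for a column that took the values 'x' and 'y' in training.
intercept = 1.5
category_weights = {"x": -0.5, "y": 0.5}

print(predict_categorical("x", intercept, category_weights))  # 1.0
print(predict_categorical("z", intercept, category_weights))  # 1.5 (unseen value)
```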
On the training side, a weight named _null_filter is added to the learned weights, so null seems to be learned as a value in its own right.
Bonus: multicollinearity?
Train on a column that takes two distinct string values, as follows.
CREATE OR REPLACE MODEL `for_blog_article.coef_model3`
OPTIONS(model_type='linear_reg') AS
SELECT * FROM (
  SELECT 'x' AS a, 1 AS label UNION ALL
  SELECT 'y' AS a, 2 AS label
)
It is sometimes said that when you one hot encode a categorical variable you need to drop one of the dimensions, otherwise multicollinearity arises and can distort comparisons of weight magnitudes. Judging only from the weights this model produced, BigQuery ML does not seem to do that kind of preprocessing.
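To illustrate why keeping every one hot dimension makes the weights hard to compare, the Python sketch below shows several different parameter sets that all reproduce the two training rows exactly. The parameter values are ones I picked by hand, not output from BigQuery ML.

```python
def predict(params, features):
    """Dot product of parameters and features."""
    return sum(p * f for p, f in zip(params, features))


# Features are (intercept term, is_x, is_y) for the two training values.
rows = {"x": (1, 1, 0), "y": (1, 0, 1)}

# Hand-picked parameters as (intercept, weight_x, weight_y).
candidates = [
    (0.0, 1.0, 2.0),   # all the signal in the category weights
    (1.5, -0.5, 0.5),  # weights centred around the intercept
    (1.0, 0.0, 1.0),   # equivalent to dropping the 'x' dimension
]

predictions = [[predict(p, rows[v]) for v in ("x", "y")] for p in candidates]
print(predictions)  # [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
```

Because all three fit the data equally well, which weights come out depends on details such as regularization and the optimizer, so their individual magnitudes should be read with care.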