ããã«ã¡ã¯ããã¬ã³ã調æ»ã©ãã®äºä¸å¯ä¹ï¼@inohiroï¼ã§ãã æ®æ®µã¯ãã¯ãã¯ãããã®æ¤ç´¢ãã°ãåºã«ããæ³äººåããã¼ã¿ãµã¼ãã¹ããã¹ã¿ããã®éçºãã åºåäºæ¥å¨è¾ºã®ãã¼ã¿åæãªã©ãæ å½ãã¦ãã¾ãã
Amazon Redshiftãªã©ã®ãã¼ã¿ãã¼ã¹ã«èç©ããããã°ãªã©ã®å¤§éã®ãã¼ã¿ã«å¯¾ãã¦ã æ¥æ¬¡ãé±æ¬¡ãªã©ã®åä½ã§ãããå¦çãè¡ã£ã¦ããæ¹ã¯å¤ãããã£ãããã¨æãã¾ãã ãã°ãªã©ãæ±ããããå¦çã§ã¯ãå¦ç対象ãè¨å¤§ã§ããã¨ã¢ããªã±ã¼ã·ã§ã³ã使ãã¡ã¢ãªãå¢å¤§ãã æ¯æ¸ãã¦ãã¾ãæããããããããã¼ã¿ã®æ±ãã«æ°ãã¤ããå¿ è¦ãããã¾ãã ãã¼ã¿ãã¼ã¹å ã§å®çµãããããå¦çãªãã°ããã¾ã§æ°ã«ããå¿ è¦ã¯ç¡ãããããã¾ãããã å¤é¨ã®ããã°ã©ã ãããã¼ã¿ãèªã¿åºãã¦å¦çããå ´åã¯ç¹ã«æ³¨æãå¿ è¦ã§ãã
ããã§èããããä¸ã¤ã®å·¥å¤«ã¨ãã¦ãå¦ç対象ãåå²ãã¦ãç¹°ãè¿ãã¦å¦çãè¡ãæ¹æ³ãæãããã¾ãã ä¸è¬çãªRDBMSãåããã«ã¼ã½ã«ã¨å¼ã°ããæ©è½ãå©ç¨ãããã¨ã§ãå¦ç対象ãåå²ãã¦å°ããã¤å¦çãããã¨ãã§ãã¾ãã
æ¬ç¨¿ã§ã¯ãç¹ã« Railsï¼ActiveRecordï¼ã使ã£ã¦æ¸ãããAmazon Redshiftãå©ç¨ãããããªãããå¦çã«ããã¦ã ã«ã¼ã½ã«æ©è½ãç°¡åã«å©ç¨ã§ãã "redshift_cursor" gem ãç´¹ä»ãã¾ãã
ã¾ãã«ã¼ã½ã«ã«ã¤ãã¦ãããå°ã詳ãã説æãããã¨æãã¾ãã
ããããã«ã¼ã½ã«ã£ã¦ï¼
ã«ã¼ã½ã«ã¯ãã¼ã¿ãã¼ã¹ãããã¼ã¿ãå¾ãããéã«ãä¸åº¦ã«ãã¹ã¦ã®ãã¼ã¿ãèªã¿åºãã®ã§ã¯ãªãã ããç¨åº¦ã®åä½ï¼è¡æ°ï¼ã«åãã¦èªã¿åºãããã®ä»çµã¿ã§ããã¤ãã¬ã¼ã¿ã®ããã«åä½ãããã¨ã§ã ã¢ããªã±ã¼ã·ã§ã³å´ã®ã¡ã¢ãªã®æ¯æ¸ãé²ããã¨ãã§ãã¾ãã
PostgreSQLãMySQLãªã©ã®ãä¸è¬çãªRDBMSã«ã¯ã«ã¼ã½ã«æ©è½ãåãã£ã¦ãã¦ãããã«ä½¿ããã¨ãã§ãã¾ãã ãã¡ããRedshiftã«ãããã¾ããããããã®RDBMSã®ã«ã¼ã½ã«ã«ã¤ãã¦ã¯ä»¥ä¸ãåç §ãã¦ãã ããã
- PostgreSQL: 40.7. ã«ã¼ã½ã« - PostgreSQL 9.5.3ææ¸
- MySQL: 13.6.6 ã«ã¼ã½ã« - MySQL 5.6 ãªãã¡ã¬ã³ã¹ããã¥ã¢ã«
- Redshift: DECLARE - ãã¼ã¿ãã¼ã¹éçºè ã¬ã¤ã
以ä¸ã¯ãPostgreSQLã§ãã«ã¼ã½ã«ã使ã£ã¦å¤§ããªçµæãã10è¡ãã¤èªãä¾ã§ãã
begin ; -- ã«ã¼ã½ã«ã¯ãã©ã³ã¶ã¯ã·ã§ã³ã®ä¸ã§ä½¿ã declare sample_cursor cursor for -- ã«ã¼ã½ã«ãå®£è¨ select title from recipes where title like '%ããã%' ; fetch 10 from sample_cursor ; -- æåã®10件ãå¾ã fetch 10 from sample_cursor ; -- 次ã®10件ãå¾ã -- å¿ è¦ãªã ãç¹°ãè¿ã close sample_cursor ; -- ã«ã¼ã½ã«ãéãã commit ;
Railsã«ãããã«ã¼ã½ã«çãªå¦ç
大éã®ã¯ã¨ãªçµæãå°ããã¤åãåºãã¦å¦çãè¡ãå ´åã
Railsã 㨠ActiveRecord::Batches
ã® .find_each
ã .find_in_batches
ãå©ç¨ããæ¹ãå¤ãã®ã§ã¯ãªãã§ããããã
.find_each
ããã³ .find_in_batches
ã®è©³ãã説æã¯å²æãã¾ããã
ãããã®ã¡ã½ããã使ãéã¯ã以ä¸ã®ç¹ã§æ³¨æãå¿
è¦ã§ãã
- ã½ã¼ãã«ã©ã ãæå®ã§ããªãï¼ãã©ã¤ããªãã¼ï¼å¤§æµ
id
ã«ã©ã ï¼ã§ã½ã¼ããããï¼- ç¹ã«ãã°ç³»ã®ãã¼ãã«ã ã¨
id
ã«ã©ã ãä»ãã¦ãªãã£ããããããããã©ã¤ããªã¼ãã¼ãè¨å®ããã¦ããªããã¨ããã - ã¾ããæ¥ä»ã«ã©ã ãã½ã¼ããã¼ã¨ãªã£ã¦ããå¯è½æ§ãé«ããæå³ããªããã¼ã§å¤§éã®è¡ãã½ã¼ããã¦ãã¾ããããããã
- ç¹ã«ãã°ç³»ã®ãã¼ãã«ã ã¨
- ãã©ã¤ããªãã¼ãå¿ ãæ°å¤åã§ããå¿ è¦ããã
- ãã£ã³ã¯æ¯ã«ã¯ã¨ãªãä½åº¦ãåå®è¡ããã
- ã«ã¼ã½ã«ã¯ã¯ã¨ãªãä¸åº¦ã ãå®è¡ããçµæããã£ã³ã¯ã«åãã¦è¿ã
ã½ã¼ãã«ã©ã ãæ示çã«æå®ããã¦ããªãç¶æ ã§ããã£ã³ã¯æ¯ã«ã¯ã¨ãªãåå®è¡ãããã¨ã å¾ãããçµæãæ£ãããªãå¯è½æ§ãèãããã¾ãã ã¾ããã£ã³ã¯æ°åãåããããªã¯ã¨ãªãçºè¡ãããã®ã§éå¹çã¨ãè¨ãã¾ãã 以ä¸ã®ãã¨ãããã«ã¼ã½ã«ãå©ç¨ããæ¹ãããã©ã¼ãã³ã¹ãä¿¡é ¼æ§ã®é¢ã§è¯ãã¨è¨ãã¾ãã
redshift_cursor
ãã¦ãã«ã¼ã½ã«ã«ã¤ãã¦ç°¡åã«èª¬æãã¾ãããããããããæ¬é¡ã§ãã ä»åç´¹ä»ããredshift_cursor gemã¯ã Railsï¼ActiveRecordï¼ã§Redshiftã«æ¥ç¶ãã¦å¤§éã®è¡ãå¾ããããªå ´åã«ã ã«ã¼ã½ã«ã®æ§æãè¦ããªãã¦ããã«ã¼ã½ã«ãééçã«å©ç¨ã§ããããã«ããgemã§ãã redshift_cursorã¯å®éã«ã¯ãã¯ãããã®ä¸é¨ã®ãããã¸ã§ãã§ããã°ã®éè¨ãã¦ã¼ã¶ã¼ã®æ½åºã«å©ç¨ããã¦ãã¾ãã
以ä¸ããã® gem ã®ä½¿ãæ¹ãç°¡åã«èª¬æãã¾ãã
ã¾ã Gemfile ã«è¨è¿°ã㦠bundle install
ãã¾ãã
# Gemfile gem 'redshift_cursor'
ããã¨ãåã¢ãã«ã§.each_row
, .each_instance
, .each_row_by_sql
, .each_instance_by_sql
ãªã©ã®ã¡ã½ããã使ããããã«ãªãã¾ãã
.each_row
, .each_row_by_sql
ã¯çµæãããã·ã¥ã®é
åã§ã.each_instance
, .each_instance_by_sql
ã¯çµæãã¬ã·ã¼ãã¼ã¯ã©ã¹ã®ã¤ã³ã¹ã¿ã³ã¹ã®é
åã§è¿ãã¾ãã
Recipe.where(id: 3199605).each_row.fitst => {"id"=>"3199605", "title"=> "ç°¡å çå°ãªãï¼ããã¢ããã·ã¥", ... } Recipe.where(id: 3199605).each_instance.first => #<Recipe:0x007fe5260eeaa8 id: 3199605, title: "ç°¡å çå°ãªãï¼ããã¢ããã·ã¥", ...>
ãããã®ã¡ã½ãã㯠Enumerable ãè¿ãã®ã§ãçµæã«å¯¾ã㦠.map
ã .each
ãªã©ä½¿ããã¨ãã§ãã¾ãã
以ä¸ãå©ç¨ä¾ã§ãã
# ã¿ã¤ãã«ããã»ãããèãã«ããããããããªã¬ã·ã Recipe.where('title like ?', '%ã»ãããè%').each_insntace.map {|recipe| recipe.title } # å¿ è¦ãªã«ã©ã ãã¿ã¤ãã«ã ããªã Recipe.where('title like ?', '%ããã%').select(:title).each_row.map {|recipe| recipe['title'] ... } # æ¡ä»¶ãé åºãSQLã§æ¸ã Recipe.each_instance_by_sql('select * from recipe where ... order ...').map {|recipe| recipe.created_at } # ãã¢ããã¥ã¡ã³ãã§SQLãæ¸ã SearchLog.each_row_by_sql(<<~SQL select title , count(*) as pv from search_logs where keyword like '%ãºããã¼ã%' and log_time between ... group by title SQL ).each {|log| log['pv'] ... }
ï¼.each_row_by_sql
, .each_instance_by_sql
ã§ã¯ Array Condition ã使ããªããã¨ã«æ³¨æï¼
ä¸è¨ã®ã³ã¼ãã¯ãããããã«ã¼ã½ã«ã使ã£ãã¯ã¨ãªã«æ¸ãæãããã å©ç¨è ã¯ã«ã¼ã½ã«ã®æ§æãä»çµã¿ãæ°ãã«è¦ããªãã¦ãã大éã®ãã¼ã¿ãå¹ççã«æ±ããã¨ãã§ãã¾ãã
å®è£
å®ã¯redshift_cursorã®å¤§é¨åã¯ãpostgresql_cursor gemãå©ç¨ãã¦ãã¾ãã redshift_cursorã¯ãactiveRecord4-redshift-adapterã使ã£ã¦RailsããRedshiftã«æ¥ç¶ãã¦ããæã«ãpostgresql_cursorãæ£ãã使ããããã«äºææ§ã追å ãã¦ãã¾ãã
ã¾ã¨ã
æ¬ç¨¿ã§ã¯ããããå¦ççã§å¤§éã®ãã¼ã¿ãèªã¿è¾¼ãéã«ã¢ããªã±ã¼ã·ã§ã³å´ã®è² è·ãããã©ã¼ãã³ã¹ãæ¹åãã ã«ã¼ã½ã«ã«ã¤ãã¦èª¬æãã¾ããã
ã¾ãRailsï¼ActiveRecordï¼ã使ã£ã¦æ¸ãããAmazon Redshiftãå©ç¨ãããããªãããå¦çã«ããã¦ã ã«ã¼ã½ã«æ©è½ãç°¡åã«å©ç¨ã§ãã "redshift_cursor" gem ãç´¹ä»ãã¾ããã ãpostgresql_cursorãã¨ããããåºæ¥ãPostgreSQLåãã®gemãRedshiftã§ã使ããããã«ããgemã§ãã
Gemfileã«è¿½å ãã¦ãActiveRecordã®ã¡ã½ããã¨ããä¼¼ãã¡ã½ããã§ã ã«ã¼ã½ã«ã使ã£ãã¯ã¨ãªãç°¡åã«çºè¡ãããã¨ãã§ãã¾ãã ãã²ãã°ãªã©ã®å¤§éã®ãã¼ã¿ããããå¦çããã¨ãã«ãæ´»ç¨ãã ããã
追è¨ï¼2016/07/12ï¼
ï¼ãã³ã¡ã§ããææããã ãã¦ããã¾ãããï¼10ä¸è¡ãè¶ ãããããªçµæãèªã¿åºããªãã°ã ã«ã¼ã½ã«ã§ã¯ãªã UNLOAD ã³ãã³ãã使ãã¾ãããã
社å ã§ããã®ããªã·ã¼ã§éç¨ãã¦ãã¾ãã