ã¯ããã«
ãååã®è¨äºã§ã¯ãPerlã§CSVãæ±ãããã®åºæ¬çãªèãæ¹ã説æãã¾ãããæ¬è¨äºã§ã¯å¼ãç¶ããPerlã§CSVãã¡ã¤ã«ãéè¨ããããã®ãã¦ãã¦ãç´¹ä»ãã¾ãã
対象èªè
- Perlã®ããåæ©çãªç¥èï¼é åãããã·ã¥ï¼ãæãã¦ããæ¹ã
- Perlãå©ç¨ãã¦CSVå½¢å¼ã®ãã¼ã¿ãéè¨ãããæ¹ã
å¿ è¦ãªç°å¢
- ããã¹ãã¨ãã£ã¿ã
- Perl 5.8.Xããã ããã»ã¨ãã©ã®ã³ã¼ãã¯ãã以ä¸ã®ãã¼ã¸ã§ã³ã§ãåãã¾ãã
ãç°å¢ãæ´ããæ¹æ³ã«é¢ãã¦ã¯ãååã®è¨äºãåç §ãã¦ä¸ããã
2ã¤ã®CSVãæ±ã
ãã¼ã«ãã£ã¦JOINãã
ã顧客ã«ã¢ã³ã±ã¼ããã¨ã£ãçµæãæ ¼ç´ãããã次ã®ãããªãenquate.csvããããã¨ãã¾ããå·¦ããé ã«ã
顧客ID,è¨å1ã®åç,è¨å2ã®åç,è¨å3ã®åç
ãã¨ãããããª4ã«ã©ã ã®ãã¼ã¿ã§ãã
00893,1,4,2 89204,4,2,3 75648,2,2,2 ãï¼ ãï¼ï¼ä»¥ä¸ã30,000ä»¶ï¼ ãï¼
ãããªãã¯ããã®ãã¼ã¿ã®åè¡ã®å¾ãã«é¡§å®¢ã®æ°åã¨ä½æãç´ã¥ãããã¬ã¼ã³ãçºéæ å½è ã«æ¸¡ããªããã°ãªãã¾ãããã©ãããã°ããã®ãããªå¦çãã§ããã§ããããï¼ é¡§å®¢ã®ãã¼ã¿ã¯ãaddress.csvãã¨ããååã§ã
顧客ID,æ°å,ä½æ
ãã®3ã«ã©ã ãããªã£ã¦ãã¾ãã
02547,ä½è¤å¤§è¼,åæµ·éè«å°ç§å¸XXXXYYYY 15983,ç°ä¸ä¹ å¿,æ²ç¸çé£è¦å¸XXXXYYYY 00893,æ¬éé æ´,ç¥å¥å·ç横æµå¸XXXXYYYY ãï¼ ãï¼ï¼ä»¥ä¸ã100,000ä»¶ï¼ ãï¼
絶対ã«ãã£ã¦ã¯ãããªãã³ã¼ãã£ã³ã°
ãããæ°äººã¯ã³ããã®ä»äºãå¼ãåãã次ã®ãããªã³ã¼ããæ¸ãã¦å®è¡ãã¾ããã
open(OUT, '>result.csv'); open(IN1, 'enquate.csv'); # ã¢ã³ã±ã¼ããã¼ã¿ã1è¡ãã¤å¦çãã while(my $line1 = <IN1>){ # 1è¡ã4ã¤ã«åãã chomp($line1); my ($id, $ans1, $ans2, $ans3) = split(/,/, $line1, 4); # ãã®è¡ã«ããããã顧客ãã¼ã¿ãæ¤ç´¢ãã my $name = ''; my $address = ''; open(IN2, 'address.csv'); while(my $line2 = <IN2>){ chomp($line2); my ($tmp_id, $tmp_name, $tmp_address) = split(/,/, $line2, 3); if($tmp_id eq $id){ # 対象ã¨ãªã顧客ãè¦ã¤ãã£ãï¼ $name = $tmp_name; $address = $tmp_address; last; } } close(IN2); # åºåãã print OUT join(',', $id, $ans1, $ans2, $ans3, $name, $address), "\n"; } close(OUT); close(IN1);
ãçµæãåºããã¡ã¼ã«ã§éä¿¡ãã¦ãä»æ¥ã¯ãã£ãã¨å®æã§å¸°ããã¨æã£ã¦ããæ°äººã¯ã³ãã¨ãããããã¤ã¾ã§å¾ ã£ã¦ããã®å¦çã¯çµããå ããè¦ãã¾ãããå¦çãçµããã¾ã§ã®å¾ ã¡æéãå©ç¨ãã¦ã³ã¼ãã1è¡ãã¤ä½åº¦ãè¦ç´ããã®ã§ãããè«ççã«ã¯ééãã¦ããªãã³ã¼ãã£ã³ã°ã«è¦ãã¾ãããã£ãããªãçµãããªãã®ã§ãããï¼
30ååã®ã«ã¼ãå¦ç
ãæ°äººã¯ã³ã®ã¹ã¯ãªããããªããªãçµäºããªãã£ãåå ã¯ããã¡ã¤ã«ã®èªã¿è¾¼ã¿å¦çã2éã®ã«ã¼ãã«ãªã£ã¦ãããã¨ã«ããã¾ããå¤å´ã®ãenquate.csvãã®ä»¶æ°ã3ä¸ä»¶ã§ããã®1è¡1è¡ã«ã¤ãã¦ãaddress.csvããæ¯åéãã¦ãã¼ã¿ãæ¤ç´¢ãã¦ãã¾ãããaddress.csvãã®è¡æ°ã10ä¸è¡ã§ãã®ã§ãåç´è¨ç®ã§10ä¸è¡×3ä¸åï¼30åè¡ããã¡ã¤ã«ããèªã¿è¾¼ãå¦çã§ããã¨ãããã¨ã«ãªãã¾ãï¼ãã ããæ¤ç´¢ãå®äºãããlast
ãã¦ããã®ã§ãå®éã«ã¯ãã£ã¨å°ãªãè¡æ°ã§æ¸ã¿ã¾ããï¼ã
ããããªåæ°ãã«ã¼ãå¦çãã¦ããã°ãå¦çãçµãããªãã®ã¯å½ããåã§ããçè ã®ç°å¢ã§ã¯1æéå¾ ã£ã¦ãå¦çãçµãããªãã£ãã®ã§ãã¨ãã¨ãï¼»Ctrlï¼½ï¼ï¼»Cï¼½ã§å¼·å¶çµäºãã¦ãã¾ãã¾ããã
ããã·ã¥ãæå¹ã«ä½¿ã
ãã§ã¯ããã®å¦çãç¾å®çãªæéã§çµããããã«ã¯ã©ãããã°ããã®ã§ããããï¼ ãã®ããã«ã¯ããaddress.csvããã対象è¡ãå¼ã³åºãå¦çãé«éåããå¿ è¦ãããã¾ãã
ãååã®è¨äºã§ãè¿°ã¹ãå 容ã§ãããããã§ãåãæ±ããã¡ã¤ã«ã®å¤§ãããåé¡ã¨ãªã£ã¦ãã¾ããä»åæ±ã£ã¦ãããaddress.csvãã¯5MBç¨åº¦ã®å¤§ããã§ãã®ã§ãå ¨ã¦ã¡ã¢ãªã«åãè¾¼ãã§ã大ããªåé¡ã«ãªããã¨ã¯ãªãããã§ããããã§ã3ä¸åç¹°ãè¿ãã¦ãããaddress.csvãã®èªã¿è¾¼ã¿å¦çãæåã«æã£ã¦ãã¦ãå ¨ã¦ã¡ã¢ãªã«æã¤ããã«å¤æ´ãã¦ã¿ã¾ããããã§ãã¡ã¤ã«ã®èªã¿è¾¼ã¿è¡æ°ã¯30åè¡ãã10ä¸è¡ã«æ¸ãã¯ãã§ãã
ãæ¤ç´¢ã®é«éåã«ã¯ããä¸ã¤éè¦ãªãã¨ãããã¾ããããã¯ãã¡ã¢ãªä¸ã®ãã¼ã¿ãé«éã«æ¤ç´¢ããããã®ç´¢å¼ãã¤ã¾ããã¤ã³ããã¯ã¹ãå¼µããã¨ã§ããã¤ã³ããã¯ã¹ãç¨æããã¦ããªããã°ã対象ã¨ãªã顧客IDã«å¯¾å¿ããè¡ãæ¢ãã®ã«ãæ¯åå ¨ãã¼ã¿ãèµ°æ»ããªããã°ãªããªããªãã¾ãããããã¯ãã¾ãã«éå¹ççéãã¾ãã
ãPerlã«ããã¦ããã®ã¤ã³ããã¯ã¹ã¨ãã¦é©ä»»ãªã®ã¯ããã·ã¥ã§ãããã¼ã«å¯¾ããå¤ããé«éã«æ¢ãåºããã¨ãã§ããããã§ããä»åã®ä¾ã§ã¯ã顧客IDãã対象ã¨ãªãè¡ãã¦ãã¼ã¯ã«å®ã¾ãã¾ãã®ã§ã顧客IDããã¼ã¨ããããã·ã¥ã«ãaddress.csvãã®ãã¼ã¿ãä¿åãã¾ãã
ãã³ã¼ãã¯ä»¥ä¸ã®ããã«ãªãã¾ããã
# æåã«ã顧客ã®ä½æãå ¨ã¦ããã·ã¥(ã¡ã¢ãªä¸)ã«åãè¾¼ã my %address_datas = (); open(IN, 'address.csv'); while(<IN>){ chomp; my ($id, $name, $address) = split(/,/, $_, 3); # (â»1)顧客IDããã¼ã¨ãã対å¿ããååã¨ä½æã®é åãä¿åãã $address_datas{$id} = [$name, $address]; } close(IN); # ã¢ã³ã±ã¼ããã¼ã¿ã«æ°åã¨ä½æããã¼ã¸ãã open(OUT, '>result.csv'); open(IN, 'enquate.csv'); while(my $line = <IN>){ chomp($line); my ($id, $ans1, $ans2, $ans3) = split(/,/, $line, 4); # ãã®è¡ã«ããããã顧客ãã¼ã¿ãæ¤ç´¢ãã my $ref_data = $address_datas{$id}; # é åã®0çªç®ã«ååã1çªç®ã«ä½æãå ¥ã£ã¦ãã (â»1ãåç §) my $name = $ref_data->[0]; my $address = $ref_data->[1]; # åºåãã print OUT join(',', $id, $ans1, $ans2, $ans3, $name, $address), "\n"; } close(OUT); close(IN);
ããã®ã³ã¼ããçè ã®ç°å¢ã§å®è¡ããã¨ã5ç§ãç«ããªããã¡ã«çµäºãã¾ããããããªãå®ç¨ä¸ã¾ã£ããåé¡ããªãããã§ãã
ããã®ä¾ã®ããã«ãç¹å®ã®IDé ç®ãå«ãè¤æ°ã®ãã¡ã¤ã«ããã®IDé ç®ã§JOINããå ´åã«ã¯ãIDé ç®ããã¼ã¨ããããã·ã¥ã¸JOINããããã¡ã¤ã«ã®ãã¼ã¿ãèªã¿è¾¼ãã§ããã¨ãå¹çããå¦çãããã¨ãã§ãã¾ããééãã¦ããæåã®ã³ã¼ãã®ããã«2éã«ã¼ããä½ã£ã¦æ¤ç´¢ãããã¨ã®ãªãããã«æ°ãã¤ãã¾ãããã
ã(追è¨ï¼ãã®ã³ã¼ãã«é¢ãã¾ãã¦ãå¼¾ãããããã©ãã¯ããã¯ãé ãã¦ããã¾ãããããã¦ãåç §ãã ããã)
2ã¤ã®CSVã«å ±éããIDãçå´ã«ãããªãID
ã次ã¯ã2ã¤ã®CSVã®å å«é¢ä¿ã調ã¹ãå¦çã§ãããenquate.csvãã¨åããã©ã¼ãããã®ãenquate2.csvãã¨ãããã¡ã¤ã«ãããã¾ãããã®ãã¡ã¤ã«ã¯2åç®ã«ã¨ã£ãã¢ã³ã±ã¼ãã§ããã1åç®ã¨ã¯éãã¦ã¼ã¶ãåãã¦ã¼ã¶ãæ··ãã£ã¦ãã¾ãããã®2ã¤ã®ãã¡ã¤ã«ãèªã¿è¾¼ãã§ãã1åç®ã ãå¿åãã顧客ãã2åç®ã ãå¿åãã顧客ããã©ã¡ããå¿åãã顧客ãã調ã¹ã¦ã¿ã¾ããããåºåã¯2ã«ã©ã ã§ã
顧客ID,1 or 2 or 3 ï¼1: 1åç®ã ãå¿åã2: 2åç®ã ãå¿åã3: ã©ã¡ããå¿åï¼
ãã¨ãã¾ãããenquate2.csvãã®è¡æ°ã¯ããenquate.csvãã¨åã30,000è¡ã§ãã
ããã®å¦çãå®ç¾ããããã«ãä»åã¯ä»¥ä¸ã®æé ãã¨ãã¾ãã
- ãenquate2.csvããããã·ã¥ã«åãè¾¼ã
- ãenquate.csvãã1è¡ãã¤èªã¿è¾¼ã
- ããã·ã¥ã«ãã¼ã¿ããªããã°ãã1åç®ã ãå¿åãã顧客ã
- ããã·ã¥ã«ãã¼ã¿ãããã°ããã©ã¡ããå¿åãã顧客ã
- ããã·ã¥ããã該å½ãã¼ã¿ï¼1åç®ã«å¿åãã顧客ã®ãã¼ã¿ï¼ãæ¶ã
- æå¾ã«ãæ¶ãããã«æ®ã£ããã¼ã¿ãã2åç®ã ãå¿åãã顧客ã
ã対称æ§ããªãã¦å¤å°ãããã«ããå¦çã§ã¯ããã¾ããããã®æ¹æ³ã ã¨ã¡ã¢ãªã«èªã¿è¾¼ãã®ã¯çæ¹ã®ãã¡ã¤ã«ã ãã§æ¸ã¿ã¾ãã®ã§ã大ããã®ãã¡ã¤ã«ã§ããªãã¨ããªãã®ãå©ç¹ã§ããã³ã¼ãã¯ã以ä¸ã®ããã«ãªãã¾ãã
# 2åç®ã®ã¢ã³ã±ã¼ãå¿åè ãããã·ã¥ã«åãè¾¼ã my %enq2_data = (); open(IN, 'enquate2.csv'); while(<IN>){ chomp; # IDã ãå¿ è¦ãªã®ã§ãIDã ãåãåºã my ($id) = split(/,/, $_, 2); # ãã¼ã顧客IDã¨ããç®å°(ãã©ã°)ã¨ãã¦1ãå ¥ãã $enq2_data{$id} = 1; } close(IN);
ãã¾ãæåã«ã2åç®ã®ã¢ã³ã±ã¼ãçµæãããã·ã¥ã«åãè¾¼ã¿ã¾ããä»åå¿
è¦ãªãã¼ã¿ã¯IDã ãã§è¨åã®åçé¨åã¯å¿
è¦ããã¾ããã®ã§ãsplit
ã®å¼ãæ°ã®æå¾ã«2ãæå®ãã左辺ã§ã¯$id
ã ããæå®ãããã¨ã§æåã®ã«ã©ã ãã¤ã¾ã顧客IDã ããåãåºãã¦ãã¾ãã
ãã¾ããããã·ã¥ã®ãã¼ã¯é¡§å®¢IDã¨ãã¦ã¾ãããããã·ã¥ã®å¤ã¯ç¹ã«å©ç¨ããªãã®ã§ãããã¼ã®å¤ã¨ãã¦1ãä»£å ¥ãããã¨ã«ãã¾ããã
ãããã·ã¥ã«ãã¼ã¿ãæ´ãã°ã次ã¯1åç®ã®ã¢ã³ã±ã¼ãçµæãèªã¿è¾¼ãã§æ¯è¼ããé¨åã¨ãªãã¾ããæåã®é¨åã¨åãããã«ãè¡ãsplit
ããé¨åã§ã¯é¡§å®¢IDã ããåãåºãã¾ãããã®å¾ãããã·ã¥ã«å«ã¾ãã2åç®å¿åè
ã®é¡§å®¢IDãæ¤ç´¢ãããã¼ãåå¨ããããã§ããã°ã©ã¡ãã«ãå¿åãã顧客ã¨ãã¦ã3ããåºåãã¾ããã¾ããdelete
é¢æ°ã«ãã£ã¦ããã·ã¥ãããã®é¡§å®¢IDãåé¤ãã¾ããããã«ããã2åç®ã®ã¢ã³ã±ã¼ãå¿åè
ãã1åç®ãå¿åãã顧客IDãåé¤ãããæçµçã«ã¯2åç®ã®ã¿å¿åããå¿åè
ãæ®ãã¨ãããã¨ã«ãªãã¾ãã
ã2åç®å¿åè ã®ããã·ã¥å ã«ãã¼ãåå¨ããªããã°ã1åç®ã®ã¢ã³ã±ã¼ãçµæã«ããå«ã¾ããªãã£ã顧客IDã¨ãããã¨ã«ãªãã¾ãã®ã§ãã1ããåºåãã¾ãã
# 1åç®ã®ã¢ã³ã±ã¼ãå¿åè ã¨2åç®ã®å¿åè ãæ¯è¼ãã open(OUT, '>result.csv'); open(IN, 'enquate.csv'); while(<IN>){ chomp; # IDã ãå¿ è¦ãªã®ã§ãIDã ãåãåºã my ($id) = split(/,/, $_, 2); if($enq2_data{$id}){ # ã©ã¡ãã«ãå¿åãã顧客 print OUT "$id,3\n"; # ããã·ã¥ããã1åç®å¿åãã顧客ããåé¤ delete $enq2_data{$id}; }else{ # 1åç®ã ãå¿åãã顧客 print OUT "$id,1\n"; } } close(IN);
ãããã¾ã§ã§ãã1åç®ã®ã¿å¿åãã顧客ãã¨ãã©ã¡ãã«ãå¿åãã顧客ãã®åºåãçµããã¾ãããå¾ã¯ãããã·ã¥ã«æ®ã£ã¦ããã2åç®ã®ã¿å¿åãã顧客ããåºåãããã°ãå¦çã¯å®äºã¨ãªãã¾ãã
# æå¾ã«ãããã·ã¥ã«æ®ã£ãã®ã2åç®ã®ã¿å¿åãã顧客 foreach(keys %enq2_data){ print OUT "$_,2\n"; } close(OUT);
ãä»åã®ä¾ã§ã¯ã1ã¤ã®ãã¡ã¤ã«ã«å
¨ã¦ã®çµæãåãã¾ããããã°ã«ã¼ããã¨ã«3ã¤ã®ãã¡ã¤ã«ã«åããã¨ããåºåæ¹æ³ãèãããã¾ãããã®å ´åã¯ãOUT1
ãOUT2
ãOUT3
ã®ç¨ã«ãã¡ã¤ã«ãã³ãã«ãè¤æ°ç¨æãã¦open
ããé©åãªãã³ãã«ã«åãã¦çµæãåºåãããããã«ããã¨ããã§ãããã