The one where feeding a utf8-flagged string to the URI module changed how query_form encodes
The title is long, but it says exactly what happened.
#!/usr/bin/perl
use strict;
use warnings;
use URI;

my $s = 'http://example.com/?q=%82%e2%82%e9%95v%82%c5%8aw%82%d4';
utf8::upgrade($s);
my $uri = URI->new($s);
my %qf = $uri->query_form;
$qf{flag} = 'ON';
$uri->query_form( %qf );
my $uri_str = $uri->as_string;
At first glance you would expect $uri_str to come out as
http://example.com/?q=%82%e2%82%e9%95v%82%c5%8aw%82%d4&flag=ON
but what you actually get is
http://example.com/?q=%C2%82%C3%A2%C2%82%C3%A9%C2%95v%C2%82%C3%85%C2%8Aw%C2%82%C3%94&flag=ON
That is the real result.*1
Here the value of q, "%82%e2%82%e9%95v%82%c5%8aw%82%d4", is a ShiftJIS string.*2
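To see where the extra %C2 and %C3 come from, here is a minimal sketch using only core Perl. The helper pct is a hypothetical name made up for this illustration (it is not part of URI), and the sketch assumes the escaper UTF-8 encodes any flagged string before percent-encoding it, which is what the output above suggests.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode every byte of a byte string.
sub pct { return join '', map { sprintf '%%%02X', ord($_) } split //, $_[0] }

my $bytes = "\x82\xe2";       # the first ShiftJIS character of the q value
my $chars = $bytes;
utf8::upgrade($chars);        # same two characters, now stored with the utf8 flag

my $reencoded = $chars;
utf8::encode($reencoded);     # assumed step: the flagged string is turned into UTF-8 bytes

print pct($bytes),     "\n";  # %82%E2
print pct($reencoded), "\n";  # %C2%82%C3%A2
```

Each byte from 0x80 to 0xFF is read as one Latin-1 character and becomes two UTF-8 bytes, so %82 turns into %C2%82 and %e2 into %C3%A2.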
You may be wondering why I bother calling
utf8::upgrade($s);
in the first place. I wanted to simulate a particular situation, namely:
- A value in some XML holds a URL whose parameter contains percent-encoded ShiftJIS
- That XML is parsed with XML::LibXML and the value is pulled out of it
In that case the value you get back automatically has the utf8 flag set.
In other words: I took a URL obtained through XML::LibXML's DOM parsing, added a parameter to it with the URI module's query_form method, and ended up with a URL I did not expect. These are my notes from looking into that. (End of the preamble.)
my %qf = $uri->query_form;
Dumping this %qf gave the following.
### %qf: {
###        q => "\x{82}\x{e2}\x{82}\x{e9}\x{95}v\x{82}\x{c5}\x{8a}w\x{82}\x{d4}"
###      }
Believe it or not, the value of the q parameter
- holds the ShiftJIS binary values, and
- has the utf8 flag set,
which is a pretty sorry state of affairs.
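A dump alone does not reveal the flag, so it can help to check it directly. The following is a small sketch with core Perl only. The variable names are made up for illustration, and utf8::upgrade stands in for whatever XML::LibXML does to the value.

```perl
use strict;
use warnings;

my $bytes   = "\x82\xe2";     # two ShiftJIS bytes, no utf8 flag
my $flagged = $bytes;
utf8::upgrade($flagged);      # same characters, but the utf8 flag is now on

printf "bytes:   flag=%d length=%d\n", (utf8::is_utf8($bytes)   ? 1 : 0), length($bytes);
printf "flagged: flag=%d length=%d\n", (utf8::is_utf8($flagged) ? 1 : 0), length($flagged);
print  "equal as strings\n" if $bytes eq $flagged;
```

Both strings have length 2 and compare equal, so the only visible difference is what utf8::is_utf8 reports. That is why the problem is so easy to miss.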
Fine, so why not just force utf8::downgrade on that value and turn it back into pure ShiftJIS bytes?
utf8::downgrade($qf{q});
$qf{flag} = 'ON';
$uri->query_form( %qf );
Still no good.
### $uri_str: 'http://example.com/?q=%C2%82%C3%A2%C2%82%C3%A9%C2%95v%C2%82%C3%85%C2%8Aw%C2%82%C3%94&flag=ON'
Then I happened to notice something:
utf8::downgrade($qf{q});
$uri->query_form( q => $qf{q}, flag => 'ON' );
This way it works.
### $uri_str: 'http://example.com/?q=%82%E2%82%E9%95v%82%C5%8Aw%82%D4&flag=ON'
After digging quite deep, I found that in the internal module URI::_query,
- query_form builds the query string with the binary data left as is, and
- query then escapes the whole thing in one go.
That is how it works.
_query.pm :
sub query {
    my $self = shift;
    $$self =~ m,^([^?\#]*)(?:\?([^\#]*))?(.*)$,s or die;

    if (@_) {
        my $q = shift;
        $$self = $1;
        if (defined $q) {
            $q =~ s/([^$URI::uric])/ URI::Escape::escape_char($1)/ego;
            $$self .= "?$q";
        }
        $$self .= $3;
    }
    $2;
}

sub query_form {
    my $self = shift;
    my $old = $self->query;
    if (@_) {
        # (omitted)...
        my @query;
        while (my($key,$vals) = splice(@_, 0, 2)) {
            $key = '' unless defined $key;
            $key =~ s/([;\/?:@&=+,\$\[\]%])/ URI::Escape::escape_char($1)/eg;
            $key =~ s/ /+/g;
            $vals = [ref($vals) eq "ARRAY" ? @$vals : $vals];
            for my $val (@$vals) {
                $val = '' unless defined $val;
                $val =~ s/([;\/?:@&=+,\$\[\]%])/ URI::Escape::escape_char($1)/eg;
                $val =~ s/ /+/g;
                push(@query, "$key=$val");
            }
        }
        if (@query) {
            unless ($delim) {
                $delim = $1 if $old && $old =~ /([&;])/;
                $delim ||= $URI::DEFAULT_QUERY_FORM_DELIMITER || "&";
            }
            $self->query(join($delim, @query));
        }
        else {
            $self->query(undef);
        }
    }
    return if !defined($old) || !length($old) || !defined(wantarray);
    return unless $old =~ /=/; # not a form
    map { s/\+/ /g; uri_unescape($_) }
        map { /=/ ? split(/=/, $_, 2) : ($_ => '')} split(/[&;]/, $old);
}
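So what goes wrong is this: the escaping happens after the strings have been concatenated, so if even just a key is a utf8-flagged string, the concatenated string ends up with the utf8 flag set. Judging from the output, every character from \x80 up in that flagged string is then escaped as its UTF-8 bytes, which is where %C2%82 comes from.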
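The flag spreading through concatenation can be checked with core Perl alone. This is only a sketch: the variables are made up for illustration, and utf8::upgrade on the key stands in for a key that came back flagged from query_form.

```perl
use strict;
use warnings;

my $key = 'q';
utf8::upgrade($key);          # stand-in for a key that carries the utf8 flag
my $val = "\x82\xe2";         # plain ShiftJIS bytes, no utf8 flag

# Same shape as what query_form does: build "key=value" pairs, then join them.
my $joined = join '&', "$key=$val", 'flag=ON';

print "val:    ", (utf8::is_utf8($val)    ? "flagged" : "not flagged"), "\n";
print "joined: ", (utf8::is_utf8($joined) ? "flagged" : "not flagged"), "\n";
```

The value itself is still unflagged, but the joined string is flagged because one of its parts was. Downgrading only the value is therefore not enough.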
In the end, as long as you utf8::downgrade the value at the point where you get it from XML::LibXML, there is no problem.
my $s = $xml->findvalue('.');
utf8::downgrade($s);
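If you want that downgrade to fail loudly when the value turns out to contain real wide characters, a small wrapper is one option. This is a sketch under assumptions: to_bytes is a hypothetical helper name (not part of URI or XML::LibXML), and utf8::upgrade simulates the flagged value the parser would return.

```perl
use strict;
use warnings;

# Hypothetical helper: return an unflagged copy, or die if that is impossible.
sub to_bytes {
    my ($s) = @_;
    # With a true second argument, utf8::downgrade returns false instead of dying.
    utf8::downgrade($s, 1)
        or die "value contains characters above 0xFF and cannot be raw bytes\n";
    return $s;
}

my $from_xml = "\x82\xe2";
utf8::upgrade($from_xml);     # simulate a flagged value coming from the parser

my $clean = to_bytes($from_xml);
print "clean: ", (utf8::is_utf8($clean) ? "still flagged" : "plain bytes"), "\n";
```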
...simple enough once you notice it.
By the way, there is a spot where I do
$qf{flag} = 'ON';
and if use utf8; is in effect there, this line alone is enough to cause the same symptom. So if you are using use utf8;, you have to write
{
    no utf8;
    $qf{flag} = 'ON';
}
or it will not work.
Hmm, I still do not really get the Unicode side of Perl...
Thinking about it that way, the smart move may be to simplify everything to one rule: basically always use utf8; and internally treat everything as utf8-flagged strings. Although the utf8 pod does say not to use utf8; when you basically do not need it.
Addendum
From the Hatena Bookmark comments:
id:nihen:
utf8::downgrade is fine in this case as long as the data is guaranteed to be ASCII, but semantically it is the same as Encode::encode('latin-1', $s), so I think utf8::encode would be better...
The post got muddled around the part where %qf holds a ShiftJIS string with the utf8 flag on (in that spot utf8::encode would garble the value, so it has to be utf8::downgrade), but semantically utf8::encode is indeed the more appropriate choice. Thanks for pointing that out.
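The difference between the two calls on this particular data can be seen in a short sketch. It uses only core Perl and the core Encode module. The variable names are made up for illustration, and utf8::upgrade again simulates the flagged ShiftJIS value.

```perl
use strict;
use warnings;
use Encode ();

my $flagged = "\x82\xe2";
utf8::upgrade($flagged);                  # ShiftJIS bytes carrying the utf8 flag

my $down = $flagged;
utf8::downgrade($down);                   # back to the original two bytes

my $enc = $flagged;
utf8::encode($enc);                       # UTF-8 encoding of U+0082 and U+00E2

my $latin1 = Encode::encode('iso-8859-1', $flagged);

print "downgrade: ", unpack('H*', $down),   "\n";   # 82e2
print "encode:    ", unpack('H*', $enc),    "\n";   # c282c3a2
print "latin-1:   ", unpack('H*', $latin1), "\n";   # 82e2
```

For this value, utf8::downgrade and the Latin-1 encode give back the original ShiftJIS bytes, while utf8::encode produces the doubled bytes seen in the broken URL.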
id:tokuhirom:
I cannot find any statement saying not to use utf8 when you do not need it.
From the utf8 pod:
Do not use this pragma for anything other than telling Perl that your script is written in UTF-8.
In fact, unless you really are writing UTF-8 source code, you should not use utf8.
Reading those passages, I took them to mean "do not use utf8 unless you intend to handle non-latin-1 strings as Unicode". Have I misunderstood something?