When it comes to regular expressions, the advice that /a|b|c/ (alternation) should be written as [abc] (character class) is common knowledge, and not just in perl.

qootas.org/blog - perl regex performance
Regular expressions that use "|" (pipe) are terribly slow, so avoid them: that is the gist. Sure enough, a benchmark shows a 32x difference in speed.

But haven't you ever wished that Perl itself would turn /a|b|c/ into [abc] for you?

At the very least, anyone who uses regular expressions for work should keep this "owl book" within reach, whether in the original or in Japanese translation. That a Japanese translation is available is a welcome thing too.

Even with that knowledge, getting as far as automatically optimizing regular expressions is another matter. Readers who have already read the owl book closely should find this entry worth a look as well.

It's fate. Doesn't perl always handle regular expressions with an NFA?
If so, the slowdown is the well-known NFA story, isn't it?

It is partly a matter of NFA versus DFA, but not only that: an alternation also has to handle cases like /foo|bar|baz/, so the compiled code inevitably ends up more complex.

Look closely, though, and /foo|bar|baz/ can be rewritten as ba[rz]|foo. This is regular expression optimization, and CPAN already has several modules that do it for you, including my own Regexp::Optimizer, Regexp::Trie, and Regexp::Assemble. For complex regular expressions it pays to use them, but wouldn't it be even nicer if Perl did this internally?
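
The rewrite from foo|bar|baz to ba[rz]|foo can be sketched with a trie: shared prefixes are merged, and branches that are a single character collapse into a character class. The following is a minimal sketch in Ruby (to match the benchmark script further down), modeled loosely on the approach Regexp::Trie takes. The helper names trie_add and trie_to_source are made up for this illustration and are not the API of any of the modules above.

```ruby
# Hypothetical helpers for illustration; not the API of any CPAN module.

# Add a word to a nested-hash trie. The key "" marks the end of a word.
def trie_add(trie, word)
  node = trie
  word.each_char { |ch| node = (node[ch] ||= {}) }
  node[""] = true
  trie
end

# Turn a trie node into regex source text.
# Returns nil for a node that only marks the end of a word.
def trie_to_source(node)
  return nil if node.size == 1 && node.key?("")

  alts = []        # branches that continue for more than one character
  chars = []       # single-character branches, merged into one class
  optional = false # true when a word may also end at this node
  node.keys.sort.each do |ch|
    if ch == ""
      optional = true
      next
    end
    rest = trie_to_source(node[ch])
    if rest
      alts << Regexp.escape(ch) + rest
    else
      chars << Regexp.escape(ch)
    end
  end

  only_class = alts.empty?
  if chars.size == 1
    alts << chars[0]
  elsif chars.size > 1
    alts << "[" + chars.join + "]"
  end

  source = alts.size == 1 ? alts[0] : "(?:" + alts.join("|") + ")"
  if optional
    source = only_class ? source + "?" : "(?:" + source + ")?"
  end
  source
end

trie = {}
%w[foo bar baz].each { |w| trie_add(trie, w) }
puts trie_to_source(trie)   # prints (?:ba[rz]|foo)
```

Fed the eight single letters s through z, the same sketch produces [stuvwxyz], which is exactly the hand optimization discussed at the top of this entry.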

Before that, let's take stock of where things stand. This time I am benchmarking ruby as well, so the script departs a little from my usual perl benchmarking style. I am not very used to ruby, so I matched the ruby way of doing things.

use strict;
use warnings;
use Benchmark ':all';

my $re_alt    = qr/s|t|u|v|w|x|y|z/;
my $re_cclass = qr/[stuvwxyz]/;
my $alphabets = 'abcdefghijklmnopqrstuvwxyz';
my $digits    = '01234567890123456789012345';

my $ntimes = 100_000;

cmpthese(
    timethese(
        0,
        {
            alt    => sub { $alphabets =~ $re_alt    for ( 1 .. $ntimes ) },
            cclass => sub { $alphabets =~ $re_cclass for ( 1 .. $ntimes ) },
        }
    )
);
cmpthese(
    timethese(
        0,
        {
            alt    => sub { $digits =~ $re_alt    for ( 1 .. $ntimes ) },
            cclass => sub { $digits =~ $re_cclass for ( 1 .. $ntimes ) },
        }
    )
);
Here are the results with perl 5.8.8 on Mac OS X v10.4 (MacBook Pro 2.0GHz):
% perl alt_vs_cclass.pl 
Benchmark: running alt, cclass for at least 3 CPU seconds...
       alt:  4 wallclock secs ( 3.61 usr +  0.01 sys =  3.62 CPU) @  1.10/s (n=4)
    cclass:  3 wallclock secs ( 3.07 usr +  0.01 sys =  3.08 CPU) @ 12.99/s (n=40)
         Rate    alt cclass
alt    1.10/s     --   -91%
cclass 13.0/s  1075%     --
Benchmark: running alt, cclass for at least 3 CPU seconds...
       alt:  4 wallclock secs ( 3.68 usr +  0.02 sys =  3.70 CPU) @  0.81/s (n=3)
            (warning: too few iterations for a reliable count)
    cclass:  3 wallclock secs ( 3.05 usr +  0.01 sys =  3.06 CPU) @ 13.07/s (n=40)
         s/iter    alt cclass
alt        1.23     --   -94%
cclass 7.65e-02  1512%     --

Next, ruby. Hm, so with ruby's benchmark module you specify the number of repetitions yourself. A little disappointing. Still, the code does come out clean, as you'd expect.

require 'benchmark'

re_alt    = Regexp.compile('s|t|u|v|w|x|y|z')
re_cclass = Regexp.compile('[stuvwxyz]')
alphabets = 'abcdefghijklmnopqrstuvwxyz'
digits    = '01234567890123456789012345'

ntimes = 100_000
Benchmark.bm(8) do |x|
    x.report('alt')    { ntimes.times{ re_alt.match(alphabets) } }
    x.report('cclass') { ntimes.times{ re_cclass.match(alphabets) } }
end
Benchmark.bm(8) do |x|
    x.report('alt')    { ntimes.times{ re_alt.match(digits) } }
    x.report('cclass') { ntimes.times{ re_cclass.match(digits) } }
end
The ruby version is 1.8.4. The environment is the same as for perl.
              user     system      total        real
alt       0.260000   0.000000   0.260000 (  0.269878)
cclass    0.200000   0.000000   0.200000 (  0.207589)
              user     system      total        real
alt       0.120000   0.000000   0.120000 (  0.124052)
cclass    0.080000   0.000000   0.080000 (  0.080326)

Ruby is slower, if you want to call it that, but the gap between alternation and character class is small, especially in the first case, where the match succeeds.

So much for the preamble.

In bleedperl, currently under development, TRIE Optimization has finally been implemented.
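
For readers new to the term, the idea behind a trie can be sketched roughly as follows: the alternatives are merged into one tree keyed by character, so the input is walked once instead of each alternative being tried in turn. This is only an illustration in Ruby with made-up helper names (build_trie, trie_match_at, trie_search), and it makes no claim about how perl implements the optimization internally.

```ruby
# Hypothetical helpers for illustration only; not perl's internal implementation.

# Build a trie as nested hashes; the :end key marks a complete word.
def build_trie(words)
  root = {}
  words.each do |word|
    node = root
    word.each_char { |ch| node = (node[ch] ||= {}) }
    node[:end] = true
  end
  root
end

# Does any word in the trie start at text[start]?
# Each input character is looked up once; no alternative is retried.
def trie_match_at(trie, text, start)
  node = trie
  (start...text.length).each do |i|
    node = node[text[i]]
    return false if node.nil?   # no word continues with this character
    return true if node[:end]   # a complete word has been consumed
  end
  false
end

# Index of the first position where some word matches, or nil.
def trie_search(trie, text)
  (0...text.length).find { |i| trie_match_at(trie, text, i) }
end

trie = build_trie(%w[foo bar baz])
p trie_search(trie, "xxbaz")   # 2
p trie_search(trie, "qux")     # nil
```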

��Ƥߤޤ��礦��

% ~/bleedperl/bin/perl5.9.4 alt_vs_cclass.pl
Benchmark: running alt, cclass for at least 3 CPU seconds...
       alt:  4 wallclock secs ( 3.00 usr +  0.01 sys =  3.01 CPU) @  3.99/s (n=12)
    cclass:  4 wallclock secs ( 3.09 usr +  0.01 sys =  3.10 CPU) @ 11.61/s (n=36)
         Rate    alt cclass
alt    3.99/s     --   -66%
cclass 11.6/s   191%     --
Benchmark: running alt, cclass for at least 3 CPU seconds...
       alt:  3 wallclock secs ( 3.02 usr +  0.01 sys =  3.03 CPU) @  8.25/s (n=25)
    cclass:  3 wallclock secs ( 3.04 usr +  0.01 sys =  3.05 CPU) @ 13.11/s (n=40)
         Rate    alt cclass
alt    8.25/s     --   -37%
cclass 13.1/s    59%     --
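
To put the two runs side by side, the ratios can be worked out from the rates printed in the Benchmark output above. A small Ruby sketch follows; the rates are copied from that output, while the hash layout and the helper names gap and alt_speedup are made up for this illustration.

```ruby
# Rates (iterations per second) copied from the Benchmark output above.
RATES = {
  "5.8.8" => { alphabets: { alt: 1.10, cclass: 12.99 },
               digits:    { alt: 0.81, cclass: 13.07 } },
  "5.9.4" => { alphabets: { alt: 3.99, cclass: 11.61 },
               digits:    { alt: 8.25, cclass: 13.11 } },
}

# How many times faster the character class is than the alternation.
def gap(rates)
  (rates[:cclass] / rates[:alt]).round(1)
end

# How much the alternation itself sped up from 5.8.8 to 5.9.4.
def alt_speedup(input)
  (RATES["5.9.4"][input][:alt] / RATES["5.8.8"][input][:alt]).round(1)
end

[:alphabets, :digits].each do |input|
  old_gap = gap(RATES["5.8.8"][input])
  new_gap = gap(RATES["5.9.4"][input])
  puts "#{input}: cclass/alt gap #{old_gap}x -> #{new_gap}x, " \
       "alt itself #{alt_speedup(input)}x faster"
end
```

By this arithmetic the gap shrinks from 11.8x to 2.9x when the string matches and from 16.1x to 1.6x when it does not, with the alternation itself running 3.6x and 10.2x faster respectively.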

How about that?

Dan the Regular Expressionist