How I made CRuby 8% faster by replacing Onigmo's interpreter with direct threaded code
I sped up Onigmo (鬼雲), which is also used as the regular expression engine of the Ruby language, so here is how it went.
Onigmo matches regular expressions by executing an NFA on a bytecode interpreter. A long-known technique for speeding up bytecode interpreters is direct threaded code. Compared with an interpreter implemented with switch-case, it can be expected to run faster because, among other things, it eliminates some of the jumps on the dispatch path. In fact, Onigmo itself was sped up by moving from switch-case dispatch to this direct threaded code.
... or so I thought, but the implementation turned out to be token threaded code, not direct threaded code. (ref: Rename USE_DIRECT_THREADED_VM to USE_TOKEN_THREADED_VM · k-takata/Onigmo@be63582 · GitHub)
Token threaded code is plenty fast compared with switch-case, but there is still room to speed up the dispatch of bytecode instructions (read the bytecode, look up the address of the corresponding instruction's implementation, and jump). With direct threaded code, dispatch jumps to an address that was written into the bytecode in advance, so it is faster by one load.
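    // Token threaded code: load the opcode, look up the handler address, then jump
    void *labels[] = { ... };
    uint8_t *opcode = ...;
    goto *labels[*opcode];

    // Direct threaded code: the bytecode already holds the handler address
    void *labels[] = { ... };
    uint8_t *opcode = ...;
    goto **(void **)opcode;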
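To make the difference concrete, here is a minimal runnable sketch of both dispatch styles. It relies on the GNU C "labels as values" extension (GCC and Clang), and the three-instruction VM, the opcode names, `MAX_CODE`, and both function names are made up for illustration; none of this is Onigmo's actual code. For brevity the direct threaded version translates the opcodes on every call, which a real engine would do once per compiled pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical three-instruction VM (not Onigmo's opcode set). */
enum { OP_INC, OP_DOUBLE, OP_HALT, OP_COUNT };
enum { MAX_CODE = 64 };

/* Token threaded: each dispatch loads the 1-byte opcode, loads the handler
 * address from the label table, then jumps. The code must contain only
 * valid opcodes and end with OP_HALT. */
static int run_token_threaded(const uint8_t *pc)
{
    static void *const labels[OP_COUNT] = {
        [OP_INC] = &&op_inc, [OP_DOUBLE] = &&op_double, [OP_HALT] = &&op_halt
    };
    int acc = 0;

    goto *labels[*pc++];
op_inc:
    acc += 1;
    goto *labels[*pc++];
op_double:
    acc *= 2;
    goto *labels[*pc++];
op_halt:
    return acc;
}

/* Direct threaded: the code stream holds handler addresses, so each
 * dispatch is a single load followed by a jump. */
static int run_direct_threaded(const uint8_t *code, size_t len)
{
    static void *const labels[OP_COUNT] = {
        [OP_INC] = &&op_inc, [OP_DOUBLE] = &&op_double, [OP_HALT] = &&op_halt
    };
    void *threaded[MAX_CODE];
    void **pc = threaded;
    int acc = 0;

    /* Translate 1-byte opcodes into pointer-sized handler addresses. */
    assert(len > 0 && len <= MAX_CODE);
    for (size_t i = 0; i < len; i++) {
        assert(code[i] < OP_COUNT);
        threaded[i] = labels[code[i]];
    }

    goto *(*pc++);
op_inc:
    acc += 1;
    goto *(*pc++);
op_double:
    acc *= 2;
    goto *(*pc++);
op_halt:
    return acc;
}
```

Calling either function on the program `OP_INC, OP_DOUBLE, OP_INC, OP_DOUBLE, OP_HALT` returns 6. The only difference is how the next handler address is obtained.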
So I went one step further and replaced the token threaded code with direct threaded code, which gave an additional speedup.
That is easy to say, but I ran into a slightly annoying problem when switching to direct threaded code. Unlike switch-case dispatch or token threaded code, direct threaded code needs addresses, not opcodes, written into the bytecode. You might think you could simply write the address into the slot that holds the opcode, but an Onigmo opcode is 1 byte while an address is 8 bytes on 64-bit Linux, so it does not come close to fitting. (So switch-case and token threaded code do have an advantage when saving memory is the goal.) This may be hard to apply in environments with tight memory constraints, such as embedded systems. Then again, wherever there is memory to spare, it is probably not a problem.
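The size mismatch can be sketched as a separate translation pass that widens each 1-byte opcode into a pointer-sized slot. This is an illustration under assumptions, not the approach taken in the actual Onigmo change: `widen_bytecode`, `token_code_bytes`, and `direct_code_bytes` are hypothetical names, and the sketch ignores instruction operands, which a real bytecode format also has to carry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Rewrite a stream of 1-byte opcodes into pointer-sized handler addresses.
 * `handlers` maps opcode -> handler address. Returns a malloc'ed array of
 * `len` slots that the caller must free, or NULL on allocation failure. */
static void **widen_bytecode(const uint8_t *code, size_t len,
                             void *const *handlers, size_t n_handlers)
{
    void **out = malloc(len * sizeof *out);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++) {
        assert(code[i] < n_handlers);   /* reject unknown opcodes */
        out[i] = handlers[code[i]];
    }
    return out;
}

/* Memory needed for `len` instructions in each representation. */
static size_t token_code_bytes(size_t len)  { return len * sizeof(uint8_t); }
static size_t direct_code_bytes(size_t len) { return len * sizeof(void *); }
```

On a typical 64-bit platform `sizeof(void *)` is 8, so the opcode stream grows eightfold, which is the memory cost described above.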
I measured before and after on my M1 MacBook using "Performance comparison of regular expression engines". The results are as follows.
Pattern | Current | Proposal | Improve Rate |
---|---|---|---|
`Twain` | 15 ms | 15 ms | 0% |
`(?i)Twain` | 60 ms | 18 ms | 233% |
`[a-z]shing` | 14 ms | 13 ms | 7.6% |
`Huck[a-zA-Z]+\|Saw[a-zA-Z]+` | 20 ms | 16 ms | 25% |
`\b\w+nn\b` | 394 ms | 414 ms | -5.1% |
`[a-q][^u-z]{13}x` | 20 ms | 7 ms | 185% |
`Tom\|Sawyer\|Huckleberry\|Finn` | 25 ms | 19 ms | 31% |
`(?i)Tom\|Sawyer\|Huckleberry\|Finn` | 236 ms | 179 ms | 34% |
`.{0,2}(Tom\|Sawyer\|Huckleberry\|Finn)` | 48 ms | 38 ms | 26% |
`.{2,4}(Tom\|Sawyer\|Huckleberry\|Finn)` | 46 ms | 44 ms | 4.5% |
`Tom.{10,25}river\|river.{10,25}Tom` | 42 ms | 41 ms | 2.4% |
`[a-zA-Z]+ing` | 509 ms | 410 ms | 24% |
`\s[a-zA-Z]{0,12}ing\s` | 48 ms | 44 ms | 9% |
`([A-Za-z]awyer\|[A-Za-z]inn)\s` | 101 ms | 99 ms | 2% |
`["'][^"']{0,30}[?!.]["']` | 44 ms | 39 ms | 12% |
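
The rightmost column appears to be the ratio of the two timings minus one. That reading is an assumption inferred from the rows rather than something stated with the measurements, and because the timings are rounded to whole milliseconds it will not reproduce every row exactly. Writing $t_{\text{current}}$ and $t_{\text{proposal}}$ for the two timing columns, the `(?i)Twain` row works out as:

```latex
\text{rate}
  = \left( \frac{t_{\text{current}}}{t_{\text{proposal}}} - 1 \right) \times 100\%
  = \left( \frac{60}{18} - 1 \right) \times 100\%
  \approx 233\%
```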
While I was at it, I also checked how much faster CRuby becomes with this change applied. I did not have a handy benchmark application, so I used regex-redux from the Computer Language Benchmarks Game (benchmarksgame-team.pages.debian.net).
Before:
    $ time time ./ruby -W0 regex-redux.rb 0 < a.txt
    agggtaaa|tttaccct 356
    [cgt]gggtaaa|tttaccc[acg] 1250
    a[act]ggtaaa|tttacc[agt]t 4252
    ag[act]gtaaa|tttac[agt]ct 2894
    agg[act]taaa|ttta[agt]cct 5435
    aggg[acg]aaa|ttt[cgt]ccct 1537
    agggt[cgt]aa|tt[acg]accct 1431
    agggta[cgt]a|t[acg]taccct 1608
    agggtaa[cgt]|[acg]ttaccct 2178
    50833411
    50000000
    27388361
    4.47 real 13.75 user 0.51 sys
    time ./ruby -W0 regex-redux.rb 0 < a.txt 13.76s user 0.52s system 318% cpu 4.475 total
After:
    $ time time ./ruby -W0 regex-redux.rb 0 < a.txt
    agggtaaa|tttaccct 356
    [cgt]gggtaaa|tttaccc[acg] 1250
    a[act]ggtaaa|tttacc[agt]t 4252
    ag[act]gtaaa|tttac[agt]ct 2894
    agg[act]taaa|ttta[agt]cct 5435
    aggg[acg]aaa|ttt[cgt]ccct 1537
    agggt[cgt]aa|tt[acg]accct 1431
    agggta[cgt]a|t[acg]taccct 1608
    agggtaa[cgt]|[acg]ttaccct 2178
    50833411
    50000000
    27388361
    4.26 real 12.62 user 0.48 sys
    time ./ruby -W0 regex-redux.rb 0 < a.txt 12.62s user 0.48s system 306% cpu 4.270 total
The environment is the same M1 Mac. User time went from 13.75 s to 12.62 s, a speedup of roughly 8.9%. Not bad.

As for the Onigmo pull request, so far I have only tested regular expression patterns that pass on my own Mac, so I do not know whether it will be merged, but I would be happy to see it merged soon.