Previously
Ruby Parser Development Diary (9): I Presented at RubyKaigi 2023 ~ The World Is Truly in the "Great Parser Era" ~ - かねこにっき
I went to RubyKaigi. The slides and the video of the talk are available from the links in that post, so please take a look.
The parse.y refactoring challenge
If you have read "parse.y", you have probably been surprised at some point to run into functions and macros such as rb_obj_hide, rb_builtin_class_name, and RB_OBJ_WRITTEN. Ruby's parser relies heavily on features of Ruby itself, such as the GC and Ruby Objects. These features have been refined over Ruby's long history, but they have to be used according to certain conventions. For example, forgetting to write RB_OBJ_WRITE or RB_OBJ_WRITTEN and then having the GC collect a still-needed object at an unexpected moment is a bug everyone hits at least once.
In addition, Ruby's AST Node (struct RNode) manages many kinds of Node through unions, so you have probably been at a loss over which field of which Node has which type.
Several preparatory patches have landed in ruby/ruby, so I am now going to refactor "parse.y" in earnest to improve this situation.
Refactoring ideas
Graduating literal objects from Ruby objects
Literals such as 1 and "abc" are converted into Ruby objects inside the parser. As long as that is the case, the parser needs functions such as the one that converts a C int into a Ruby Integer (Fixnum), and as long as it handles Ruby objects it cannot get away from the GC either. A literal should instead be held in the Node as a String such as "123", and the conversion to a Ruby object should move outside "parse.y" (for example into "compile.c").
Currently:
    # +- nd_body:
    #     @ NODE_LIT (id: 0, line: 1, location: (1,0)-(1,3))*
    #     +- nd_lit: 123 (this is a CRuby Integer object)
This will become:
    # +- nd_body:
    #     @ NODE_LIT (id: 0, line: 1, location: (1,0)-(1,3))*
    #     +- nd_lit: "123" (char *)
    #        len: 3
    #        type: Integer
This applies to Fixnum, Bignum, Float, Rational, Complex, String, and so on.
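A minimal C sketch of this idea follows. The names `parser_literal`, `literal_type`, `literal_new`, and `literal_to_long` are hypothetical and do not come from ruby/ruby. The parser side records only the source text, its length, and a type tag; a later stage does the conversion. Here `strtol` merely stands in for whatever code in "compile.c" would create the actual Ruby object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type tags; a real set would cover Fixnum, Bignum, Float, ... */
enum literal_type { LIT_INTEGER, LIT_FLOAT, LIT_STRING };

/* What the parser stores: raw text plus a tag. No Ruby object, no GC. */
struct parser_literal {
    const char *val;            /* points into the source, not NUL-terminated */
    size_t len;                 /* number of bytes that belong to the literal */
    enum literal_type type;
};

/* Parser side: just remember where the literal text is. */
static struct parser_literal
literal_new(const char *src, size_t len, enum literal_type type)
{
    struct parser_literal lit = { src, len, type };
    return lit;
}

/* Compiler side (stand-in for compile.c): convert only when the value is needed.
 * This sketch handles short decimal integers only. */
static long
literal_to_long(const struct parser_literal *lit)
{
    char buf[32];

    assert(lit->type == LIT_INTEGER);
    assert(lit->len < sizeof(buf));
    memcpy(buf, lit->val, lit->len);
    buf[lit->len] = '\0';
    return strtol(buf, NULL, 10);
}
```

With this split, the parser never needs to know how an Integer is represented at run time; only the consumer of the AST does.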
Incidentally, although I wrote "a String such as "123"", this will not be a Ruby String either: the parser will get its own String struct and related functions.[1]
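As a rough illustration of what such a parser-owned string could look like, here is a sketch in C. Everything in it (`parser_string`, `parser_encoding`, the function names) is an assumption made for illustration, not the struct that ruby/ruby defines. It carries an encoding tag because, as the footnote points out, the encoding has to be known in order to map the string to a CRuby Encoding later.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical encoding tag, to be mapped to a CRuby Encoding outside the parser. */
enum parser_encoding { PARSER_ENC_UTF8, PARSER_ENC_US_ASCII, PARSER_ENC_BINARY };

struct parser_string {
    char *ptr;                  /* owned buffer, NUL-terminated for convenience */
    size_t len;                 /* length in bytes, excluding the terminator */
    enum parser_encoding enc;
};

/* Copy len bytes of src into a newly allocated parser string.
 * Returns NULL if allocation fails. */
static struct parser_string *
parser_string_new(const char *src, size_t len, enum parser_encoding enc)
{
    struct parser_string *str;

    assert(src != NULL);
    str = malloc(sizeof(*str));
    if (str == NULL) return NULL;
    str->ptr = malloc(len + 1);
    if (str->ptr == NULL) {
        free(str);
        return NULL;
    }
    memcpy(str->ptr, src, len);
    str->ptr[len] = '\0';
    str->len = len;
    str->enc = enc;
    return str;
}

/* Release a string created by parser_string_new. NULL is accepted. */
static void
parser_string_free(struct parser_string *str)
{
    if (str == NULL) return;
    free(str->ptr);
    free(str);
}
```

Because the struct is freed explicitly, nothing here needs a write barrier or any other GC convention.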
Turning Nodes from a single union into separate structs
As mentioned at the beginning, Ruby's AST Node (struct RNode) manages many kinds of Node through unions. This makes it hard to understand how each kind of Node maps to the types of its fields (u1, u2, u3).
For example, in NODE_CALL, which represents a method call, the receiver is stored in u1 as an RNode, the method id in u2 as an ID, and the arguments in u3 as an RNode.
Many people probably read the node_dump function or the node_children function when they need to check this mapping.
I will drop this implementation and define a separate struct for each kind of Node.
    // In the current implementation there are three unions, and it is hard to
    // tell the actual type of each field from this struct alone
    typedef struct RNode {
        VALUE flags;
        union {
            struct RNode *node;
            ID id;
            VALUE value;
            rb_ast_id_table_t *tbl;
        } u1;
        union {
            struct RNode *node;
            ID id;
            long argc;
            VALUE value;
        } u2;
        union {
            struct RNode *node;
            ID id;
            long state;
            struct rb_args_info *args;
            struct rb_ary_pattern_info *apinfo;
            struct rb_fnd_pattern_info *fpinfo;
            VALUE value;
        } u3;
        rb_code_location_t nd_loc;
        int node_id;
    } NODE;
    ...
    #define nd_recv u1.node
    #define nd_mid u2.id
    #define nd_args u3.node
    ...
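For comparison, a per-kind layout could look roughly like the sketch below. This is only an illustration under assumptions: the struct names (`node`, `node_call`, `node_lit`) and the helper `node_as_call` are hypothetical, and `ID` is typedef'd locally as a stand-in so the fragment compiles without CRuby headers. The point is that each field has exactly one name and one type, so the struct definition itself documents the mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for CRuby's ID type so the sketch is self-contained. */
typedef uintptr_t ID;

enum node_type { NODE_CALL, NODE_LIT };

/* Common header shared by every node kind. */
struct node {
    enum node_type type;
    int node_id;
};

/* One struct per node kind, with typed and named fields. */
struct node_call {
    struct node header;     /* first member, so a node_call can be viewed as a node */
    struct node *nd_recv;   /* receiver */
    ID nd_mid;              /* method id */
    struct node *nd_args;   /* arguments, NULL when there are none */
};

struct node_lit {
    struct node header;
    const char *text;       /* literal source text */
};

/* Checked downcast from the generic header to the concrete struct. */
static struct node_call *
node_as_call(struct node *n)
{
    assert(n != NULL && n->type == NODE_CALL);
    return (struct node_call *)n;
}
```

A reader of `struct node_call` no longer has to consult a dump function to learn that the receiver is a node and the method name is an ID.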
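Removing parser-level optimizations and AST-transforming logic
Probably as a holdover from before CRuby's evaluator became a VM, the parser performs small optimizations internally.
For example, a line in the middle of a body that contains nothing but a literal has no effect on the result of the program. Such lines are optimized away at the parser level and do not remain in the AST (see the block_append function).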
    def m
      :a
      :b
    end

    # :a is nowhere to be found
    # @ NODE_SCOPE (id: 6, line: 1, location: (1,0)-(4,3))
    # +- nd_tbl: (empty)
    # +- nd_args:
    # |   (null node)
    # +- nd_body:
    #     @ NODE_DEFN (id: 1, line: 1, location: (1,0)-(4,3))*
    #     +- nd_mid: :m
    #     +- nd_defn:
    #         @ NODE_SCOPE (id: 5, line: 4, location: (1,0)-(4,3))
    #         +- nd_tbl: (empty)
    #         +- nd_args:
    #         |   @ NODE_ARGS (id: 2, line: 1, location: (1,5)-(1,5))
    #         |   +- nd_ainfo->pre_args_num: 0
    #         |   +- nd_ainfo->pre_init:
    #         |   |   (null node)
    #         |   +- nd_ainfo->post_args_num: 0
    #         |   +- nd_ainfo->post_init:
    #         |   |   (null node)
    #         |   +- nd_ainfo->first_post_arg: (null)
    #         |   +- nd_ainfo->rest_arg: (null)
    #         |   +- nd_ainfo->block_arg: (null)
    #         |   +- nd_ainfo->opt_args:
    #         |   |   (null node)
    #         |   +- nd_ainfo->kw_args:
    #         |   |   (null node)
    #         |   +- nd_ainfo->kw_rest_arg:
    #         |       (null node)
    #         +- nd_body:
    #             @ NODE_LIT (id: 4, line: 3, location: (3,2)-(3,4))*
    #             +- nd_lit: :b
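To make the effect concrete, here is a deliberately simplified model of this kind of optimization in C. It is not the code of the real block_append in "parse.y"; the `stmt` struct, the `stmt_kind` enum, and the function `append_stmt` are hypothetical. The model only captures the behavior visible in the dump: when a body so far consists of a single bare literal and another statement follows, the literal is dropped.

```c
#include <assert.h>
#include <stddef.h>

enum stmt_kind { STMT_LITERAL, STMT_CALL };

struct stmt {
    enum stmt_kind kind;
    const char *text;       /* source text, for illustration only */
    struct stmt *next;      /* next statement in the same body */
};

/* Append tail to the body that starts at head and return the new body. */
static struct stmt *
append_stmt(struct stmt *head, struct stmt *tail)
{
    struct stmt *last;

    if (tail == NULL) return head;
    if (head == NULL) return tail;

    /* The body so far is one bare literal whose value is never used: drop it. */
    if (head->kind == STMT_LITERAL && head->next == NULL) return tail;

    for (last = head; last->next != NULL; last = last->next)
        ;
    last->next = tail;
    return head;
}
```

Feeding `:a` and then `:b` through `append_stmt` leaves a body that contains only `:b`, which mirrors what the dump shows. If the parser stopped doing this, both statements would stay in the AST and a later stage could decide what to skip.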
Consecutive string literals are concatenated into a single string, and this concatenation is also performed as a parser-level optimization (see the literal_concat function).
    "a" "b"

    # This is "ab" rather than "a" "b"
    # @ NODE_SCOPE (id: 2, line: 1, location: (1,0)-(1,7))
    # +- nd_tbl: (empty)
    # +- nd_args:
    # |   (null node)
    # +- nd_body:
    #     @ NODE_STR (id: 0, line: 1, location: (1,0)-(1,3))*
    #     +- nd_lit: "ab"
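If the concatenation moved out of the parser, the AST could keep the adjacent parts as written and a later stage could join them. The sketch below shows that joining step in C under assumptions: `str_part` and `join_parts` are hypothetical names, plain NUL-terminated C strings stand in for the parser's string type, and encodings are ignored.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Adjacent string literals kept as a list, in source order. */
struct str_part {
    const char *text;
    const struct str_part *next;
};

/* Join all parts into one newly allocated string. The caller frees the result.
 * Returns NULL if allocation fails. */
static char *
join_parts(const struct str_part *parts)
{
    const struct str_part *p;
    size_t total = 0;
    size_t offset = 0;
    char *joined;

    assert(parts != NULL);
    for (p = parts; p != NULL; p = p->next)
        total += strlen(p->text);

    joined = malloc(total + 1);
    if (joined == NULL) return NULL;

    for (p = parts; p != NULL; p = p->next) {
        size_t len = strlen(p->text);
        memcpy(joined + offset, p->text, len);
        offset += len;
    }
    joined[offset] = '\0';
    return joined;
}
```

With this arrangement a tool that reads the AST still sees "a" and "b" as separate literals with their own locations, while the compiled result is the same "ab".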
The shareable_constant_value magic comment is also implemented by adding Nodes to the AST (see around nd_mid: :make_shareable). This kind of implementation will move outside "parse.y", for example into "compile.c".
    # shareable_constant_value: experimental_everything
    FOO = Set[1, 2, {foo: []}]

    $ ruby -v --dump=p test.rb
    ruby 3.2.0 (2022-12-25 revision a528908271) [arm64-darwin21]
    ...
    # +- nd_body:
    #     @ NODE_CDECL (id: 0, line: 2, location: (2,0)-(2,26))*
    #     +- nd_vid: :FOO
    #     +- nd_else: not used
    #     +- nd_value:
    #         @ NODE_CALL (id: 15, line: 2, location: (2,0)-(2,26))
    #         +- nd_mid: :make_shareable
    #         +- nd_recv:
    #         |   @ NODE_LIT (id: 13, line: 2, location: (2,0)-(2,26))
    #         |   +- nd_lit: #<Class:0x000000010322cd20>
    #         +- nd_args:
    #             @ NODE_LIST (id: 14, line: 2, location: (2,0)-(2,26))
    ...
What lies beyond the refactoring
Once these refactorings are complete, I expect three major benefits.
1. A straightforward AST
An Abstract Syntax Tree is, as the name says, an abstraction. With the needs of analysis tools in mind, though, the recent expectation seems to be that the AST should be organized to carry more information. Having an AST that is simply the input code turned into structured data gives tool implementers the information they need.
2. Better readability of parse.y
Previously, in Ruby Parser Development Diary (6): On the Maintainability of parse.y - かねこにっき, I discussed the maintainability of "parse.y" from the viewpoint of the tight coupling between the parser and the lexer. Besides removing that coupling, tidying up the structure of Nodes and removing the optimizations (Node transformations) performed in the parser also improve maintainability and readability.
3. Contributing to the Universal Parser
The Universal Parser is a project to make CRuby's parser usable by analysis tools such as TypeProf and Sorbet, and by Ruby implementations other than CRuby such as mruby.[2] To get there, the current parser's dependencies on CRuby features have to be removed. Once "parse.y" no longer depends on Ruby Objects, we will have the Universal Parser.
My goal is to remove every dependency on CRuby functions by this year's Ruby release (2023/12/25).[3]
Summary
As we have seen, this challenge covers a wide range of tasks, including:
- Implementing a subset of Ruby's String
- Changing the structure of the AST and the way it is built
- Changing "compile.c" to match the changes in "parse.y"
- Graduating the parser's internal data structures from the GC (changing memory management)
These tasks are less about touching the parser's grammar rules and more about refactoring the parts written in C: the data structures and logic the parser works with. That means there are plenty of chances to work on "parse.y" without being familiar with parser generators and without changing any grammar rules. And since changing the AST will quite often require adjusting "compile.c" as well, you should become familiar with "compile.c" too.
It is no exaggeration to say that this project effectively rewrites "parse.y". Why not join in?
Challengers wanted!!
Doing all of this alone means a lot of work, so if you are interested, I would love your help. There is a #lr-parser channel on the ruby-jp Slack (ruby-jp.slack.com), so please drop by if you would like to hear more.
[1] If you wondered "so what about Encoding?", you are sharp. To port String literals, the parser's string has to carry information about which Encoding was specified so that it can be mapped to CRuby's Encoding.
[2] For details on the Universal Parser, see Ruby Parser Development Diary (8): The Road to the Universal Parser - かねこにっき.
[3] The malloc and free that remain in the end will be made optional arguments. So once this goal is achieved, you will be able to create a parser, parse, and get AST Nodes without passing anything in from outside.
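One way to picture "optional malloc and free" is a configuration struct whose allocator can be omitted, in which case the C library functions are used. The sketch below is only an assumption about the shape of such an interface; `parser_config`, `parser`, `parser_new`, and `parser_free` are hypothetical names and not the Universal Parser API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical configuration. Pass NULL to parser_new to use malloc and free. */
struct parser_config {
    void *(*alloc_fn)(size_t size);
    void (*free_fn)(void *ptr);
};

struct parser {
    struct parser_config config;    /* remembered so the same allocator frees it */
};

/* Create a parser. When user is NULL, fall back to the C library allocator.
 * When user is given, both functions must be set so they stay a matching pair. */
static struct parser *
parser_new(const struct parser_config *user)
{
    struct parser_config cfg = { malloc, free };
    struct parser *p;

    if (user != NULL) {
        assert(user->alloc_fn != NULL && user->free_fn != NULL);
        cfg = *user;
    }
    p = cfg.alloc_fn(sizeof(*p));
    if (p != NULL) p->config = cfg;
    return p;
}

/* Destroy a parser with the allocator it was created with. NULL is accepted. */
static void
parser_free(struct parser *p)
{
    if (p != NULL) p->config.free_fn(p);
}
```

Calling `parser_new(NULL)` needs nothing from the host program, which is the "without passing anything in from outside" case the footnote describes.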