I Wrote a Self-Hosting C Compiler
I wrote nocc, a C compiler that can self-host (that is, build itself).
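Introduction
Writing C compilers has been in fashion since around last summer, so I gave it a try.
As usual, I used @rui314's 8cc and 9cc and 低レイヤを知りたい人のためのCコンパイラ作成入門 (an introduction to writing a C compiler for people who want to understand the low level) as references. The backend uses LLVM.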
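Design choices
Below are the things I paid particular attention to while building it.
Restricting the language specification
Trying to cover the entire C specification would mean never finishing, so I restricted the supported language heavily. The restrictions include, for example: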
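- Variable declarations cannot take an initializer.
- Multiple variables cannot be declared in one comma-separated declaration.
- The word order of typedef declarations, const qualifiers on types, and similar constructs is fixed.
- const qualifiers on types are ignored during type analysis.
- Function prototypes can only be declared at the top level.
- Function pointers cannot be used. (Function pointers themselves are implemented, but variables of function pointer type cannot be declared.)
- Unions cannot be used.
- etc.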
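I cut every feature that looked unnecessary for self-compilation. By writing the compiler's own code while avoiding those features as much as possible, I reached self-compilation in a relatively short time. (The first commit was at 12:02 on April 4, 2019, and self-compilation succeeded at around 14:30 on April 14, 2019, so it took roughly 10 days.)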
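To make the restrictions concrete, here is a minimal sketch of a function written in the style the list implies: a top-level prototype, one variable per declaration, and no initializers. The function sum_to is a hypothetical illustration and is not taken from nocc. I am also assuming that while loops and the arithmetic and comparison operators used here are supported, which seems likely given the parser code later in this post but is not stated anywhere.

```c
/* Function prototypes appear only at the top level. */
int sum_to(int n);

/* Hypothetical example: sums the integers 1..n using only the restricted subset. */
int sum_to(int n) {
    int i;     /* declared without an initializer */
    int total; /* one variable per declaration, not `int i, total;` */

    /* Initialization happens in separate assignment statements. */
    i = 1;
    total = 0;

    while (i <= n) {
        total = total + i;
        i = i + 1;
    }
    return total;
}
```

Code written this way is still ordinary C, so it compiles with any existing compiler as well, which is what lets a restricted compiler be bootstrapped in the first place.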
Developing in a test-driven way
I developed nocc in a test-driven fashion.
LLVM's JIT compilation was extremely useful for writing the tests.
Reference: nocc/test_engine.c at master · Ryooooooga/nocc · GitHub
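Separating the semantic analyzer from the parser
When type checking, symbol resolution, and similar code is written inside the parser, the code tends to bloat and become hard to follow.
For a context-free language, a common approach is to have the parsing phase produce an abstract syntax tree (without type information) and then use a tree visitor (Visitor) to attach type and symbol information to that tree afterwards.
C, however, is a context-sensitive language, so (if typedef declarations are supported) symbols have to be registered and resolved at the same time as parsing.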
For example, what the C code a * b; means depends on the context that precedes it.

    {
        int a;
        int b;
        a * b; /* here it is a multiplication */
    }
    {
        typedef int a;
        a * b; /* here it declares a variable b of type int* */
    }
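The decision the parser has to make here can be sketched as follows. This is a simplified illustration, not nocc's code: the symbol table is a flat array rather than a scope stack, and all names (Symbol, SymbolKind, StmtKind, find_symbol, classify_star_stmt) are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* What a name currently refers to. */
enum SymbolKind { SYM_VARIABLE, SYM_TYPEDEF };

struct Symbol {
    const char *name;
    enum SymbolKind kind;
};

/* How a statement of the form `first * second;` should be parsed. */
enum StmtKind { STMT_EXPRESSION, STMT_DECLARATION, STMT_ERROR };

/* Searches from the most recent declaration backwards,
 * so later declarations shadow earlier ones. */
static const struct Symbol *find_symbol(const struct Symbol *syms, int count,
                                        const char *name) {
    int i;
    for (i = count - 1; i >= 0; i--) {
        if (strcmp(syms[i].name, name) == 0) {
            return &syms[i];
        }
    }
    return NULL;
}

/* Classifies `first * second;` by looking up what `first` names:
 * a typedef name makes it a pointer declaration, a variable makes it
 * a multiplication, and an unknown name is an error. */
enum StmtKind classify_star_stmt(const struct Symbol *syms, int count,
                                 const char *first) {
    const struct Symbol *s;

    assert(first != NULL);
    s = find_symbol(syms, count, first);
    if (s == NULL) {
        return STMT_ERROR;
    }
    return s->kind == SYM_TYPEDEF ? STMT_DECLARATION : STMT_EXPRESSION;
}
```

The point is that the same token sequence yields different syntax trees depending on the symbol table, so the table must already be populated while parsing is still in progress.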
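So in nocc, taking swiftc and others as a reference, I made the semantic analyzer its own module and implemented it as callback functions invoked from the parser, which splits the parsing code from the semantic analysis code.
I will describe how the split works using the actual code.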
Parsing an identifier expression
The function below, parse_identifier_expr(), parses an expression that consists of a single identifier (such as a, ctx, or printf).
    ExprNode *parse_identifier_expr(ParserContext *ctx) {
        const Token *t;

        /* read a token */
        t = consume_token(ctx);

        /* check that the token is an identifier */
        if (t->kind != token_identifier) {
            fprintf(stderr, "expected identifier, but got %s\n", t->text);
            exit(1);
        }

        /* call into the semantic analysis module */
        return sema_identifier_expr(ctx, t);
    }

parser.c, lines 298 to 311
As you can see, the implementation is simple: it reads one token and checks whether it is an identifier.
The function called from parse_identifier_expr(), sema_identifier_expr(), is a callback function of the semantic analysis module.
It takes a single identifier token, resolves the symbol and its type information, and then returns the abstract syntax tree node that represents the expression.
    ExprNode *sema_identifier_expr(ParserContext *ctx, const Token *t) {
        /* struct representing the abstract syntax tree node */
        IdentifierNode *p;

        /* node construction (omitted) */

        /* look up the symbol's declaration starting from the current scope */
        p->declaration = scope_stack_find(ctx->env, t->text, true);

        if (p->declaration == NULL) {
            /* the symbol was not found */
            fprintf(stderr, "undeclared symbol %s\n", t->text);
            exit(1);
        }

        /* (omitted) */

        /* the type of the expression is the type of the declared variable */
        p->type = p->declaration->type;

        /* return the constructed node to the parser */
        return (ExprNode *)p;
    }
[Added 2019/04/15]
Next, let's look at parsing a more complex statement.
Parsing a compound statement
A compound statement, written as { ... }, forms a lexical scope.
This statement is parsed by the function below, parse_compound_stmt().
    StmtNode *parse_compound_stmt(ParserContext *ctx) {
        const Token *open;
        const Token *close;
        Vec *stmts;

        /* enter the lexical scope formed by the compound statement */
        sema_compound_stmt_enter(ctx);

        /* read a token */
        open = consume_token(ctx);

        /* check that the token is "{" */
        if (open->kind != '{') {
            fprintf(stderr, "expected {, but got %s\n", open->text);
            exit(1);
        }

        /* create a vector to hold the AST nodes of the inner statements */
        stmts = vec_new();

        /* keep parsing statements until a "}" token appears */
        while (current_token(ctx)->kind != '}') {
            vec_push(stmts, parse_stmt(ctx));
        }

        /* read a token */
        close = consume_token(ctx);

        /* check that the token is "}" */
        if (close->kind != '}') {
            fprintf(stderr, "expected }, but got %s\n", close->text);
            exit(1);
        }

        /* build the AST node and leave the lexical scope */
        return sema_compound_stmt_leave(ctx, open, (StmtNode **)stmts->data, stmts->size, close);
    }

parser.c, lines 835 to 869
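Unlike the identifier expression example, the code above calls into the semantic analysis module twice:
- the sema_compound_stmt_enter() function
- the sema_compound_stmt_leave() function
These are the callbacks invoked when "{" and "}" appear, respectively, and they are responsible for creating and destroying the lexical scope formed by the compound statement.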
    /* callback invoked when a compound statement begins */
    void sema_compound_stmt_enter(ParserContext *ctx) {
        /* push a new scope onto the stack */
        sema_push_scope(ctx);
    }

    /* callback invoked when a compound statement ends */
    StmtNode *sema_compound_stmt_leave(ParserContext *ctx, const Token *open, StmtNode **stmts, int num_stmts, const Token *close) {
        /* struct representing the abstract syntax tree node */
        CompoundNode *p;

        /* (omitted) */

        /* remove the scope from the stack */
        sema_pop_scope(ctx);

        /* node construction (omitted) */

        /* return the constructed node to the parser */
        return (StmtNode *)p;
    }
[End of 2019/04/15 addition]
This attempt to separate parsing from semantic analysis worked out well in the end: both the parser (parser.c) and the semantic analyzer (sema.c) stayed easy to follow.
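The scope handling that these callbacks rely on can be sketched as a linked stack of scopes. This is an illustration under assumptions, not nocc's implementation: the names (Scope, ScopeStack, scope_push, scope_pop, scope_declare, scope_find) are hypothetical, types are represented by a plain int, and the recursive flag reflects my guess at what the third argument of scope_stack_find() controls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SYMBOLS 16

/* One lexical scope: a small fixed-size table plus a link to its parent. */
typedef struct Scope {
    struct Scope *parent;
    const char *names[MAX_SYMBOLS];
    int types[MAX_SYMBOLS]; /* stand-in for a real type pointer */
    int count;
} Scope;

typedef struct ScopeStack {
    Scope *top;
} ScopeStack;

/* Enter a new innermost scope (what a "{" callback would do). */
void scope_push(ScopeStack *st) {
    Scope *s = calloc(1, sizeof *s);
    assert(s != NULL);
    s->parent = st->top;
    st->top = s;
}

/* Leave the innermost scope (what a "}" callback would do). */
void scope_pop(ScopeStack *st) {
    Scope *s = st->top;
    assert(s != NULL);
    st->top = s->parent;
    free(s);
}

/* Declare a name in the innermost scope.
 * Returns false on redeclaration in the same scope or when the table is full. */
bool scope_declare(ScopeStack *st, const char *name, int type) {
    Scope *s = st->top;
    int i;

    assert(s != NULL);
    if (s->count == MAX_SYMBOLS) {
        return false;
    }
    for (i = 0; i < s->count; i++) {
        if (strcmp(s->names[i], name) == 0) {
            return false;
        }
    }
    s->names[s->count] = name;
    s->types[s->count] = type;
    s->count++;
    return true;
}

/* Look a name up, innermost scope first. With recursive == false only the
 * innermost scope is searched. Returns NULL when the name is not found. */
const int *scope_find(const ScopeStack *st, const char *name, bool recursive) {
    const Scope *s;
    int i;

    for (s = st->top; s != NULL; s = recursive ? s->parent : NULL) {
        for (i = 0; i < s->count; i++) {
            if (strcmp(s->names[i], name) == 0) {
                return &s->types[i];
            }
        }
    }
    return NULL;
}
```

Because a popped scope takes its declarations with it, names declared inside a compound statement stop being visible once the closing brace is parsed, and an inner declaration shadows an outer one of the same name only while the inner scope is on the stack.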
Summary
Code generation leaned entirely on LLVM, but I feel I managed to write the rest reasonably properly.
If I feel like it, I may try making it emit x86_64 code as well.
This is the current state. pic.twitter.com/9lWaWAS8w1
(@Ryooooooga, April 14, 2019)
Now I can say that I can write a compiler.
(@Ryooooooga, April 14, 2019)
Pushing this on other people counts as compiler harassment.
(@Ryooooooga, April 14, 2019)