Upgrade lexer #2

Merged
merged 4 commits into from
Mar 14, 2020

eatonphil
Owner

The original lexer was written lazily: it tried to treat every lexical token the same way. Among other issues, this made it impossible to include "special characters" such as whitespace in strings.

This PR rewrites the lexer to follow the same pattern as the parser: control is handed to helper functions that each lex one kind of token and, on success, return a pointer to the next unlexed character.
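As a rough illustration of that pattern (not the code in this PR), here is a minimal sketch. The names `cursor`, `token`, `lexerFn`, `lexNumeric`, `lexSymbol`, and `lex` are all hypothetical, and only two token kinds are handled to keep it short.

```go
package main

import "fmt"

// cursor is a hypothetical position type: pointer is the index of the
// next unlexed byte; line and col are kept for error reporting.
type cursor struct {
	pointer uint
	line    uint
	col     uint
}

// token is a hypothetical token type that records where it started.
type token struct {
	value string
	kind  string
	loc   cursor
}

// lexerFn tries to lex exactly one kind of token starting at ic. On
// success it returns the token and a cursor just past it; on failure
// it returns the cursor unchanged.
type lexerFn func(source string, ic cursor) (*token, cursor, bool)

// lexNumeric lexes a run of ASCII digits.
func lexNumeric(source string, ic cursor) (*token, cursor, bool) {
	cur := ic
	for cur.pointer < uint(len(source)) {
		c := source[cur.pointer]
		if c < '0' || c > '9' {
			break
		}
		cur.pointer++
		cur.col++
	}
	if cur.pointer == ic.pointer {
		return nil, ic, false
	}
	return &token{value: source[ic.pointer:cur.pointer], kind: "numeric", loc: ic}, cur, true
}

// lexSymbol lexes one single-character symbol.
func lexSymbol(source string, ic cursor) (*token, cursor, bool) {
	switch source[ic.pointer] {
	case '(', ')', ',', ';', '*':
		cur := ic
		cur.pointer++
		cur.col++
		return &token{value: string(source[ic.pointer]), kind: "symbol", loc: ic}, cur, true
	}
	return nil, ic, false
}

// lex skips whitespace, then offers the current position to each
// helper in turn and resumes from whatever cursor the winner returns.
func lex(source string) ([]*token, error) {
	var tokens []*token
	cur := cursor{}
outer:
	for cur.pointer < uint(len(source)) {
		switch source[cur.pointer] {
		case '\n':
			cur.pointer++
			cur.line++
			cur.col = 0
			continue
		case ' ', '\t':
			cur.pointer++
			cur.col++
			continue
		}
		for _, l := range []lexerFn{lexNumeric, lexSymbol} {
			if tok, next, ok := l(source, cur); ok {
				tokens = append(tokens, tok)
				cur = next
				continue outer
			}
		}
		return nil, fmt.Errorf("unable to lex token at %d:%d", cur.line, cur.col)
	}
	return tokens, nil
}
```

Because each helper returns the cursor it stopped at, the driver loop never needs to know how long a token is, and the start location stored on each token stays accurate across newlines.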

This adds two major features:

  • Support for all allowed characters in strings (including single quotes escaped by a single quote)
  • Support for double-quoted identifiers, which preserve case; unquoted identifiers are lower-cased
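The two behaviours above could be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: the function names and the `(value, next index, ok)` return shape are hypothetical, the set of characters allowed in unquoted identifiers is a guess, and treating a doubled `"` inside a quoted identifier as an escape is an assumption borrowed from the string rule.

```go
package main

import "strings"

// lexDelimited (hypothetical helper) lexes text wrapped in delim,
// starting at source[start]. A doubled delimiter inside the text is
// decoded as one literal delimiter. It returns the decoded value, the
// index of the next unlexed byte, and whether lexing succeeded.
func lexDelimited(source string, start int, delim byte) (string, int, bool) {
	if start >= len(source) || source[start] != delim {
		return "", start, false
	}
	var value []byte
	for i := start + 1; i < len(source); i++ {
		c := source[i]
		if c == delim {
			if i+1 < len(source) && source[i+1] == delim {
				value = append(value, delim) // escaped delimiter
				i++                          // skip the second one
				continue
			}
			return string(value), i + 1, true // closing delimiter
		}
		value = append(value, c) // any other byte, whitespace included
	}
	return "", start, false // unterminated
}

// lexString lexes a single-quoted string such as 'it''s ok'.
func lexString(source string, start int) (string, int, bool) {
	return lexDelimited(source, start, '\'')
}

// lexIdentifier keeps the case of "Quoted" identifiers and lower-cases
// unquoted ones. Assumption: unquoted identifiers start with an ASCII
// letter and continue with letters, digits, '_' or '$'.
func lexIdentifier(source string, start int) (string, int, bool) {
	if v, next, ok := lexDelimited(source, start, '"'); ok {
		return v, next, true
	}
	i := start
	for i < len(source) {
		c := source[i]
		isAlpha := (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
		isRest := (c >= '0' && c <= '9') || c == '_' || c == '$'
		if isAlpha || (i > start && isRest) {
			i++
			continue
		}
		break
	}
	if i == start {
		return "", start, false
	}
	return strings.ToLower(source[start:i]), i, true
}
```

With this sketch, `lexString` turns `'it''s ok'` into `it's ok`, while `lexIdentifier` yields `mytable` for `MyTable` and `MyTable` for `"MyTable"`.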

This also improves the accuracy of token location tracking.

Blog post to follow.

@eatonphil eatonphil merged commit d0aec47 into master Mar 14, 2020
@eatonphil eatonphil deleted the pe/upgrade-lexer branch March 14, 2020 23:49