Note: when samples use a local input variable representing the input text to parse, and a parser variable, the result is usually the outcome of calling var result = parser.Parse(input) or var success = parser.TryParse(input, out var result).
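As a minimal sketch of that convention, assuming the Parlot package is referenced and the parsers are brought into scope with using static Parlot.Fluent.Parsers, the two calls can be compared side by side. The variable name value is only illustrative.

```csharp
using System;
using Parlot.Fluent;
using static Parlot.Fluent.Parsers;

var input = "hello world";
var parser = Terms.Text("hello");

// Parse returns the parsed value directly.
var result = parser.Parse(input);

// TryParse reports success explicitly and returns the value through an out parameter.
var success = parser.TryParse(input, out var value);

Console.WriteLine(result);  // hello
Console.WriteLine(success); // True
Console.WriteLine(value);   // hello
```

TryParse is usually the safer choice when a failed parse has to be told apart from a legitimate default value.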
Terms and literals are the lowest-level elements of a grammar, like a '.' (dot), predefined strings like "hello", numbers, and more.
In Parlot they are usually accessed from the Literals or Terms classes, with the difference that Terms will return a parser that accepts blank spaces before the element.
Terms and Literals are accessed using the Parsers.Terms and Parsers.Literals static classes.
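The difference between the two can be sketched as follows, assuming the same using static Parlot.Fluent.Parsers import as the other samples. The input string and variable names are illustrative only.

```csharp
using System;
using Parlot.Fluent;
using static Parlot.Fluent.Parsers;

var input = "   hello";

// A literal must match at the current position, so the leading spaces make it fail.
var literalOk = Literals.Text("hello").TryParse(input, out _);

// A term accepts blank spaces before the element.
var termOk = Terms.Text("hello").TryParse(input, out var term);

Console.WriteLine(literalOk); // False
Console.WriteLine(termOk);    // True
Console.WriteLine(term);      // hello
```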
Matches blank spaces, optionally including new lines. Returns a TextSpan with the matched spaces. This parser is not available in the Terms static class.
Parser<TextSpan> WhiteSpace(bool includeNewLines = false)
Usage:
var input = " \thello world ";
var parser = Literals.WhiteSpace();
Result:
" \t"
Matches any non-blank characters, optionally including new lines. Returns a TextSpan with the matched characters.
Parser<TextSpan> NonWhiteSpace(bool includeNewLines = false)
Usage:
var input = "hello world";
var parser = Terms.NonWhiteSpace();
Result:
"hello"
Matches a given string, optionally in a case insensitive way.
Parser<string> Text(string text, bool caseInsensitive = false)
Usage:
var input = "hello world";
var parser = Terms.Text("hello");
Result:
"hello"
Matches a given character.
Parser<char> Char(char c)
Usage:
var input = "hello world";
var parser = Terms.Char('h');
Result:
'h'
Matches an integral numeric value and an optional leading sign.
Parser<long> Integer()
Usage:
var input = "-1234";
var parser = Terms.Integer();
Result:
-1234
Matches a numeric value with optional digits and leading sign. The exponent is supported.
Parser<decimal> Decimal()
Usage:
var input = "-1234.56";
var parser = Terms.Decimal(NumberOptions.AllowSign);
Result:
-1234.56
Matches a numeric value of any .NET type. The NumberOptions enumeration lets you customize how the number is parsed.
The return type can be any numeric .NET type that is compatible with the selected options.
Parser<T> Number<T>() where T : INumber<T>
Usage:
var input = "-1,234.56e1";
var parser = Terms.Number<double>(NumberOptions.Float | NumberOptions.AllowGroupSeparators);
Result:
-12345.6
Matches a quoted string literal, optionally restricted to single or double enclosing quotes.
Parser<TextSpan> String(StringLiteralQuotes quotes = StringLiteralQuotes.SingleOrDouble)
Usage:
var input = "'hello\\nworld'";
var parser = Terms.String();
Result:
'hello\\nworld'
Matches an identifier, optionally with extra allowed characters.
Default start chars are [$_a-zA-Z]. Other chars also include digits.
Parser<TextSpan> Identifier(Func<char, bool> extraStart = null, Func<char, bool> extraPart = null)
Usage:
var input = "slice_text();";
var parser = Terms.Identifier();
Result:
slice_text
Matches consecutive characters that satisfy a specific predicate, optionally defining a minimum and maximum size.
Parser<TextSpan> Pattern(Func<char, bool> predicate, int minSize = 1, int maxSize = 0)
Usage:
var input = "ababcad";
var parser = Terms.Pattern(c => c == 'a' || c == 'b');
Result:
abab
Matches any chars from a list of chars.
Parser<TextSpan> AnyOf(ReadOnlySpan<char> values, int minSize = 1, int maxSize = 0)
Parser<TextSpan> AnyOf(SearchValues<char> searchValues, int minSize = 1, int maxSize = 0)
Usage:
var input = "ababcad";
var parser = Terms.AnyOf("ab");
Result:
abab
Matches any of two parsers.
Parser<T> Or<T>(this Parser<T> parser, Parser<T> or)
Another overload returns a common base type of the two parsers' results:
Parser<T> Or<A, B, T>(this Parser<A> parser, Parser<B> or)
Usage:
var parser = Terms.Text("one").Or(Terms.Text("1"));
parser.Parse("1");
parser.Parse("one");
parser.Parse("hello");
Result:
"1"
"one"
null
Multiple Or() calls can be used in a row, e.g., a.Or(b).Or(c).Or(d).
Matches two consecutive parsers. The result is a strongly typed tuple containing the two individual results.
Parser<ValueTuple<T1, T2>> And<T1, T2>(this Parser<T1> parser, Parser<T2> and)
Usage:
var parser = Terms.Text("hello").And(Terms.Text("world"));
parser.Parse("hello world").Item1;
parser.Parse("hello world").Item2;
parser.Parse("hello");
Result:
"hello"
"world"
null
Multiple And() calls can be used in a row, e.g., to match a variable assignment like age = 12:
var input = "age = 12";
var parser = Terms.Identifier().And(Terms.Char('=')).And(Terms.Integer());
var result = parser.Parse(input);
Assert.Equal("age", result.Item1);
Assert.Equal('=', result.Item2);
Assert.Equal(12, result.Item3);
Behaves like And but discards the second parser's result.
Parser<T1> AndSkip<T1, T2>(this Parser<T1> parser, Parser<T2> and)
Usage:
var parser = Terms.Text("hello").AndSkip(Terms.Text("world"));
parser.Parse("hello world");
parser.Parse("hello");
Result:
"hello"
null
This is useful to expect successive terms but to ignore some of them and make the result as lean as possible.
var input = "age = 12";
var parser = Terms.Identifier().AndSkip(Terms.Char('=')).And(Terms.Integer());
var result = parser.Parse(input);
Assert.Equal("age", result.Item1);
Assert.Equal(12, result.Item2);
Behaves like And but discards the first parser's result.
Parser<T2> SkipAnd<T1, T2>(this Parser<T1> parser, Parser<T2> and)
Usage:
var parser = Terms.Text("hello").SkipAnd(Terms.Text("world"));
parser.Parse("hello world");
parser.Parse("hello");
Result:
"world"
null
This is useful to expect successive terms but to ignore some of them and make the result as lean as possible.
var input = "age = 12";
var parser = Terms.Identifier().And(Terms.Char('=')).SkipAnd(Terms.Integer());
var result = parser.Parse(input);
Assert.Equal("age", result.Item1);
Assert.Equal(12, result.Item2);
Makes an existing parser optional.
Parser<T> ZeroOrOne<T>(Parser<T> parser)
Usage:
var parser = ZeroOrOne(Terms.Text("hello"));
parser.Parse("hello");
parser.Parse(""); // returns null but with a successful state
Result:
"hello"
null
Executes a parser as long as it's successful. The result is a list of all individual results.
Parser<List<T>> ZeroOrMany<T>(Parser<T> parser)
Usage:
var parser = ZeroOrMany(Terms.Text("hello"));
parser.Parse("hello hello");
parser.Parse("");
Result:
[ "hello", "hello" ]
[]
Executes a parser as long as it's successful, and is successful if at least one occurrence is found. The result is a list of all individual results.
Parser<List<T>> OneOrMany<T>(Parser<T> parser)
Usage:
var parser = OneOrMany(Terms.Text("hello"));
parser.Parse("hello hello");
parser.Parse("");
Result:
[ "hello", "hello" ]
null
Succeeds if the parser does not match.
Parser<T> Not<T>(Parser<T> parser)
Usage:
var parser = Not(Terms.Text("hello"));
parser.Parse("hello");
parser.Parse("world");
Result:
hello // failure
null // success
Matches all occurrences of a parser that are separated by another one. If a separator is not followed by a value, it is not consumed.
Parser<List<T>> Separated<U, T>(Parser<U> separator, Parser<T> parser)
Usage:
var parser = Separated(Terms.Text(","), Terms.Integer());
parser.Parse("1, 2, 3");
parser.Parse("1,2;3");
Result:
[1, 2, 3]
[1, 2]
Matches a parser when it is enclosed between two other ones.
Parser<T> Between<A, T, B>(Parser<A> before, Parser<T> parser, Parser<B> after)
Usage:
var parser = Between(Terms.Char('['), Terms.Integer(), Terms.Char(']'));
parser.Parse("[ 1 ]");
parser.Parse("[ 1");
Result:
1
0 // failure
Matches a parser after any blank spaces. This parser respects the Scanner options related to multi-line grammars.
Parser<T> SkipWhiteSpace<T>(Parser<T> parser)
Usage:
var parser = SkipWhiteSpace(Literals.Text("abc"));
parser.Parse("abc");
parser.Parse(" abc");
Result:
"abc"
"abc"
Note: This parser is used by all Terms (e.g., Terms.Text) to skip blank spaces before a Literal.
Creates a parser that can be referenced before it is actually defined. This is used when there is a cyclic dependency between parsers.
Deferred<T> Deferred<T>()
Usage:
var parser = Deferred<long>();
var group = Between(Terms.Char('('), parser, Terms.Char(')'));
parser.Parser = Terms.Integer().Or(group);
parser.Parse("((1))");
parser.Parse("1");
Result:
1
1
Creates a parser that can reference itself.
Deferred<T> Recursive<T>(Func<Deferred<T>, Parser<T>> parser)
Usage:
var number = Terms.Decimal();
var minus = Terms.Char('-');
var parser = Recursive<decimal>((u) =>
    minus.And(u)
        .Then(static x => 0 - x.Item2)
    .Or(number)
);
parser.Parse("--1");
Result:
1
Ignores the individual result of a parser and returns the whole matched string instead.
This can be used for pattern matching when each part of a pattern isn't more useful than the whole result.
Parser<TextSpan> Capture<T>(Parser<T> parser)
Usage:
var parser = Terms.Identifier().And(Terms.Char('=')).And(Terms.Integer());
var capture = Capture(parser);
capture.Parse("age = 12");
Result:
"age = 12"
Converts the result of a parser. This is usually used to create custom data structures when a parser succeeds, or to convert it to another type.
Parser<U> Then<U>(Func<T, U> conversion)
Parser<U> Then<U>(Func<ParseContext, T, U> conversion)
Parser<U> Then<U>(U value)
Usage:
var parser =
Terms.Integer()
.AndSkip(Terms.Char(','))
.And(Terms.Integer())
.Then(x => new Point(x.Item1, x.Item2));
parser.Parse("1,2");
Result:
Point { x: 1, y: 2}
When the previous results or the ParseContext are not used then the version without delegates can be used:
var parser = OneOf(
Terms.Text("not").Then(UnaryOperator.Not),
Terms.Text("-").Then(UnaryOperator.Negate)
);
Returns a value if the previous parser failed.
Usage:
var parser = Terms.Integer().Else(0).And(Terms.Text("years"));
parser.Parse("years");
parser.Parse("123 years");
Result:
(0, "years")
(123, "years")
Converts the result of a parser, or returns a value if it didn't succeed. This parser always succeeds.
NB: It is implemented using Then() and Else() parsers.
Parser<U> ThenElse<U>(Func<T, U> conversion, U elseValue)
Parser<U> ThenElse<U>(Func<ParseContext, T, U> conversion, U elseValue)
Parser<U> ThenElse<U>(U value, U elseValue)
Usage:
var parser =
Terms.Integer().ThenElse<long?>(x => x, null);
parser.Parse("abc");
Result:
(long?)null
Fails parsing with a custom error message when the inner parser didn't match.
Parser<T> ElseError(string message)
Usage:
var parser =
Terms.Integer().ElseError("Expected an integer")
.AndSkip(Terms.Char(',').ElseError("Expected a comma"))
.And(Terms.Integer().ElseError("Expected an integer"))
.Then(x => new Point(x.Item1, x.Item2));
parser.Parse("1,");
Result:
failure: "Expected an integer at (1:3)"
Fails parsing with a custom error message when the inner parser matched.
Parser<T> Error(string message)
Parser<U> Error<U>(string message)
Usage:
var parser =
Terms.Char('a')
.Or(Terms.Char('b')
.Or(Terms.Char('c').Error("Unexpected char c")));
parser.Parse("c");
Result:
failure: "Unexpected char c"
Adds some additional logic for a parser to succeed.
Parser<T> When(Func<T, bool> predicate)
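A minimal sketch of When, assuming the single-argument predicate signature listed above and the using static Parlot.Fluent.Parsers import used by the other samples. The even-number rule is an illustrative choice only.

```csharp
using System;
using Parlot.Fluent;
using static Parlot.Fluent.Parsers;

// Accepts an integer only when it is even.
var even = Terms.Integer().When(x => x % 2 == 0);

var okEven = even.TryParse("42", out var value);
var okOdd = even.TryParse("41", out _);

Console.WriteLine(okEven); // True
Console.WriteLine(value);  // 42
Console.WriteLine(okOdd);  // False
```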
Returns the next parser based on some custom logic that can't be defined statically. It is typically used in conjunction with a ParseContext instance which has custom options.
Parser<U> Switch<U>(Func<ParseContext, T, Parser<U>> action)
Usage:
// CustomParseContext is a user-defined ParseContext subclass exposing an IntegerSeparator option.
var parser = Terms.Integer().Switch((context, x) =>
{
    var customContext = context as CustomParseContext;
    return Literals.Char(customContext.IntegerSeparator);
});
For performance reasons it is recommended to return a static Parser instance. Otherwise each Parse execution will allocate new parser objects, which will usually be identical from one call to the next.
Expects the end of the string.
Parser<T> Eof()
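A short sketch of how Eof changes the outcome, assuming it is chained on an existing parser as the signature above suggests. The inputs are illustrative.

```csharp
using System;
using Parlot.Fluent;
using static Parlot.Fluent.Parsers;

var number = Terms.Integer();
var wholeInput = Terms.Integer().Eof();

// Without Eof, trailing text after the match is ignored.
var partial = number.TryParse("123 abc", out var n);

// With Eof, the parser only succeeds when nothing is left to read.
var strictFail = wholeInput.TryParse("123 abc", out _);
var strictOk = wholeInput.TryParse("123", out var m);

Console.WriteLine(partial);    // True
Console.WriteLine(strictFail); // False
Console.WriteLine(strictOk);   // True
```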
Discards the previous result and replaces it with the default value or a custom one.
Parser<U> Discard<U>()
Parser<U> Discard<U>(U value)
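As a hedged illustration of the second overload, the sketch below maps two keywords to boolean constants. It assumes Discard is chained on an existing parser like the other fluent methods in this document. The keywords are made up for the example.

```csharp
using System;
using Parlot.Fluent;
using static Parlot.Fluent.Parsers;

// The matched text is dropped and replaced by a constant.
var yes = Terms.Text("yes").Discard<bool>(true);
var no = Terms.Text("no").Discard<bool>(false);
var answer = yes.Or(no);

var matched = answer.TryParse("yes", out var value);

Console.WriteLine(matched); // True
Console.WriteLine(value);   // True
```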
Builds a parser that lists all possible matches to improve performance. Most parsers implement ISeekable in order to give OneOf a way to build a lookup table and identify the potential next parsers in the chain. Some parsers don't implement ISeekable because they are built too late, like Deferred. The Lookup parser works around that limitation.
Parser<T> Lookup<U>(params ReadOnlySpan<char> expectedChars)
Parser<T> Lookup(params ISeekable[] parsers)
Returns any characters until the specified parser is matched.
Parser<TextSpan> AnyCharBefore<T>(Parser<T> parser, bool canBeEmpty = false, bool failOnEof = false, bool consumeDelimiter = false)
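A minimal sketch with the default arguments, assuming the using static Parlot.Fluent.Parsers import. A literal delimiter is used so that no blank spaces are skipped while searching for it.

```csharp
using System;
using Parlot.Fluent;
using static Parlot.Fluent.Parsers;

// Reads every character up to, but not including, the first ';'.
var beforeSemicolon = AnyCharBefore(Literals.Char(';'));

var span = beforeSemicolon.Parse("hello;world");

Console.WriteLine(span.ToString()); // hello
```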
Always returns successfully, with an optional return type or value.
Parser<T> Always<T>()
Parser<object> Always()
Parser<T> Always<T>(T value)
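One possible use, sketched under the assumption that Always is available through using static Parlot.Fluent.Parsers, is to provide a fallback value at the end of an Or chain. The fallback of 1 is arbitrary.

```csharp
using System;
using Parlot.Fluent;
using static Parlot.Fluent.Parsers;

// Falls back to 1 when no integer can be read.
var count = Terms.Integer().Or(Always(1L));

Console.WriteLine(count.Parse("5"));   // 5
Console.WriteLine(count.Parse("abc")); // 1
```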
Like Or, with an unlimited list of parsers.
Parser<T> OneOf<T>(params Parser<T>[] parsers)
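For example, a small set of keywords could be matched as sketched below. The keywords and variable names are illustrative only.

```csharp
using System;
using Parlot.Fluent;
using static Parlot.Fluent.Parsers;

// Equivalent in spirit to Terms.Text("let").Or(Terms.Text("var")).Or(Terms.Text("const")).
var keyword = OneOf(
    Terms.Text("let"),
    Terms.Text("var"),
    Terms.Text("const"));

var found = keyword.TryParse("const x", out var name);
var missing = keyword.TryParse("function", out _);

Console.WriteLine(found);   // True
Console.WriteLine(name);    // const
Console.WriteLine(missing); // False
```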