yari-parsec
The parser-combinator engine that underpins the entire Yari framework. Compose typed Parser<T> values to scan characters, tokenize input, build expression parsers with operator precedence and associativity, track exact source locations, and recover from parse errors. All parsers are immutable and composable.
Installation
// Gradle (Groovy DSL)
implementation 'com.easyparsingapi:yari-parsec:VERSION'
// Maven
<dependency>
  <groupId>com.easyparsingapi</groupId>
  <artifactId>yari-parsec</artifactId>
  <version>VERSION</version>
</dependency>
yari-parsec is pulled in transitively by every other Yari module, so you rarely need to declare it explicitly.
ApiParser — Entry Points
Static entry points for running a lexer + parser pipeline.
parse(Parser<T>, Parser<List<Token>>, Config, String) → T: Lex the string, then parse it
parse(Parser<T>, List<Token>) → T: Parse a pre-built token list
parse(Parser<T>, List<Token>, Config) → T: Parse a pre-built token list with a config callback
lexer(Parser<Fragment>, Parser<Void>) → Parser<List<Token>>: Build a lexer parser from a tokenizer and a whitespace skipper
lex(Parser<Fragment>, Parser<Void>, String) → List<Token>: Lex a raw string into a token list
lex(Parser<Fragment>, Parser<Void>, Token) → List<Token>: Re-lex the text of an existing token
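The two-stage shape of these entry points (lex a string into tokens, then parse the token list) can be pictured with a small standalone sketch. It does not call yari-parsec: PipelineSketch, Tok, and the sum grammar are hypothetical names chosen for illustration, and the real ApiParser signatures are the ones listed above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy two-stage pipeline; none of these types come from yari-parsec. */
final class PipelineSketch {
    /** Hypothetical token: start offset, length, and matched text. */
    static final class Tok {
        final int index;
        final int length;
        final String text;

        Tok(int index, int length, String text) {
            this.index = index;
            this.length = length;
            this.text = text;
        }
    }

    /** Stage 1: skip whitespace and emit one token per number or '+'. */
    static List<Tok> lex(String src) {
        List<Tok> out = new ArrayList<Tok>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            int start = i;
            if (Character.isDigit(c)) {
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
            } else if (c == '+') {
                i++;
            } else {
                throw new IllegalArgumentException("unexpected '" + c + "' at offset " + i);
            }
            out.add(new Tok(start, i - start, src.substring(start, i)));
        }
        return out;
    }

    /** Stage 2: parse  number ('+' number)*  from a pre-built token list. */
    static long parse(List<Tok> tokens) {
        long sum = number(tokens, 0);
        for (int i = 1; i < tokens.size(); i += 2) {
            if (!tokens.get(i).text.equals("+")) {
                throw new IllegalArgumentException("expected '+' at offset " + tokens.get(i).index);
            }
            sum += number(tokens, i + 1);
        }
        return sum;
    }

    /** Reads the token at position i as a number, or fails with a message. */
    private static long number(List<Tok> tokens, int i) {
        if (i >= tokens.size() || !Character.isDigit(tokens.get(i).text.charAt(0))) {
            throw new IllegalArgumentException("expected a number at token " + i);
        }
        return Long.parseLong(tokens.get(i).text);
    }

    /** Convenience entry point: lex the string, then parse it. */
    static long parse(String src) {
        return parse(lex(src));
    }
}
```

Keeping the token list as a separate value is what makes the pre-built token list overloads possible: the same tokens can be inspected, cached, or parsed by a different grammar.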
Parser<T> — Core Combinator
The fundamental building block. Every parser is a pure function from a token stream to a typed result, with backtracking and error recovery.
Sequencing
map(Function<T,R>) → Parser<R>: Transform the parsed value
next(Function<T, Parser<R>>) → Parser<R>: Monadic bind; chain a second parser that depends on the first result
next(Parser<R>) → Parser<R>: Run this parser, then the next; keep the second result
followedBy(Parser<?>) → Parser<T>: Run this parser, then the next; keep the first result
between(Parser<?>, Parser<?>) → Parser<T>: Wrap this parser between an open and a close parser
between(Supplier, Parser<?>, Parser<?>) → Parser<T>: Like between, with a custom error supplier for a missing close
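To make the sequencing semantics concrete, here is a minimal toy model of map, next, followedBy, and between. It is an assumption-laden sketch, not the library's implementation: Mini, Ok, and SequencingDemo are hypothetical names, the toy parser reads characters rather than tokens, and failure is modeled as a null result.

```java
import java.util.function.Function;

/** Successful toy parse: the value plus the position after it. */
final class Ok<T> {
    final T value;
    final int next;

    Ok(T value, int next) {
        this.value = value;
        this.next = next;
    }
}

/** Toy stand-in for Parser<T>; returns null on failure. */
interface Mini<T> {
    Ok<T> run(String in, int pos);

    /** Transform the parsed value. */
    default <R> Mini<R> map(Function<T, R> f) {
        return (in, pos) -> {
            Ok<T> r = run(in, pos);
            return r == null ? null : new Ok<R>(f.apply(r.value), r.next);
        };
    }

    /** Monadic bind: choose the second parser from the first result. */
    default <R> Mini<R> next(Function<T, Mini<R>> f) {
        return (in, pos) -> {
            Ok<T> r = run(in, pos);
            return r == null ? null : f.apply(r.value).run(in, r.next);
        };
    }

    /** Run this parser, then the other; keep this parser's result. */
    default Mini<T> followedBy(Mini<?> after) {
        return (in, pos) -> {
            Ok<T> r = run(in, pos);
            if (r == null) return null;
            Ok<?> a = after.run(in, r.next);
            return a == null ? null : new Ok<T>(r.value, a.next);
        };
    }

    /** Run open, this parser, then close; keep this parser's result. */
    default Mini<T> between(Mini<?> open, Mini<?> close) {
        return (in, pos) -> {
            Ok<?> o = open.run(in, pos);
            return o == null ? null : followedBy(close).run(in, o.next);
        };
    }

    /** Match one specific character. */
    static Mini<Character> ch(char c) {
        return (in, pos) -> pos < in.length() && in.charAt(pos) == c
                ? new Ok<Character>(c, pos + 1) : null;
    }

    /** Match one or more digits and return them as text. */
    static Mini<String> digits() {
        return (in, pos) -> {
            int end = pos;
            while (end < in.length() && Character.isDigit(in.charAt(end))) end++;
            return end == pos ? null : new Ok<String>(in.substring(pos, end), end);
        };
    }
}

final class SequencingDemo {
    /** Parses input such as "(42)" into an Integer; null means no match. */
    static Integer parenInt(String in) {
        Mini<Integer> number = Mini.digits().map(Integer::parseInt);
        Ok<Integer> r = number.between(Mini.ch('('), Mini.ch(')')).run(in, 0);
        return r == null ? null : r.value;
    }
}
```

In this model every combinator returns a new parser and leaves the receiver untouched, which mirrors the immutability described for Parser<T>.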
Alternation
or(Parser<T>) → Parser<T>: Try this parser; fall back to the alternative on failure
optional() → Parser<T>: Return null if the parser does not match
optional(Supplier<T>) → Parser<T>: Return a supplied default value if the parser does not match
asOptional() → Parser<Optional<T>>: Return an empty Optional if the parser does not match
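The difference between the three "does not match" behaviours is easiest to see side by side. The following toy model is a sketch under assumptions: AltParser, Hit, and AlternationDemo are hypothetical names, and the choice that a failed optional parser succeeds without consuming input is an assumption of this sketch rather than something stated above.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Successful toy parse: the value plus the position after it. */
final class Hit<T> {
    final T value;
    final int next;

    Hit(T value, int next) {
        this.value = value;
        this.next = next;
    }
}

/** Toy stand-in for Parser<T>; returns null on failure. */
interface AltParser<T> {
    Hit<T> run(String in, int pos);

    /** Try this parser; on failure, retry the alternative from the same position. */
    default AltParser<T> or(AltParser<T> alternative) {
        return (in, pos) -> {
            Hit<T> r = run(in, pos);
            return r != null ? r : alternative.run(in, pos);
        };
    }

    /** On failure, succeed with the supplied default and consume nothing. */
    default AltParser<T> optional(Supplier<T> fallback) {
        return (in, pos) -> {
            Hit<T> r = run(in, pos);
            return r != null ? r : new Hit<T>(fallback.get(), pos);
        };
    }

    /** On failure, succeed with a null value and consume nothing. */
    default AltParser<T> optional() {
        return optional(() -> null);
    }

    /** Wrap the result in Optional; failure becomes Optional.empty(). */
    default AltParser<Optional<T>> asOptional() {
        return (in, pos) -> {
            Hit<T> r = run(in, pos);
            return r != null
                    ? new Hit<Optional<T>>(Optional.of(r.value), r.next)
                    : new Hit<Optional<T>>(Optional.<T>empty(), pos);
        };
    }

    /** Match an exact string. */
    static AltParser<String> literal(String s) {
        return (in, pos) -> in.startsWith(s, pos)
                ? new Hit<String>(s, pos + s.length()) : null;
    }
}

final class AlternationDemo {
    /** Matches "+" or "-". */
    static final AltParser<String> SIGN =
            AltParser.literal("+").or(AltParser.literal("-"));
}
```

In this sketch asOptional() avoids the ambiguity of a null result, at the cost of one extra wrapper per parse.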
Repetition
many() → Parser<List<T>>: Zero or more repetitions
many1() → Parser<List<T>>: One or more repetitions
sepBy(Parser<?>) → Parser<List<T>>: Zero or more separated by a delimiter
sepBy1(Parser<?>) → Parser<List<T>>: One or more separated by a delimiter
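A small toy model shows how the four repetition combinators relate: many1 is many with a non-empty check, and sepBy is sepBy1 with an empty-list fallback. This is a sketch only. RepParser and Rep are hypothetical names, and leaving a trailing delimiter unconsumed is a choice made here, not documented library behaviour.

```java
import java.util.ArrayList;
import java.util.List;

/** Successful toy parse: the value plus the position after it. */
final class Rep<T> {
    final T value;
    final int next;

    Rep(T value, int next) {
        this.value = value;
        this.next = next;
    }
}

/** Toy stand-in for Parser<T>; returns null on failure. */
interface RepParser<T> {
    Rep<T> run(String in, int pos);

    /** Zero or more repetitions; stops at the first failure and never fails itself. */
    default RepParser<List<T>> many() {
        return (in, pos) -> {
            List<T> out = new ArrayList<T>();
            int at = pos;
            // The r.next > at check stops a non-consuming parser from looping forever.
            for (Rep<T> r = run(in, at); r != null && r.next > at; r = run(in, at)) {
                out.add(r.value);
                at = r.next;
            }
            return new Rep<List<T>>(out, at);
        };
    }

    /** One or more repetitions; fails if nothing was matched. */
    default RepParser<List<T>> many1() {
        return (in, pos) -> {
            Rep<List<T>> r = many().run(in, pos);
            return r.value.isEmpty() ? null : r;
        };
    }

    /** One or more items separated by a delimiter. */
    default RepParser<List<T>> sepBy1(RepParser<?> delim) {
        return (in, pos) -> {
            Rep<T> first = run(in, pos);
            if (first == null) return null;
            List<T> out = new ArrayList<T>();
            out.add(first.value);
            int at = first.next;
            while (true) {
                Rep<?> d = delim.run(in, at);
                Rep<T> item = d == null ? null : run(in, d.next);
                // A delimiter with no item after it is left unconsumed.
                if (item == null || item.next <= at) break;
                out.add(item.value);
                at = item.next;
            }
            return new Rep<List<T>>(out, at);
        };
    }

    /** Zero or more items separated by a delimiter. */
    default RepParser<List<T>> sepBy(RepParser<?> delim) {
        return (in, pos) -> {
            Rep<List<T>> r = sepBy1(delim).run(in, pos);
            return r != null ? r : new Rep<List<T>>(new ArrayList<T>(), pos);
        };
    }

    /** Match a single digit. */
    static RepParser<Character> digit() {
        return (in, pos) -> pos < in.length() && Character.isDigit(in.charAt(pos))
                ? new Rep<Character>(in.charAt(pos), pos + 1) : null;
    }

    /** Match one specific character. */
    static RepParser<Character> ch(char c) {
        return (in, pos) -> pos < in.length() && in.charAt(pos) == c
                ? new Rep<Character>(c, pos + 1) : null;
    }
}
```

For example, RepParser.digit().sepBy(RepParser.ch(',')) run on "1,2,3" collects the three digits, while the same parser run on "x" succeeds with an empty list.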
Filtering
acceptIf(Predicate<T>) → Parser<T>: Reject the result if the predicate is not satisfied
label(String) → Parser<T>: Attach a human-readable name used in error messages
Lookahead & References
lazy() → Parser<T>: Defer reference resolution to first invocation
static newReference() → Reference<T>: Create a forward reference for mutually-recursive grammars
Reference.set(Parser<T>) → void: Resolve the forward reference to a concrete parser
Reference.lazy() → Parser<T>: Return a lazy parser backed by the reference
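A forward reference exists because a recursive grammar has to mention a parser before that parser has been built. The sketch below models the idea with hypothetical types (RefParser, Res, Ref, NestingDemo). It mirrors the set/lazy pair listed above but is not the library's Reference class.

```java
/** Successful toy parse: the value plus the position after it. */
final class Res<T> {
    final T value;
    final int next;

    Res(T value, int next) {
        this.value = value;
        this.next = next;
    }
}

/** Toy stand-in for Parser<T>; returns null on failure. */
interface RefParser<T> {
    Res<T> run(String in, int pos);
}

/** Toy forward reference: usable in a grammar before its target exists. */
final class Ref<T> {
    private RefParser<T> target;

    /** Resolve the reference to a concrete parser. */
    void set(RefParser<T> parser) {
        target = parser;
    }

    /** A parser that looks the target up only when it is actually run. */
    RefParser<T> lazy() {
        return (in, pos) -> {
            if (target == null) throw new IllegalStateException("reference not set");
            return target.run(in, pos);
        };
    }
}

final class NestingDemo {
    /** Grammar: depth ::= '(' depth ')' | empty. The value is the nesting depth. */
    static RefParser<Integer> depth() {
        Ref<Integer> ref = new Ref<Integer>();
        RefParser<Integer> self = ref.lazy(); // safe to use before set(...) is called
        RefParser<Integer> depth = (in, pos) -> {
            if (pos < in.length() && in.charAt(pos) == '(') {
                Res<Integer> inner = self.run(in, pos + 1);
                boolean closed = inner != null
                        && inner.next < in.length()
                        && in.charAt(inner.next) == ')';
                return closed ? new Res<Integer>(inner.value + 1, inner.next + 1) : null;
            }
            return new Res<Integer>(0, pos); // the empty alternative
        };
        ref.set(depth); // tie the knot
        return depth;
    }
}
```

Running a lazy parser before set has been called is a programming error in this sketch, which is why it throws instead of reporting a parse failure.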
Parsers — Combinator Factories
sequence(p1, p2, BiFunction) → Parser<R>: Sequence two parsers, map their results with a combining function
sequence(parsers…) → Parser<List>: Sequence N parsers, collect results into a list
or(parsers…) → Parser<T>: Try each parser in order; first match wins
parseIf(Predicate<TokenContext>, Parser) → Parser<T>: Lookahead guard; only attempt the parser if the predicate holds
runtime(Supplier<Parser>) → Parser<T>: Defer parser construction entirely to runtime
always() → Parser<Void>: Always succeeds and consumes nothing
Tokens — Token Model
Tokens.Fragment
tag (Tokens.Tag): The tag assigned to this fragment by the lexer
text (String): The raw matched text
Tokens.Tag (enum)
RESERVED: A keyword or operator matched exactly
IDENTIFIER: An identifier (name, symbol) not in the reserved set
INTEGER: An integer literal
DECIMAL: A decimal literal
COMMENT: A comment fragment
Static factory methods
fragment(String text, Object tag) → Fragment: Create a fragment carrying the given text and tag (typically a Tokens.Tag value)
Token (class)
index() → int: Start offset in the source character array
length() → int: Length in characters
value() → Object: The logical token value (a Fragment, String, Long…)
sourceLocator() → SourceLocator: The locator used to convert offsets to line/column positions
sourceLocation() → SourceLocation: Precise start/end position in the source
toString() → String: Human-readable representation of the token
Terminals — Keyword/Operator Builder
Fluent builder for scanners that distinguish keywords, operators, and identifiers.
static operators(String…) → Terminals: Start a Terminals describing the operator set (also accepts a Collection<String>)
words(Parser<String>) → Builder: Provide the word scanner (what counts as an identifier-shaped token)
keywords(String…) → Builder: Register reserved, case-sensitive keywords
caseInsensitiveKeywords(String…) → Builder: Register reserved, case-insensitive keywords
build() → Terminals: Build the immutable Terminals
tokenizer() → Parser<?>: The composed tokenizer parser, ready for use with ApiParser
static identifier() → Parser<String>: Token-level parser matching any non-reserved word
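The keyword/identifier split comes down to one decision per scanned word: is it in the reserved set or not? A minimal standalone sketch of that decision follows, covering both the case-sensitive and case-insensitive variants. WordClassifier and its Tag enum are hypothetical names that only echo the RESERVED and IDENTIFIER tags; the real builder may classify words differently.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Toy keyword classifier; not part of yari-parsec. */
final class WordClassifier {
    enum Tag { RESERVED, IDENTIFIER }

    private final Set<String> keywords = new HashSet<String>();
    private final boolean caseInsensitive;

    WordClassifier(boolean caseInsensitive, String... words) {
        this.caseInsensitive = caseInsensitive;
        for (String w : words) {
            keywords.add(normalize(w));
        }
    }

    /** Lower-cases the word only when keywords are case-insensitive. */
    private String normalize(String word) {
        return caseInsensitive ? word.toLowerCase(Locale.ROOT) : word;
    }

    /** A word already scanned by the word scanner is either reserved or an identifier. */
    Tag classify(String word) {
        return keywords.contains(normalize(word)) ? Tag.RESERVED : Tag.IDENTIFIER;
    }
}
```

With case-sensitive keywords, "SELECT" is an ordinary identifier when only "select" is reserved; with case-insensitive keywords, both spellings are reserved.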
Pattern / Patterns — Character-level matchers
Low-level character matchers used to build lexers.
Factory methods
isChar(char) → Pattern: Match a single literal character
isChar(CharPredicate) → Pattern: Match any character satisfying the predicate
string(String) → Pattern: Match an exact string literally
many1(CharPredicate) → Pattern: Match one or more characters satisfying the predicate
sequence(patterns…) → Pattern: Match each pattern in sequence
or(patterns…) → Pattern: Try each pattern in order; first match wins
lineComment(String prefix) → Pattern: Match a single-line comment starting with the given prefix
Pattern interface
static MISMATCH = -1 (int): Sentinel return value indicating the pattern did not match
static Pattern.rule(Function<Context, Integer>) → Pattern: Define a custom pattern from a lambda that returns the number of characters matched, or MISMATCH
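The contract of a custom rule is: return how many characters were matched from the current position, or MISMATCH. The sketch below shows one such rule for hexadecimal literals. The shape of Context is not described here, so Ctx, with its src and begin fields, is a hypothetical stand-in, as are PatternSketch, HEX, and match.

```java
import java.util.function.Function;

/** Toy model of the "match length or MISMATCH" contract; not the library's Pattern. */
final class PatternSketch {
    static final int MISMATCH = -1;

    /** Hypothetical stand-in for the Context handed to a rule. */
    static final class Ctx {
        final CharSequence src;
        final int begin;

        Ctx(CharSequence src, int begin) {
            this.src = src;
            this.begin = begin;
        }
    }

    /** Matches "0x" followed by one or more hex digits; returns the matched length. */
    static final Function<Ctx, Integer> HEX = ctx -> {
        CharSequence s = ctx.src;
        int i = ctx.begin;
        if (i + 1 >= s.length() || s.charAt(i) != '0' || s.charAt(i + 1) != 'x') {
            return MISMATCH;
        }
        int end = i + 2;
        while (end < s.length() && Character.digit(s.charAt(end), 16) >= 0) {
            end++;
        }
        // "0x" with no digits after it is not a literal.
        return end == i + 2 ? MISMATCH : end - i;
    };

    /** Applies a rule at the given offset. */
    static int match(Function<Ctx, Integer> rule, CharSequence src, int begin) {
        return rule.apply(new Ctx(src, begin));
    }
}
```

Returning a length instead of a boolean lets the caller advance past the match without rescanning it.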
Source Locations
Every token carries its position; ApiParser maps positions to line/column via SourceLocator.
SourceLocation
SourceLocation(Position start, Position end) (class): A span from start to end in the source; start() and end() each return a Position
Position
line() → int: 1-based line number
column() → int: 1-based column number
SourceLocator
locate(int offset) → Position: Convert a character offset to a line/column Position
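One common way to convert an offset to a 1-based line and column is to record the offset at which each line starts and binary-search that list. The sketch below takes that approach. It is an illustration under assumptions: LineIndex is a hypothetical class, it returns an int pair rather than a Position, it treats only '\n' as a line break, and the library's SourceLocator may work differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy offset-to-line/column index; not the library's SourceLocator. */
final class LineIndex {
    private final List<Integer> lineStarts = new ArrayList<Integer>();
    private final int length;

    LineIndex(String source) {
        length = source.length();
        lineStarts.add(0); // line 1 always starts at offset 0
        for (int i = 0; i < source.length(); i++) {
            if (source.charAt(i) == '\n') {
                lineStarts.add(i + 1);
            }
        }
    }

    /** Returns {line, column}, both 1-based, for a character offset. */
    int[] locate(int offset) {
        if (offset < 0 || offset > length) {
            throw new IndexOutOfBoundsException("offset " + offset);
        }
        int found = Collections.binarySearch(lineStarts, offset);
        // When the offset is not itself a line start, binarySearch returns
        // -(insertionPoint) - 1; the containing line is insertionPoint - 1.
        int line = found >= 0 ? found : -found - 2;
        return new int[] { line + 1, offset - lineStarts.get(line) + 1 };
    }
}
```

Building the index costs one pass over the source; each lookup afterwards is logarithmic in the number of lines.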
SourceLocalisable interface
getSourceLocation() → SourceLocation: Return the location attached to this node or token
setSourceLocation(SourceLocation) → void: Attach a source location (used during parsing)
For the full code-level reference, see the README on GitHub and the Javadoc-annotated source under yari-parsec/src/main/java/.