Every parser exposes the same shape: call parseUnit(…) and get back an
AstResult<T>. It bundles the root AST node — unit() — with the
full token stream and source-navigation helpers: getTokens(),
getTokensOf(node), substring(sourceLocation), getSource(),
and offset↔position conversions via getPosition(offset) / getIndex(position).
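To make the offset↔position conversion concrete, here is a minimal, self-contained sketch of how such a mapping can be built from line-start offsets. `LineIndex` and its `Position` record are hypothetical stand-ins written for illustration; they are not the library's classes, and the 1-based line/column convention is an assumption. Only the method names `getPosition` and `getIndex` are borrowed from the list above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not part of the library): maps offsets to line/column and back. */
final class LineIndex {
    /** Assumed convention: 1-based line and column. */
    record Position(int line, int column) {}

    private final List<Integer> lineStarts = new ArrayList<>();
    private final int length;

    LineIndex(String source) {
        length = source.length();
        lineStarts.add(0); // line 1 starts at offset 0
        for (int i = 0; i < source.length(); i++) {
            if (source.charAt(i) == '\n') {
                lineStarts.add(i + 1); // the next line starts right after the newline
            }
        }
    }

    /** Offset to position: binary-search the last line start that is <= offset. */
    Position getPosition(int offset) {
        if (offset < 0 || offset > length) {
            throw new IndexOutOfBoundsException("offset " + offset);
        }
        int found = Collections.binarySearch(lineStarts, offset);
        int line = found >= 0 ? found : -found - 2; // insertion point minus one
        return new Position(line + 1, offset - lineStarts.get(line) + 1);
    }

    /** Position to offset: start of the line plus the column. */
    int getIndex(Position position) {
        return lineStarts.get(position.line() - 1) + position.column() - 1;
    }

    public static void main(String[] args) {
        LineIndex index = new LineIndex("ab\ncd\n");
        System.out.println(index.getPosition(4)); // prints Position[line=2, column=2]
        System.out.println(index.getIndex(new Position(2, 2))); // prints 4
    }
}
```

The two methods are inverses of each other for every valid offset, which is the property a caller relies on when jumping between a token's offset and an editor-style line/column.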
Here is one complete example per parser.

// XML
String source = """
    <catalog>
      <book id="1">Le Petit Prince</book>
      <book id="2">1984</book>
    </catalog>
    """;
Xml xml = XmlParser.parseUnit(source, XmlConfig.builder().build()).unit();

// Print the name of every tag
xml.astStream()
    .filter(n -> n instanceof Tag)
    .map(n -> (Tag) n)
    .forEach(tag -> System.out.println(tag.getHead().getName()));

// CSS
String source = """
    .btn { color: red; font-weight: bold; }
    .btn:hover { background: #fee2e2; }
    """;
Css css = CssParser.parseUnit(source).unit();

// Print the selector of every rule set
css.astStream()
    .filter(n -> n instanceof RuleSet)
    .map(n -> (RuleSet) n)
    .forEach(rule -> System.out.println(rule.getSelector()));

// JavaScript
String source = """
    function greet(name) {
      return "Hello " + name;
    }
    """;
Javascript js = JavascriptParser.parseUnit(source).unit();

// Print the name of every function declaration found in the tree
js.astStream()
    .filter(n -> n instanceof FunctionDeclaration)
    .map(n -> (FunctionDeclaration) n)
    .forEach(fn -> System.out.println(fn.getName()));
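The XML, CSS and JavaScript examples share one idiom: stream every node, keep those of one type, cast, then read a property. The sketch below shows that idiom on a tiny hand-made tree so it can run without the library. Everything in it (`Node`, `Tag`, `Text`, `nodesOf`, `sample`) is a hypothetical stand-in; the real `astStream()` is only assumed to visit descendants depth-first, as the XML example suggests.

```java
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical mini AST, used only to demonstrate the filter-and-cast idiom. */
final class TypedStreamSketch {
    interface Node {
        List<Node> children();

        /** Depth-first stream of this node followed by all of its descendants. */
        default Stream<Node> astStream() {
            return Stream.concat(Stream.of(this), children().stream().flatMap(Node::astStream));
        }
    }

    record Tag(String name, List<Node> children) implements Node {}

    record Text(String value) implements Node {
        public List<Node> children() {
            return List.of();
        }
    }

    /** Filter and cast in one reusable step instead of repeating both lambdas. */
    static <T extends Node> Stream<T> nodesOf(Node root, Class<T> type) {
        return root.astStream().filter(type::isInstance).map(type::cast);
    }

    static List<String> tagNames(Node root) {
        return nodesOf(root, Tag.class).map(Tag::name).toList();
    }

    /** A small tree shaped like the catalog document. */
    static Node sample() {
        return new Tag("catalog", List.of(
            new Tag("book", List.of(new Text("Le Petit Prince"))),
            new Tag("book", List.of(new Text("1984")))));
    }

    public static void main(String[] args) {
        System.out.println(tagNames(sample())); // prints [catalog, book, book]
    }
}
```

Passing the `Class<T>` keeps the result typed, so callers get a `Stream<Tag>` without an unchecked cast.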

// HTML with embedded CSS and JavaScript
String source = """
    <html>
      <body>
        <style>.btn { color: red; }</style>
        <script>const x = 42;</script>
        <button class="btn">Go</button>
      </body>
    </html>
    """;
Html html = HtmlParser.parseUnit(source, HtmlConfig.defaultConfig()).unit();

// Walk the unified tree: HTML, CSS and JS in one pass
html.walkChildren(handler -> {
    switch (handler.node()) {
        case ScriptTag script ->
            System.out.println("JS statements: "
                + script.getBody().getNodes().size());
        case StyleTag style ->
            System.out.println("CSS rules: "
                + style.getBody().getNodes().size());
        default -> {}
    }
});
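The walk above dispatches on the node type with a pattern-matching `switch`. As a rough, library-free illustration of that dispatch, the sketch below walks a hand-made tree and collects the same two kinds of message. The node types here are hypothetical records that merely reuse the names `ScriptTag` and `StyleTag`; they are not the library's classes. The sketch also assumes Java 21 and uses a sealed interface, so the compiler checks that every node type is handled and no `default` branch is needed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical mini tree showing type dispatch with a pattern-matching switch (Java 21). */
final class WalkSketch {
    sealed interface Node permits Element, ScriptTag, StyleTag {}

    record Element(String name, List<Node> children) implements Node {}

    record ScriptTag(List<String> statements) implements Node {}

    record StyleTag(List<String> rules) implements Node {}

    /** Visits the tree depth-first and records one line per embedded script or style. */
    static void walk(Node node, List<String> out) {
        switch (node) {
            case ScriptTag script -> out.add("JS statements: " + script.statements().size());
            case StyleTag style -> out.add("CSS rules: " + style.rules().size());
            case Element element -> element.children().forEach(child -> walk(child, out));
        }
    }

    static List<String> describe(Node root) {
        List<String> out = new ArrayList<>();
        walk(root, out);
        return out;
    }

    /** A tree shaped like the HTML document in the example. */
    static Node sample() {
        return new Element("html", List.of(
            new Element("body", List.of(
                new StyleTag(List.of(".btn { color: red; }")),
                new ScriptTag(List.of("const x = 42;")),
                new Element("button", List.of())))));
    }

    public static void main(String[] args) {
        System.out.println(describe(sample())); // prints [CSS rules: 1, JS statements: 1]
    }
}
```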
Need to parse another language? yari-parsec is the parser-combinator engine
the whole framework is built on — use it to write your own parser for any language or DSL.
The example below is taken from the yari-blueprint-parsec starter project, which
parses a tiny C-like language. It highlights the two key tools: OperatorTable,
which builds expressions with operator precedence, and error recovery, which turns broken
input into ErrorNodes inside the tree instead of throwing an exception.

// Tokenizer: operators, keywords, identifiers and integer literals
Terminals terminals = Terminals.operators("=", "{", "}", "(", ")", ";", ",", "-", "+", "*", "/")
    .words(Scanners.IDENTIFIER)
    .keywords("function", "String", "Integer")
    .build();
Parser<Void> delimiter = Parsers.or(Scanners.lineComment("//"), Scanners.WHITESPACES).skipMany();
Parser<?> tokenizer = Parsers.or(Scanners.INTEGER.map(Tokens::integerLiteral), terminals.tokenizer());

Parser<IntegerNode> integer = Terminals.IntegerLiteral.PARSER.map(Integer::parseInt).map(IntegerNode::new);
Parser<IdentifierNode> identifier = Terminals.identifier().map(IdentifierNode::new);

// ── Expressions: precedence via OperatorTable, recursion via Parser.Reference ──
Parser.Reference<Node> reference = Parser.newReference();
Parser<Node> operand = Parsers.or(integer, identifier,
    reference.lazy().between(terminals.token("("), terminals.token(")")));
Parser<Node> operation = new OperatorTable<Node>()
    .prefix(terminals.token("-").map(Token::toString).map(OperatorNode::new)
        .map(op -> MapOperator.map(op, x -> new PrefixNode(op, x))), 30)
    .infixl(Parsers.or(terminals.token("*"), terminals.token("/")).map(Token::toString).map(OperatorNode::new)
        .map(op -> MapInfix.map(op, (l, r) -> new InfixNode(l, op, r))), 20)
    .infixl(Parsers.or(terminals.token("+"), terminals.token("-")).map(Token::toString).map(OperatorNode::new)
        .map(op -> MapInfix.map(op, (l, r) -> new InfixNode(l, op, r))), 10)
    .buildMap(operand);
reference.set(operation);

// A declaration: type identifier '=' operation (e.g. Integer x = a + 2 * b)
Parser<DeclarationNode> declaration = Parsers.sequence(
    Parsers.or(terminals.token("String"), terminals.token("Integer")).map(Object::toString).map(IdentifierNode::new),
    identifier,
    terminals.token("=").next(operation),
    DeclarationNode::new);

// ── Error recovery: a broken declaration in a { } body becomes an ErrorNode ──
// manyBetween repeats `declaration` inside { }, resyncing on ';' instead of throwing.
Parser<Node> body = declaration.followedBy(terminals.token(";").optional()).<Node>cast()
    .manyBetween((detail, location, tokens) -> new ErrorNode(detail.getFailureMessage(), tokens, location),
        terminals.token(";"),  // retry from the next ';'
        Parsers.never(),       // never force-stop
        Parsers.never(),       // never hard-fail
        terminals.token("{"),  // opening boundary
        terminals.token("}"))  // closing boundary
    .map(BlockNode::new);

// `function` and `unitParser` wire these together; see yari-blueprint-parsec for the full grammar.
// The parse always returns a walkable AST: mistakes are ErrorNodes, never exceptions.
List<Node> program = ApiParser.parse(unitParser, tokenizer, delimiter, source);
program.forEach(node -> node.astStream().forEach(System.out::println));
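To show what the two tools amount to without the combinator machinery, here is a small hand-written sketch of both ideas: precedence climbing with the same three binding levels (30, 20, 10), and resynchronising on `;` so that a broken statement becomes an error entry rather than an exception. It is a stand-in written for illustration, not how yari-parsec or `OperatorTable` is implemented. The class `RecoverySketch`, its string results and its integer-only expression statements are all assumptions made to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written stand-in for precedence and recovery; it does not call yari-parsec. */
final class RecoverySketch {
    private final List<String> tokens;
    private int pos = 0;

    RecoverySketch(String source) {
        // Naive tokenizer: pad every symbol with spaces, then split on whitespace.
        tokens = List.of(source.replaceAll("([-+*/(){};])", " $1 ").trim().split("\\s+"));
    }

    /** Binding strength of a binary operator, mirroring levels 20 and 10 above. */
    private static int binding(String op) {
        return switch (op) {
            case "*", "/" -> 20;
            case "+", "-" -> 10;
            default -> -1; // not a binary operator
        };
    }

    private String next() {
        if (pos >= tokens.size()) throw new IllegalStateException("unexpected end of input");
        return tokens.get(pos++);
    }

    private void expect(String token) {
        String actual = next();
        if (!actual.equals(token)) throw new IllegalStateException("expected " + token);
    }

    /** Precedence climbing: only operators binding at least minBinding are consumed. */
    private int expression(int minBinding) {
        int left = operand();
        while (pos < tokens.size() && binding(tokens.get(pos)) >= minBinding) {
            String op = next();
            int right = expression(binding(op) + 1); // +1 keeps the operator left-associative
            left = switch (op) {
                case "*" -> left * right;
                case "/" -> left / right;
                case "+" -> left + right;
                default -> left - right;
            };
        }
        return left;
    }

    private int operand() {
        String token = next();
        if (token.equals("-")) return -expression(30); // prefix minus binds tightest
        if (token.equals("(")) {
            int value = expression(10);
            expect(")");
            return value;
        }
        return Integer.parseInt(token); // throws NumberFormatException on anything else
    }

    /** Parses "{ expr; expr; ... }"; each broken statement becomes an "error ..." entry. */
    List<String> block() {
        List<String> results = new ArrayList<>();
        if (pos < tokens.size() && tokens.get(pos).equals("{")) pos++;
        while (pos < tokens.size() && !tokens.get(pos).equals("}")) {
            int start = pos;
            try {
                int value = expression(10);
                expect(";");
                results.add("ok " + value);
            } catch (RuntimeException failure) {
                pos = start; // rewind, then resynchronise just past the next ';'
                while (pos < tokens.size() && !tokens.get(pos).equals(";")
                        && !tokens.get(pos).equals("}")) pos++;
                if (pos < tokens.size() && tokens.get(pos).equals(";")) pos++;
                results.add("error " + String.join(" ", tokens.subList(start, pos)));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(new RecoverySketch("{ 1 + 2 * 3; 4 + * ; -2 * 3; }").block());
        // prints [ok 7, error 4 + * ;, ok -6]
    }
}
```

The statement after the broken one is still parsed normally, which is the practical benefit of recovering at a statement boundary: one mistake costs one node, not the rest of the block.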