Konbini
A Kotlin multiplatform parser combinator library.
Getting started
Adding a Dependency on Konbini
Konbini is hosted on Jitpack, so you first need to add the Jitpack repository to your build file, then add a dependency on Konbini.
For build.gradle.kts (Kotlin DSL):
repositories {
    maven {
        url = uri("https://jitpack.io")
    }
}
dependencies {
    implementation("cc.ekblad.konbini:konbini:0.1.0")
}
A Simple Parser
Konbini parsers are constructed from combinators: tiny functions which glue together other tiny functions into complex parsers.
Consider the following parser which parses and computes expressions containing integer literals and additions:
import cc.ekblad.konbini.Parser
import cc.ekblad.konbini.char
import cc.ekblad.konbini.integer
import cc.ekblad.konbini.oneOf
import cc.ekblad.konbini.parser
import cc.ekblad.konbini.whitespace

val addition: Parser<Long> = parser {
    val lhs = expr()
    whitespace()
    char('+')
    whitespace()
    val rhs = expr()
    // The last expression of the block is the parser's result;
    // a bare `return` is not allowed inside a lambda.
    lhs + rhs
}

val expr: Parser<Long> = oneOf(integer, addition)
This example makes use of five basic combinators. Three of them are leaf parsers, or atoms:
- integer, which parses any integer that fits in a 64-bit signed Long;
- char, which parses any one out of zero or more characters passed to it as arguments; and
- whitespace, which parses zero or more whitespace characters.
The remaining two are more interesting:
- parser lets you combine parsers into bigger parsers. To use a parser from within a parser block, simply call it as a normal function.
- oneOf takes zero or more parsers as arguments, and tries them all from left to right until it finds one that succeeds. If none of the given parsers succeed, the oneOf parser fails. Any parser can be passed as an argument to oneOf. Unlike many other parser generators and libraries you may be familiar with, such as ANTLR or GNU Bison, Konbini implements arbitrary backtracking.
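To make the idea of ordered choice with backtracking concrete, here is a small library-free sketch. It does not use or reproduce Konbini's implementation: ToyParser, literal, then and toyOneOf are hypothetical names invented for this illustration. The point it shows is that each alternative is started from the same input offset, so an alternative that consumes some input and then fails leaves no trace.

```kotlin
// A toy parser (NOT Konbini's Parser type): takes the input and a start offset,
// returns the parsed value and the new offset on success, or null on failure.
typealias ToyParser<T> = (input: String, pos: Int) -> Pair<T, Int>?

// Matches a fixed string at the current offset.
fun literal(text: String): ToyParser<String> = { input, pos ->
    if (input.startsWith(text, pos)) text to pos + text.length else null
}

// Runs two string parsers in sequence and concatenates their results;
// fails if either one fails.
fun then(first: ToyParser<String>, second: ToyParser<String>): ToyParser<String> = { input, pos ->
    first(input, pos)?.let { (a, mid) ->
        second(input, mid)?.let { (b, end) -> (a + b) to end }
    }
}

// Ordered choice: tries the alternatives from left to right, always starting
// from the same offset, and returns the first success (or null if all fail).
fun <T> toyOneOf(vararg alternatives: ToyParser<T>): ToyParser<T> = { input, pos ->
    alternatives.firstNotNullOfOrNull { it(input, pos) }
}

val fooBar = then(literal("foo"), literal("bar"))
val fooBaz = then(literal("foo"), literal("baz"))
val choice = toyOneOf(fooBar, fooBaz)
```

On the input "foobaz", fooBar matches "foo" and then fails on "bar"; choice then retries from offset 0 with fooBaz, which succeeds and reports offset 6.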
In addition to these, Konbini defines several other helpful parser combinators to quickly get your parser off the ground.
- regex matches regular expressions.
- doubleQuotedString and singleQuotedString match quoted strings, including escape code handling.
- chainl and chainr make it easy to define left-recursive and right-recursive parsers respectively.
- For more information about the combinators available out of the box, see the API documentation.
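The difference between the two chain directions matters for operators such as subtraction. The sketch below is plain Kotlin and does not call Konbini; it assumes that chainl groups a parsed sequence of terms and operators to the left and chainr groups it to the right, which is the conventional meaning of these names in parser combinator libraries. foldLeftAssoc and foldRightAssoc are hypothetical helpers used only to show the two groupings.

```kotlin
// Groups to the left:  a op b op c  ==>  (a op b) op c
fun foldLeftAssoc(terms: List<Long>, ops: List<(Long, Long) -> Long>): Long {
    require(terms.size == ops.size + 1) { "need exactly one operator between each pair of terms" }
    var acc = terms.first()
    for (i in ops.indices) acc = ops[i](acc, terms[i + 1])
    return acc
}

// Groups to the right:  a op b op c  ==>  a op (b op c)
fun foldRightAssoc(terms: List<Long>, ops: List<(Long, Long) -> Long>): Long {
    require(terms.size == ops.size + 1) { "need exactly one operator between each pair of terms" }
    var acc = terms.last()
    for (i in ops.indices.reversed()) acc = ops[i](terms[i], acc)
    return acc
}

val minus: (Long, Long) -> Long = { a, b -> a - b }

// The terms and operators of "10 - 3 - 2".
val terms = listOf(10L, 3L, 2L)
val ops = listOf(minus, minus)
```

With these inputs, foldLeftAssoc(terms, ops) evaluates (10 - 3) - 2 = 5, while foldRightAssoc(terms, ops) evaluates 10 - (3 - 2) = 9.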
Running Your Parser
Each Konbini parser has two extension methods which let you apply it to arbitrary strings.
- parse applies the parser to a string, reading as much of the string as it can, and returns both the result and any remaining input.
- parseToEnd does the same as parse, but returns an error if the parser did not match the entire input string.
Both methods can be configured to ignore whitespace at the start and end of their input for convenience.
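The contrast between the two methods can be sketched without the library. The code below is an illustration only: ToyResult, ToyOk, ToyError, toyParse, toyParseToEnd and the ignoreWhitespace flag are hypothetical stand-ins, not Konbini's actual result type or signatures (see the API documentation for those). The toy parser reads a run of leading digits; the "to end" variant additionally rejects leftover input.

```kotlin
// Outcome of running the toy parser. Illustrative names, not Konbini's types.
sealed class ToyResult<out T>
data class ToyOk<T>(val value: T, val remaining: String) : ToyResult<T>()
data class ToyError(val reason: String) : ToyResult<Nothing>()

// Reads as many leading ASCII digits as it can and returns the number
// together with whatever input is left over.
fun toyParse(input: String, ignoreWhitespace: Boolean = false): ToyResult<Long> {
    // Optionally drop whitespace at both ends before parsing.
    val text = if (ignoreWhitespace) input.trim() else input
    val digits = text.takeWhile { it in '0'..'9' }
    if (digits.isEmpty()) return ToyError("expected a digit")
    val value = digits.toLongOrNull() ?: return ToyError("number out of range")
    return ToyOk(value, text.drop(digits.length))
}

// Same as toyParse, but any leftover input turns the result into an error.
fun toyParseToEnd(input: String, ignoreWhitespace: Boolean = false): ToyResult<Long> =
    when (val result = toyParse(input, ignoreWhitespace)) {
        is ToyOk ->
            if (result.remaining.isEmpty()) result
            else ToyError("unexpected input: '${result.remaining}'")
        is ToyError -> result
    }
```

Here toyParse("42 rest") succeeds with the value 42 and the remaining input " rest", whereas toyParseToEnd("42 rest") is an error. Passing ignoreWhitespace = true lets toyParseToEnd accept "  42  ".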
A More Complex Parser
Unlike our simple example parser, most real parsers don’t calculate values on the fly; they build up a syntax tree of some kind. For a more realistic example, the following parser uses mostly built-in functionality to implement a complete JSON parser.
val comma = parser { whitespace(); char(','); whitespace() }
val colon = parser { whitespace(); char(':'); whitespace() }
val pKeyValue = parser { doubleQuotedString().also { colon() } to pValue() }
val pAtom = oneOf(decimal, doubleQuotedString, boolean, string("null").map { null })
val pArray = bracket(
    parser { char('['); whitespace() },
    parser { whitespace(); char(']') },
    parser { chain(pValue, comma).terms },
)
val pDict = bracket(
    parser { char('{'); whitespace() },
    parser { whitespace(); char('}') },
    parser { chain(pKeyValue, comma).terms.toMap() },
)
val pValue: Parser<Any?> = oneOf(pAtom, pDict, pArray)
This parser is capable of processing around 35 MB of JSON per second on an Apple M2; about the same speed as the parser used for the better-parse benchmark.