Ksoup – Kotlin Multiplatform HTML Parser

Ksoup is a lightweight Kotlin Multiplatform library for parsing HTML, extracting HTML tags, attributes, and text, and encoding and decoding HTML entities.


  • Parse HTML from String
  • Extract HTML tags, attributes, and text
  • Encode and decode HTML entities
  • Lightweight and does not depend on any other library
  • Kotlin Multiplatform support
  • Fast and efficient
  • Unit tested


Add the dependency below to your module's build.gradle.kts or build.gradle file:

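val version = "0.1.2"

// For parsing HTML
implementation("com.mohamedrejeb.ksoup:ksoup-html:$version")

// Only for encoding and decoding HTML entities
implementation("com.mohamedrejeb.ksoup:ksoup-entities:$version")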


Parsing HTML

To parse HTML from a String, use the KsoupHtmlParser class. You can pass it an implementation of the KsoupHtmlHandler interface and a KsoupHtmlOptions object. Both are optional, and the defaults are used when they are omitted.


Create a parser by calling KsoupHtmlParser(). The parser exposes several methods, for example write to parse a String and end to close the parser when you are done:

val ksoupHtmlParser = KsoupHtmlParser()

// String to parse
val html = "<h1>My Heading</h1>"

// Pass the HTML to the parser (it parses the HTML and invokes the handler callbacks)
ksoupHtmlParser.write(html)

// Close the parser when you are done
ksoupHtmlParser.end()


You can implement the KsoupHtmlHandler interface directly or use KsoupHtmlHandler.Builder():

// Implement the `KsoupHtmlHandler` interface
val firstHandler = object : KsoupHtmlHandler {
    override fun onOpenTag(name: String, attributes: Map<String, String>, isImplied: Boolean) {
        println("Open tag: $name")
    }
}

// Use `KsoupHtmlHandler.Builder()`
val secondHandler = KsoupHtmlHandler
    .Builder()
    .onOpenTag { name, attributes, isImplied ->
        println("Open tag: $name")
    }
    .build()
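As a slightly fuller sketch of the interface-based approach, the handler below is a named class that counts how often each tag is opened, wired to a parser through the handler parameter. TagCounter and countTags are hypothetical names introduced for illustration, and the import paths are assumed from the library's package layout, so adjust them if your version differs.

```kotlin
// Import paths are an assumption; check them against the version you use.
import com.mohamedrejeb.ksoup.html.parser.KsoupHtmlHandler
import com.mohamedrejeb.ksoup.html.parser.KsoupHtmlParser

// Hypothetical handler: counts how many times each tag name is opened.
class TagCounter : KsoupHtmlHandler {
    val counts = mutableMapOf<String, Int>()

    override fun onOpenTag(name: String, attributes: Map<String, String>, isImplied: Boolean) {
        counts[name] = (counts[name] ?: 0) + 1
    }
}

// Hypothetical helper: parses the whole string and returns the tag counts.
fun countTags(html: String): Map<String, Int> {
    val counter = TagCounter()
    val parser = KsoupHtmlParser(handler = counter)
    parser.write(html)
    parser.end()
    return counter.counts
}

fun main() {
    println(countTags("<ul><li>One</li><li>Two</li></ul>"))
}
```

A named class like this is convenient when the handler needs to keep state between callbacks. For a single callback, the builder is usually shorter.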

There are several methods that you can override. For example, if you just want to extract the text from the HTML, you can override the onText method:

// String to parse
val html = """
            <title>My Title</title>
            <h1>My Heading</h1>
            <p>My paragraph.</p>
        """.trimIndent()

// String to store the extracted text
var string = ""

// Create a handler
val handler = KsoupHtmlHandler
    .Builder()
    .onText { text ->
        string += text
    }
    .build()

// Create a parser
val ksoupHtmlParser = KsoupHtmlParser(
    handler = handler,
)

// Pass the HTML to the parser (it parses the HTML and invokes the handler callbacks)
ksoupHtmlParser.write(html)

// Close the parser when you are done
ksoupHtmlParser.end()

You can also use onOpenTag and onCloseTag to detect when a tag is opened or closed, which is useful for scraping data from a website or powering a rich text editor. Similarly, onComment is called when a comment is found in the HTML, and onAttribute is called when an attribute is found in a tag.
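To illustrate the scraping use case, here is a minimal sketch that combines onOpenTag, onText, and onCloseTag to collect every link with its visible text. Link and extractLinks are hypothetical names. The two-parameter onCloseTag lambda (tag name, implied flag) and the import paths are assumptions about the API, so verify them against the version you use.

```kotlin
// Import paths are an assumption; check them against the version you use.
import com.mohamedrejeb.ksoup.html.parser.KsoupHtmlHandler
import com.mohamedrejeb.ksoup.html.parser.KsoupHtmlParser

// Hypothetical holder for one extracted link.
data class Link(val href: String, val text: String)

// Hypothetical helper: returns every <a> element's href and trimmed text.
fun extractLinks(html: String): List<Link> {
    val links = mutableListOf<Link>()
    var currentHref: String? = null      // non-null while inside an <a> element
    val currentText = StringBuilder()

    val handler = KsoupHtmlHandler
        .Builder()
        .onOpenTag { name, attributes, _ ->
            if (name == "a") {
                currentHref = attributes["href"] ?: ""
                currentText.clear()
            }
        }
        .onText { text ->
            if (currentHref != null) currentText.append(text)
        }
        // Assumed signature: (name, isImplied)
        .onCloseTag { name, _ ->
            val href = currentHref
            if (name == "a" && href != null) {
                links += Link(href, currentText.toString().trim())
                currentHref = null
            }
        }
        .build()

    val parser = KsoupHtmlParser(handler = handler)
    parser.write(html)
    parser.end()
    return links
}

fun main() {
    val html = """<p><a href="/docs">Docs</a> and <a href="/faq">FAQ</a></p>"""
    extractLinks(html).forEach { println("${it.text} -> ${it.href}") }
}
```

Because the parser is callback-based, the handler has to track "am I inside a link?" itself. Nested links are not handled in this sketch.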


You can also pass KsoupHtmlOptions to the parser to change its behavior. For example, you can disable the decoding of HTML entities, which is enabled by default:

val options = KsoupHtmlOptions(
    decodeEntities = false,
)
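The effect of decodeEntities can be seen by parsing the same input twice. The sketch below assumes the parser constructor accepts the options through a parameter named options and that the import paths are as shown. extractText is a hypothetical helper.

```kotlin
// Import paths and the `options` parameter name are assumptions.
import com.mohamedrejeb.ksoup.html.parser.KsoupHtmlHandler
import com.mohamedrejeb.ksoup.html.parser.KsoupHtmlOptions
import com.mohamedrejeb.ksoup.html.parser.KsoupHtmlParser

// Hypothetical helper: collects all text, with entity decoding on or off.
fun extractText(html: String, decodeEntities: Boolean): String {
    val text = StringBuilder()

    val handler = KsoupHtmlHandler
        .Builder()
        .onText { chunk -> text.append(chunk) }
        .build()

    val parser = KsoupHtmlParser(
        handler = handler,
        options = KsoupHtmlOptions(decodeEntities = decodeEntities),
    )
    parser.write(html)
    parser.end()
    return text.toString()
}

fun main() {
    val html = "<p>Tom &amp; Jerry</p>"
    println(extractText(html, decodeEntities = true))   // entities decoded
    println(extractText(html, decodeEntities = false))  // entities left as written
}
```

Leaving entities undecoded can be useful when the extracted text is written straight back into HTML, since it avoids a decode and re-encode round trip.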

Encoding and Decoding HTML Entities

You can use the KsoupEntities class to encode and decode HTML entities:

// Encode HTML entities
val encoded = KsoupEntities.encodeHtml("Hello & World") // returns: Hello &amp; World

// Decode HTML entities
val decoded = KsoupEntities.decodeHtml("Hello &amp; World") // returns: Hello & World

KsoupEntities also provides methods to encode and decode only XML entities or only HTML4 entities. The KsoupEntities class is available in the ksoup-entities module.

Both encodeHtml and decodeHtml methods support all HTML5 entities, XML entities, and HTML4 entities.
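A common use for encoding is escaping plain text before embedding it in markup. The sketch below uses only encodeHtml and decodeHtml. toParagraph is a hypothetical helper and the import path is an assumption.

```kotlin
// Import path is an assumption; check it against the version you use.
import com.mohamedrejeb.ksoup.entities.KsoupEntities

// Hypothetical helper: wraps plain text in a paragraph, encoding it first.
fun toParagraph(text: String): String =
    "<p>" + KsoupEntities.encodeHtml(text) + "</p>"

// Encodes and then decodes; the result should equal the input.
fun roundTrip(text: String): String =
    KsoupEntities.decodeHtml(KsoupEntities.encodeHtml(text))

fun main() {
    val input = "Hello & World"
    println(toParagraph(input))
    println(roundTrip(input) == input)
}
```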

Coming Features

  • Add clear documentation
  • Add Markdown parser


