Kotlin for Data Science

    • Kotlin is concise, readable and easy to learn.
    • Static typing and null safety help create reliable, maintainable code that is easy to troubleshoot.
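    As a minimal sketch of what null safety means when working with data, the snippet below models a missing measurement as `null` and lets the compiler enforce that the gap is handled. The `Reading` type and the `meanOfPresent` function are hypothetical names chosen for illustration; only the Kotlin standard library is used.

```kotlin
// Hypothetical record type for illustration; a missing value is modeled as null.
data class Reading(val sensor: String, val value: Double?)

// Mean of the readings that are present, or null when no reading has a value.
fun meanOfPresent(readings: List<Reading>): Double? {
    val present = readings.mapNotNull { it.value } // List<Double>, nulls dropped
    return if (present.isEmpty()) null else present.average()
}

fun main() {
    val readings = listOf(
        Reading("a", 1.0),
        Reading("b", null),
        Reading("c", 3.0)
    )
    // The return type is Double?, so the caller has to decide what "no data" means.
    val mean = meanOfPresent(readings) ?: Double.NaN
    println("mean = $mean") // prints "mean = 2.0"
}
```

    Because `value` is declared as `Double?`, code that tries to do arithmetic on it without a null check does not compile, which moves a common class of data-cleaning bugs from run time to compile time.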

    Notebooks such as Jupyter Notebook and Apache Zeppelin provide convenient tools for data visualization and exploratory research. Kotlin integrates with these tools to help you explore data, share your findings with colleagues, or build up your data science and machine learning skills.

    The Jupyter Notebook is an open-source web application that allows you to create and share documents (aka “notebooks”) that can contain code, visualizations, and markdown text. Kotlin-jupyter is an open-source project that brings Kotlin support to Jupyter Notebook.

    Check out the Kotlin kernel’s project repository for installation instructions, documentation, and examples.

    Apache Zeppelin is a popular web-based solution for interactive data analytics. It provides strong support for the Apache Spark cluster computing system, which is particularly useful for data engineering. Starting from version 0.9.0, Apache Zeppelin comes with a bundled Kotlin interpreter.

    Kotlin in Zeppelin notebook

    Libraries

    The ecosystem of libraries for data-related tasks created by the Kotlin community is rapidly expanding. Here are some libraries that you may find useful:

    • KotlinDL is a high-level Deep Learning API written in Kotlin and inspired by Keras. It offers simple APIs for training deep learning models from scratch, importing existing Keras models for inference, and leveraging transfer learning to adapt existing pre-trained models to your tasks.

    • Kotlin Statistics is a library providing extension functions for exploratory and production statistics. It supports basic numeric list/sequence/array functions (from sum to skewness), slicing operators (such as countBy, simpleRegressionBy), binning operations, discrete PDF sampling, naive bayes classifier, clustering, linear regression, and much more.

    • kmath is a library inspired by NumPy. This library supports algebraic structures and operations, array-like structures, math expressions, histograms, streaming operations, a wrapper around commons-math and koma, and more.

    • krangl is a library inspired by R’s dplyr and Python’s pandas. This library provides functionality for data manipulation using a functional-style API; it also includes functions for filtering, transforming, aggregating, and reshaping tabular data.

    • Lets-Plot is a plotting library for statistical data written in Kotlin. Lets-Plot is multiplatform and can be used not only with JVM, but also with JS and Python.

    • kravis is another library for the visualization of tabular data inspired by R’s ggplot.
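    To give a feel for the functional style that several of these libraries build on, here is a minimal sketch that uses only the Kotlin standard library. It is not the API of krangl or of any other library listed above; the `Sale` type, `totalByRegion`, and the `populationStdDev` extension function are hypothetical names introduced for illustration.

```kotlin
import kotlin.math.sqrt

// Hypothetical row type standing in for one line of a tabular dataset.
data class Sale(val region: String, val product: String, val amount: Double)

// Hypothetical extension function: population standard deviation of a list of doubles.
fun List<Double>.populationStdDev(): Double {
    require(isNotEmpty()) { "standard deviation of an empty list is undefined" }
    val mean = average()
    return sqrt(sumOf { (it - mean) * (it - mean) } / size)
}

// Filter, group, and aggregate: total amount per region,
// counting only sales at or above minAmount.
fun totalByRegion(sales: List<Sale>, minAmount: Double): Map<String, Double> =
    sales.filter { it.amount >= minAmount }
        .groupBy { it.region }
        .mapValues { (_, rows) -> rows.sumOf { it.amount } }

fun main() {
    val sales = listOf(
        Sale("north", "tea", 10.0),
        Sale("north", "coffee", 30.0),
        Sale("south", "tea", 20.0),
        Sale("south", "coffee", 5.0)
    )
    println(totalByRegion(sales, minAmount = 10.0)) // prints "{north=40.0, south=20.0}"
    println(sales.map { it.amount }.populationStdDev())
}
```

    Dedicated data frame and statistics libraries offer the same kind of chained operations over richer table types, so this style carries over once you move beyond plain collections.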

    Since Kotlin provides first-class interop with Java, you can also use Java libraries for data science in your Kotlin code. Here are some examples of such libraries:

    • DeepLearning4J - a deep learning library for Java

    • Smile - a comprehensive machine learning, natural language processing, linear algebra, graph, interpolation, and visualization system. Besides its Java API, Smile also provides a functional Kotlin API along with Scala and Clojure APIs.

      • Smile-NLP-kt - a Kotlin rewrite of the Scala implicits for the natural language processing part of Smile in the format of extension functions and interfaces.
    • Apache Commons Math - a general math, statistics, and machine learning library for Java

    • OptaPlanner - a solver utility for optimization planning problems

    • Charts - a scientific JavaFX charting library in development

    • CoreNLP - a natural language processing toolkit

    • Apache Mahout - a distributed framework for regression, clustering and recommendation

    If this list doesn’t cover your needs, you can find more options in the digest from Thomas Nield.
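
    As a small illustration of the Java interop mentioned above, the sketch below calls a plain Java class from Kotlin without any wrapper code. To keep it self-contained it uses `java.util.DoubleSummaryStatistics` from the JDK rather than one of the libraries listed; the `summarize` function is a hypothetical helper.

```kotlin
import java.util.DoubleSummaryStatistics

// Feed Kotlin values into a Java class from the JDK and return it as-is.
fun summarize(values: List<Double>): DoubleSummaryStatistics {
    val stats = DoubleSummaryStatistics()
    for (v in values) stats.accept(v) // accept(double) is an ordinary Java method
    return stats
}

fun main() {
    val stats = summarize(listOf(1.5, 2.5, 4.0))
    // Java getters such as getMin() and getAverage() read as Kotlin properties.
    println("n=${stats.count} min=${stats.min} max=${stats.max} mean=${stats.average}")
}
```

    Third-party Java libraries are used the same way once they are on the classpath: you import their classes and call them as you would Kotlin ones.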