datahike - lightweight datalog

Konrad Kühne

Clojure Meetup Berlin 2018/9/12

Created: 2018-09-12 Wed 21:42

About

  • Clojure enthusiast since 2012
  • Co-author of replikativ libraries
  • CEO and Co-Founder of lambdaforge
  • Clojure consulting since 2016

Overview

  • Motivation
  • Datalog
  • Datahike
  • Live-Coding

Motivation

  • replikativ: a replication system written in Clojure and based on CRDTs
  • mature building blocks: persistence, communication, cryptography
  • goal: an open-source, cross-runtime, functional, distributed database with a query engine and CRDTs

Datalog

  • declarative logic programming language
  • subset of Prolog
  • queries correspond to relational algebra (selections, joins, projections)
  • similar in purpose to SQL
[:find ?e 
 :where 
 [?e :name "konrad"]]
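
To make the meaning of the query above concrete, here is a minimal sketch in plain Clojure that evaluates the single pattern [?e :name "konrad"] against a small collection of [entity attribute value] triples. The data and the helper name find-entities are made up for illustration; a real Datalog engine works on indices and unification rather than a linear scan.

```clojure
;; Illustrative data: each fact is an [entity attribute value] triple.
(def facts
  [[1 :name "konrad"]
   [1 :city "Berlin"]
   [2 :name "alice"]
   [3 :name "konrad"]])

;; Hypothetical helper: returns the set of result tuples [?e] for which
;; a triple [?e attr value] exists, i.e. the answer to
;; [:find ?e :where [?e attr value]].
(defn find-entities [facts attr value]
  (set (for [[e a v] facts
             :when (and (= a attr) (= v value))]
         [e])))

;; Entities 1 and 3 carry the name "konrad", so the result is the set
;; containing the tuples [1] and [3].
(find-entities facts :name "konrad")
```

The result is a set of tuples rather than a list of rows, which mirrors how Datalog queries of this form return their findings.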

Datahike

  • combination of DataScript and the hitchhiker tree
  • durable, immutable triple store
  • supports most Datomic API calls

DataScript

  • in-memory datalog implementation for Clojure/Script
  • mature (4+ years of development)
  • faster in-memory query engine than Datomic
  • supports a lot of Datomic's API:
    • pull-expressions
    • transactor functions
  • unlike Datomic, allows partial schemas
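
The points above can be tried out with a short DataScript session. This is a sketch that assumes the datascript library is on the classpath; the attribute names (:name, :likes, :friend) and the data are invented for illustration. Only :friend is declared in the schema, because it is a reference; the other attributes are used without any declaration.

```clojure
(require '[datascript.core :as d])

;; Partial schema: only the reference attribute is declared.
(def schema {:friend {:db/valueType :db.type/ref}})

;; Build an immutable in-memory database value from illustrative data.
(def db
  (d/db-with (d/empty-db schema)
             [{:db/id 1 :name "konrad" :likes "datalog" :friend 2}
              {:db/id 2 :name "alice"}]))

;; Datalog query: ids of all entities whose :name is "konrad".
(def konrads
  (d/q '[:find ?e
         :where [?e :name "konrad"]]
       db))

;; Pull expression: selected attributes of entity 1, following :friend.
(def konrad-with-friend
  (d/pull db [:name {:friend [:name]}] 1))
```

Here konrads is the set #{[1]}, and konrad-with-friend is the nested map {:name "konrad", :friend {:name "alice"}}.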

Hitchhiker Tree

  • fractal tree index designed by David Greenberg
  • combination of a B+ tree and an append-only log
  • good read and write performance
  • convenience of a functional, persistent data structure
  • unfortunately no published scientific evaluation so far
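
The idea of pairing a sorted tree with an append-only log can be illustrated with a one-level toy model in plain Clojure. This is not the hitchhiker-tree library's API and leaves out everything that makes the real structure work at scale (inner nodes, per-node buffers that push writes down towards the leaves, storage); the names empty-node, log-capacity, insert, flush-log and lookup are hypothetical.

```clojure
;; Toy model: :log holds pending writes in arrival order,
;; :keys stands in for the sorted B+ tree leaves.
(def empty-node {:log [] :keys (sorted-set)})

;; Assumed buffer size, chosen only for illustration.
(def log-capacity 3)

;; Move all pending writes from the log into the sorted key set.
(defn flush-log [node]
  (-> node
      (update :keys into (:log node))
      (assoc :log [])))

;; Appending to the log is cheap; the sorted structure is only touched
;; once the log holds more than log-capacity entries.
(defn insert [node k]
  (let [node' (update node :log conj k)]
    (if (> (count (:log node')) log-capacity)
      (flush-log node')
      node')))

;; A read has to consult both the sorted keys and the pending log.
(defn lookup [node k]
  (or (contains? (:keys node) k)
      (boolean (some #{k} (:log node)))))

;; Three inserts stay in the log; the fourth triggers a flush.
(def n1 (reduce insert empty-node [5 1 9]))
(def n2 (insert n1 3))
```

After these calls n1 still has all three keys in its log and an empty key set, while n2 has an empty log and the sorted keys 1, 3, 5, 9. Because every operation returns a new value, empty-node and n1 remain unchanged, which is the persistent-data-structure convenience the slide mentions.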

Performance

Figure (perf.png): average execution time for basic Datalog queries

Roadmap and further Ideas

  • ClojureScript port (in progress)
  • Single-transactor replication with dat (in progress)
  • Datalog on a blockchain with datahike underneath (working prototype)
  • CRDT mapping to indices
  • datalog CRDT integration
  • query caching strategies
  • full-text index via Lucene

Live-Coding

References