Distributed Key-Value Store Challenge

Welcome to the distributed key-value store challenge!

You’ll build a distributed key-value store from scratch using Raft consensus, the same algorithm that powers etcd and Consul. Start with a single-node system that handles persistence and crash recovery, then implement leader election, log replication, and fault tolerance.

This is the first project in lsfr’s distributed systems series. It teaches consensus-based replication; later projects cover other patterns such as leaderless replication, CRDTs, and Byzantine fault tolerance.

Stages

  1. HTTP API

Build a basic in-memory key-value store with GET/PUT/DELETE operations over HTTP.
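
A minimal sketch of what stage 1 might look like in Go, assuming keys are addressed as `/kv/<key>` on port 8080; the route layout, port, and plain-text bodies are illustrative choices, not requirements of the challenge.

```go
package main

import (
	"io"
	"net/http"
	"strings"
	"sync"
)

// store is a thread-safe in-memory map; all state lives in RAM at this stage.
type store struct {
	mu   sync.RWMutex
	data map[string]string
}

func (s *store) handle(w http.ResponseWriter, r *http.Request) {
	key := strings.TrimPrefix(r.URL.Path, "/kv/")
	switch r.Method {
	case http.MethodGet:
		s.mu.RLock()
		val, ok := s.data[key]
		s.mu.RUnlock()
		if !ok {
			http.NotFound(w, r)
			return
		}
		io.WriteString(w, val)
	case http.MethodPut:
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		s.mu.Lock()
		s.data[key] = string(body)
		s.mu.Unlock()
	case http.MethodDelete:
		s.mu.Lock()
		delete(s.data, key)
		s.mu.Unlock()
	default:
		w.WriteHeader(http.StatusMethodNotAllowed)
	}
}

func main() {
	s := &store{data: map[string]string{}}
	http.HandleFunc("/kv/", s.handle)
	http.ListenAndServe(":8080", nil) // port chosen arbitrarily for the sketch
}
```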

  2. Persistence

Add persistence to your store. Data should survive clean shutdowns (SIGTERM).
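
One possible approach, sketched in Go: keep serving from the in-memory map, trap SIGTERM, and write the map to disk before exiting, then reload it on the next start. The `store.json` file name and JSON encoding are assumptions, not part of the challenge spec.

```go
package main

import (
	"encoding/json"
	"os"
	"os/signal"
	"syscall"
)

// save writes the whole map to disk; called once, on clean shutdown.
func save(path string, data map[string]string) error {
	b, err := json.Marshal(data)
	if err != nil {
		return err
	}
	return os.WriteFile(path, b, 0o644)
}

// load restores the map on startup; a missing file just means a fresh store.
func load(path string) map[string]string {
	data := map[string]string{}
	if b, err := os.ReadFile(path); err == nil {
		_ = json.Unmarshal(b, &data)
	}
	return data
}

func main() {
	data := load("store.json")
	// ...serve HTTP requests against `data` as in stage 1...

	// Block until SIGTERM, then flush state to disk before exiting.
	sig := make(chan os.Signal, 1)
	signal.Notify(sig, syscall.SIGTERM)
	<-sig
	if err := save("store.json", data); err != nil {
		os.Exit(1)
	}
}
```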

  3. Crash Recovery

Ensure data consistency after crashes. Data should survive unclean shutdowns (SIGKILL).
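
A SIGKILL gives no chance to flush state, so one common technique is a write-ahead log: append and fsync every mutation before acknowledging it, then replay the log on startup, tolerating a torn final record. A rough Go sketch, with an assumed tab-separated record format:

```go
package kv

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// logOp appends one record and forces it to stable storage before the write is
// acknowledged; without the Sync, a crash could lose OS-buffered data.
// (Keys or values containing tabs or newlines would need escaping; omitted here.)
func logOp(f *os.File, op, key, value string) error {
	if _, err := fmt.Fprintf(f, "%s\t%s\t%s\n", op, key, value); err != nil {
		return err
	}
	return f.Sync()
}

// replay rebuilds the in-memory map on startup, skipping a possibly torn final
// record left behind by an unclean shutdown.
func replay(path string) (map[string]string, error) {
	data := map[string]string{}
	f, err := os.Open(path)
	if os.IsNotExist(err) {
		return data, nil // first start: nothing to replay
	}
	if err != nil {
		return nil, err
	}
	defer f.Close()
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		parts := strings.SplitN(sc.Text(), "\t", 3)
		if len(parts) != 3 {
			continue // torn or corrupt record; ignore it
		}
		switch parts[0] {
		case "PUT":
			data[parts[1]] = parts[2]
		case "DELETE":
			delete(data, parts[1])
		}
	}
	return data, sc.Err()
}
```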

  4. Leader Election

Form a cluster and elect a leader using the Raft consensus algorithm.
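
The heart of Raft elections is the RequestVote RPC: a follower grants its vote at most once per term, and only to a candidate whose log is at least as up-to-date as its own. A sketch of that handler, following Figure 2 of the Raft paper; the Go type and field names are assumptions:

```go
package raft

type RequestVoteArgs struct {
	Term         int // candidate's term
	CandidateID  int
	LastLogIndex int // index of the candidate's last log entry
	LastLogTerm  int // term of the candidate's last log entry
}

type RequestVoteReply struct {
	Term        int // currentTerm, so a stale candidate can step down
	VoteGranted bool
}

type Node struct {
	currentTerm int
	votedFor    int // -1 means no vote cast in currentTerm
	log         []struct{ Term int }
}

func (n *Node) HandleRequestVote(args RequestVoteArgs, reply *RequestVoteReply) {
	// Reject candidates from an older term.
	if args.Term < n.currentTerm {
		reply.Term, reply.VoteGranted = n.currentTerm, false
		return
	}
	// Seeing a newer term resets our vote for that term.
	if args.Term > n.currentTerm {
		n.currentTerm, n.votedFor = args.Term, -1
	}
	// Election restriction: only vote for a candidate whose log is at least as
	// up-to-date as ours (compare last terms, then lengths).
	lastIdx, lastTerm := len(n.log)-1, 0
	if lastIdx >= 0 {
		lastTerm = n.log[lastIdx].Term
	}
	upToDate := args.LastLogTerm > lastTerm ||
		(args.LastLogTerm == lastTerm && args.LastLogIndex >= lastIdx)
	// At most one vote per term.
	if (n.votedFor == -1 || n.votedFor == args.CandidateID) && upToDate {
		n.votedFor = args.CandidateID
		reply.VoteGranted = true
	}
	reply.Term = n.currentTerm
}
```

A real node also persists currentTerm and votedFor before replying, resets its election timer when it grants a vote, and uses randomized timeouts to avoid split votes.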

  5. Log Replication

Replicate operations from the leader to followers with strong consistency guarantees.
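
Replication rides on the AppendEntries RPC, and its consistency check is what keeps follower logs identical to the leader's: a follower accepts new entries only if its log matches the leader's at prevLogIndex/prevLogTerm. A sketch of that check, with assumed Go shapes and 0-based indexing:

```go
package raft

type LogEntry struct {
	Term    int
	Command []byte
}

type AppendEntriesArgs struct {
	Term         int        // leader's term
	PrevLogIndex int        // index of the entry immediately before Entries
	PrevLogTerm  int        // term of that entry
	Entries      []LogEntry // empty for heartbeats
	LeaderCommit int        // leader's commit index
}

// accept reports whether the follower's log passes the consistency check; if it
// fails, the leader decrements nextIndex for this follower and retries.
func accept(log []LogEntry, args AppendEntriesArgs) bool {
	if args.PrevLogIndex >= len(log) {
		return false // follower is missing entries before the new ones
	}
	if args.PrevLogIndex >= 0 && log[args.PrevLogIndex].Term != args.PrevLogTerm {
		return false // terms disagree at PrevLogIndex; reject so the leader backs up
	}
	return true
}
```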

  6. Membership Changes

Dynamically add and remove nodes from the cluster without downtime.
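
One way Raft handles this is joint consensus: while a configuration change is in flight, elections and commits must win a majority in both the old and the new member sets. A sketch of that commit rule; the data shapes are assumptions, and the paper's single-server change approach is an alternative:

```go
package raft

// Config holds the voting members; New is non-nil only while a membership
// change is in progress (the "joint" phase).
type Config struct {
	Old []int // node IDs of the outgoing configuration
	New []int // node IDs of the incoming configuration, nil outside a change
}

// majority reports whether more than half of the given voters have acknowledged.
func majority(voters []int, acks map[int]bool) bool {
	count := 0
	for _, id := range voters {
		if acks[id] {
			count++
		}
	}
	return count > len(voters)/2
}

// committed applies the joint-consensus rule: during a change, a decision must
// win a majority in both configurations.
func committed(cfg Config, acks map[int]bool) bool {
	if cfg.New == nil {
		return majority(cfg.Old, acks)
	}
	return majority(cfg.Old, acks) && majority(cfg.New, acks)
}
```

Configuration changes themselves travel through the log like any other entry, which is what lets the cluster switch membership without stopping.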

  7. Fault Tolerance

Handle node failures and network partitions while maintaining safety guarantees.
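
The safety argument rests on quorums: a leader advances its commit index only once an entry from its own term is stored on a majority, so a leader isolated in a minority partition can never commit conflicting writes. A sketch of that rule, including the "don't commit old-term entries by counting replicas" restriction from Figure 8 of the paper; the shapes are assumptions, and matchIndex is taken to include the leader's own last index:

```go
package raft

// maybeAdvanceCommit returns the highest index n > commitIndex that is stored
// on a majority of nodes and belongs to the leader's current term, or
// commitIndex if there is no such index.
func maybeAdvanceCommit(commitIndex, currentTerm int, matchIndex, logTerms []int) int {
	for n := len(logTerms) - 1; n > commitIndex; n-- {
		if logTerms[n] != currentTerm {
			continue // never commit an old-term entry by counting replicas
		}
		count := 0
		for _, m := range matchIndex {
			if m >= n {
				count++
			}
		}
		if count > len(matchIndex)/2 {
			return n
		}
	}
	return commitIndex
}
```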

  8. Log Compaction

Prevent unbounded log growth through snapshots and log truncation.
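
The usual Raft approach is snapshotting: persist the state machine up to the last applied index, then discard the log prefix it covers. A sketch of the in-memory part; encodings and shapes are assumptions, and a complete implementation also needs an InstallSnapshot RPC for followers that have fallen behind the compacted prefix:

```go
package raft

// LogEntry as in the replication sketch.
type LogEntry struct {
	Term    int
	Command []byte
}

// Snapshot captures the applied state plus the log position it replaces.
type Snapshot struct {
	LastIncludedIndex int
	LastIncludedTerm  int
	State             map[string]string // the key-value data itself
}

// compact builds a snapshot through lastApplied (assumed to be a valid 0-based
// index into log) and returns it together with the log suffix to keep.
func compact(log []LogEntry, lastApplied int, state map[string]string) (Snapshot, []LogEntry) {
	snap := Snapshot{
		LastIncludedIndex: lastApplied,
		LastIncludedTerm:  log[lastApplied].Term,
		State:             state,
	}
	// Entries up to and including lastApplied are now covered by the snapshot.
	rest := append([]LogEntry(nil), log[lastApplied+1:]...)
	return snap, rest
}
```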

Getting Started

If you haven’t already, read this overview on how lsfr works and then start with stage 1 (HTTP API).


Discuss on GitHub →