Most SQLite scaling mistakes are not SQL mistakes. They are write-authority mistakes. Teams say they want "replicated SQLite," but that phrase can mean four materially different things: a single writable database with continuous off-site recovery, a leased single primary with local read copies, a network database whose leader applies SQL through Raft, or an embedded consensus layer inside the application cluster. Those are not interchangeable designs, and the fastest way to sort them is to ask a blunt question: when two machines disagree, which layer gets to decide what the truth is?[1][3][4][5][6]

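SQLite itself sets the boundary. Its WAL design allows readers and one writer to coexist efficiently, but the WAL mechanism relies on shared memory on the same host, which is exactly why pushing SQLite beyond one machine always requires a second system around it.[1][7] From there the ecosystem splits into four lanes: Litestream keeps one local writer and streams recovery data off-site, LiteFS leases a single primary among full local copies, rqlite makes a Raft log authoritative behind a network API, and dqlite embeds that consensus layer in the application cluster.[3][4][5][6]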

If you decide on that basis, the comparison gets cleaner than any benchmark chart.
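
Before walking the lanes, the single-host boundary itself is easy to observe. The sketch below uses Python's standard sqlite3 module to show WAL behavior on one machine: a reader keeps working while a write transaction is open, and a second writer is refused. The file name, table, and zero timeout are illustrative assumptions, not recommendations.

```python
import os
import sqlite3
import tempfile


def wal_demo(db_path):
    """Show one-writer, many-readers behavior of WAL mode on a single host."""
    # isolation_level=None puts each connection in autocommit mode, so the
    # explicit BEGIN/COMMIT statements below control transactions directly.
    # timeout=0 makes a blocked writer fail at once instead of waiting.
    writer = sqlite3.connect(db_path, isolation_level=None, timeout=0)
    mode = writer.execute("PRAGMA journal_mode=WAL").fetchall()[0][0]
    writer.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
    writer.execute("INSERT INTO kv VALUES ('a', 'old')")

    reader = sqlite3.connect(db_path, isolation_level=None, timeout=0)
    second_writer = sqlite3.connect(db_path, isolation_level=None, timeout=0)

    # The first writer takes the write lock and leaves its transaction open.
    writer.execute("BEGIN IMMEDIATE")
    writer.execute("UPDATE kv SET v = 'new' WHERE k = 'a'")

    # The reader is not blocked and still sees the last committed value.
    seen_by_reader = reader.execute("SELECT v FROM kv WHERE k = 'a'").fetchall()[0][0]

    # A second writer cannot start while the first holds the write lock.
    try:
        second_writer.execute("BEGIN IMMEDIATE")
        second_writer.execute("ROLLBACK")
        second_writer_blocked = False
    except sqlite3.OperationalError:
        second_writer_blocked = True

    writer.execute("COMMIT")
    after_commit = reader.execute("SELECT v FROM kv WHERE k = 'a'").fetchall()[0][0]

    for conn in (reader, second_writer, writer):
        conn.close()
    return mode, seen_by_reader, second_writer_blocked, after_commit


with tempfile.TemporaryDirectory() as tmp:
    result = wal_demo(os.path.join(tmp, "demo.db"))
print(result)
```

Every tool discussed below is, in one way or another, a decision about who is allowed to be the writer in this picture once more than one machine is involved.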

1. Litestream: keep the file local, make recovery continuous

Litestream is the narrowest and, for many teams, the most honest extension of SQLite. Its own docs describe it as a separate background process that continuously copies WAL pages from disk to one or more replicas.[4] The application still talks to an ordinary local SQLite file. The write path does not move into a cluster coordinator or remote leader. Authority remains where SQLite already expects it to be: inside the one running process and the one writable database file.[1][4]

That is why Litestream is best read as a disaster-recovery lane, not as an automatic failover lane. It improves the backup story dramatically, because the WAL stream can produce point-in-time recovery instead of periodic cold snapshots alone.[4][8] What it does not do is abolish the single-primary reality. If the host dies, you recover or promote from replicated state; you do not keep a quorum-backed write service alive in place.
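
The difference between periodic snapshots and a continuous change stream can be shown with a toy model. This is not Litestream's storage format or API: the Segment type, the integer timestamps, and the restore function are hypothetical names used only to illustrate replaying shipped changes up to a chosen point in time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for one replicated batch of page changes."""
    timestamp: int  # when the batch reached the replica
    pages: dict     # page number -> page contents after this batch


def restore(snapshot, segments, target_time):
    """Rebuild database pages as of target_time from a snapshot plus changes."""
    pages = dict(snapshot)
    for seg in sorted(segments, key=lambda s: s.timestamp):
        if seg.timestamp > target_time:
            break  # stop replaying once we pass the requested point in time
        pages.update(seg.pages)
    return pages


snapshot = {1: "schema v1", 2: "rows 0-99"}
segments = [
    Segment(10, {2: "rows 0-149"}),
    Segment(20, {3: "rows 150-199"}),
    Segment(30, {2: "rows dropped by mistake"}),
]

# Restore to a moment just before the bad write at t=30.
print(restore(snapshot, segments, target_time=25))
```

With snapshots alone, the only restore points would be the snapshot times themselves; the stream of changes is what makes intermediate targets reachable.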

That narrower promise is a strength when the workload is similarly narrow. A single-region app, an internal tool, a small SaaS with one write node, or a control-plane component that mainly needs cheap durability can gain a lot here without giving up local SQLite semantics. The moment you need automatic leader election, local reads on multiple live nodes, or a remote client protocol, you are already asking Litestream to be something it is not.[4][7][8]

2. LiteFS: keep SQLite local everywhere, but appoint one writer

LiteFS takes a different step. Fly's docs say it is a distributed file system built specifically for replicating SQLite databases so each application node can hold a full local copy and respond with low latency.[2][3] That sounds like "distributed SQLite," but the operational center is more precise than that. Because SQLite is still a single-writer system, LiteFS uses a lease mechanism so one node acts as primary at a given time and all writes are directed there.[3]
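
The idea of a lease can be sketched in a few lines. The class below is a toy, single-process model with hypothetical names and an injected clock; it does not reflect how LiteFS actually stores or negotiates its leases, only the rule that at most one node may act as primary at a time.

```python
class Lease:
    """Toy time-bounded lease: at most one holder at any moment."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.holder = None
        self.expires_at = 0.0

    def acquire(self, node, now):
        """Grant or renew the lease if it is free, expired, or already ours."""
        if self.holder in (None, node) or now >= self.expires_at:
            self.holder = node
            self.expires_at = now + self.ttl
            return True
        return False

    def is_primary(self, node, now):
        """True only while this node holds an unexpired lease."""
        return self.holder == node and now < self.expires_at


def write(lease, node, now):
    """Accept a write only on the current primary; otherwise redirect."""
    if lease.is_primary(node, now):
        return "applied on " + node
    return "redirect to " + str(lease.holder)


lease = Lease(ttl=10)
print(lease.acquire("node-a", now=0))   # node-a becomes primary
print(lease.acquire("node-b", now=5))   # refused while the lease is live
print(write(lease, "node-b", now=5))    # writes are sent to the primary
print(lease.acquire("node-b", now=12))  # lease expired, node-b takes over
```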

That decision creates LiteFS's real shape. Reads can stay close to the application on many nodes because each node has a local copy. Writes, however, are serialized through the primary. Under the hood LiteFS intercepts filesystem calls, turns page changes into its own LTX transaction files, tracks replication position with a transaction ID and rolling checksum, and sends changes to replicas asynchronously.[3] When split-brain risk appears, the rolling checksum is what tells LiteFS an out-of-date node must resnapshot instead of pretending its local state is still valid.[3]
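
The position check can be illustrated with a small model. Everything here is an assumption made for illustration: the checksum is an XOR of per-page CRC32 values, chosen only because zlib ships with Python, and the Position type and the needs_resnapshot helper are hypothetical names, not LiteFS APIs.

```python
import zlib
from typing import NamedTuple


class Position(NamedTuple):
    txid: int      # monotonically increasing transaction ID
    checksum: int  # rolling checksum of the whole database at that TXID


def page_sum(pgno, data):
    """Checksum one page, mixing in the page number."""
    return zlib.crc32(pgno.to_bytes(4, "big") + data)


def apply_tx(pos, pages, changes):
    """Apply page changes in place and update the position incrementally."""
    checksum = pos.checksum
    for pgno, data in changes.items():
        if pgno in pages:
            checksum ^= page_sum(pgno, pages[pgno])  # cancel the old page
        checksum ^= page_sum(pgno, data)             # fold in the new page
        pages[pgno] = data
    return Position(pos.txid + 1, checksum)


def needs_resnapshot(replica_pos, primary_history):
    """A replica must resnapshot if the primary never held its position."""
    return primary_history.get(replica_pos.txid) != replica_pos.checksum


start = Position(0, 0)
primary_pages, stale_pages = {}, {}

p1 = apply_tx(start, primary_pages, {1: b"alice"})
s1 = apply_tx(start, stale_pages, {1: b"alice"})       # replicated normally
s2 = apply_tx(s1, stale_pages, {2: b"orphan write"})   # never reached the primary
p2 = apply_tx(p1, primary_pages, {2: b"bob"})          # the primary's transaction 2

history = {p.txid: p.checksum for p in (start, p1, p2)}
print(needs_resnapshot(s1, history))  # same TXID, same checksum
print(needs_resnapshot(s2, history))  # same TXID, different contents
```

The transaction ID alone cannot catch the second case, because both nodes believe they are at transaction 2; the checksum is what exposes the divergence.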

The trade is attractive when your application wants SQLite's file semantics and local read performance on multiple machines, but your team can enforce a clear single-writer discipline. It gets shakier when the surrounding platform is too eager to restart or move nodes. Fly's own overview warns against combining LiteFS with Fly Machines autostop/autostart because a stale machine can win the lease and discard newer replicated state.[2] That warning is the whole lesson in miniature: LiteFS is not magic multi-primary SQL. It is a carefully managed one-primary system with fast local replicas.

3. rqlite: the Raft log is the database contract

rqlite is the cleanest example of moving authority completely above the SQLite file. Its design docs say the Raft log stores committed SQLite commands in execution order and that this log is the authoritative record of every change in the system.[5] Every node applies that same log so the SQLite database stays the same everywhere.[5] Once you accept that model, rqlite stops looking like "SQLite, but a little more available" and starts looking like a small distributed database that happens to use SQLite as its execution engine.
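
The log-first model can be reduced to a small sketch: if every node applies the same committed commands in the same order, every node ends up with the same database. This uses in-memory SQLite databases as stand-ins for cluster nodes; the jobs table and the helper names are hypothetical, and nothing here is rqlite's own code.

```python
import sqlite3


def apply_log(log):
    """Apply committed commands, in log order, to a fresh SQLite database."""
    db = sqlite3.connect(":memory:")
    for statement, params in log:
        db.execute(statement, params)
    db.commit()
    return db


def dump(db):
    """Return the full table contents in a stable order for comparison."""
    return db.execute("SELECT id, name FROM jobs ORDER BY id").fetchall()


log = [
    ("CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT)", ()),
    ("INSERT INTO jobs (name) VALUES (?)", ("reindex",)),
    ("INSERT INTO jobs (name) VALUES (?)", ("backup",)),
    ("UPDATE jobs SET name = ? WHERE id = ?", ("backup-nightly", 2)),
]

node_a = apply_log(log)
node_b = apply_log(log)
print(dump(node_a) == dump(node_b))
```

The sketch only holds as long as each statement is deterministic; a command whose result depends on local time or randomness would need special handling in any system built this way.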

That is why the client model changes too. You do not open a local .db file from your application process and let replication happen in the background. You talk to a network service. rqlite exposes an HTTP API, manages leader election and clustering, and by design pays the cost of Raft consensus on every write.[5] The file on disk matters for recovery and runtime, but it is no longer the highest authority in the system. The authoritative object is the replicated log.
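
A minimal client sketch using only Python's standard library shows what that shift looks like from the application side. It assumes a node listening at http://localhost:4001 and uses the /db/execute and /db/query endpoints of rqlite's HTTP API; the helper names and the jobs table are hypothetical, and the exact response shape should be checked against the rqlite version in use.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:4001"  # assumed address of one rqlite node


def build_execute(statements, base_url=BASE_URL):
    """Build a write request: a JSON array of statements sent by POST."""
    body = json.dumps(statements).encode("utf-8")
    return urllib.request.Request(
        base_url + "/db/execute",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_query(sql, base_url=BASE_URL):
    """Build a read request: the SQL travels in the q query parameter."""
    query = urllib.parse.urlencode({"q": sql})
    return urllib.request.Request(base_url + "/db/query?" + query, method="GET")


def send(request, timeout=5.0):
    """Send a prepared request to a running node and decode the JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the requests needs no server; send() is only for a live cluster.
write_req = build_execute([
    ["INSERT INTO jobs (name) VALUES (?)", "reindex"],
])
read_req = build_query("SELECT id, name FROM jobs")
print(write_req.get_method(), write_req.full_url)
print(read_req.get_method(), read_req.full_url)
```

Note what is absent: there is no file path and no sqlite3 connection anywhere in the client. The application sees a service, not a file.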

This lane fits teams that want explicit consensus semantics and can tolerate the fact that SQLite is no longer app-adjacent local state. For control planes, metadata stores, job schedulers, feature services, or modest shared relational state, that can be a very good trade. The misfit case is equally clear: if the thing you truly value about SQLite is that your app and database share one host and one failure domain, rqlite has already crossed that line on purpose.[1][5][7]

4. dqlite: embed the consensus database inside the cluster

dqlite also turns SQLite writes into Raft-backed cluster state, but it does so in a more embedded form. Canonical's replication docs say the client must connect to the current leader thread, which acts as a gateway between the network client and SQLite.[6] Read transactions can be served directly. For write transactions, the server encodes database changes into a Raft log entry and waits until a quorum has received and persisted those entries before confirming commit.[6]
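
The quorum rule on that write path is simple arithmetic, sketched below with hypothetical helper names. This is the general majority rule used by Raft-style systems, not dqlite's implementation.

```python
def quorum_size(cluster_size):
    """Smallest majority of a cluster."""
    return cluster_size // 2 + 1


def is_committed(persisted_on, cluster_size):
    """A log entry commits once a majority of nodes has persisted it."""
    return len(set(persisted_on)) >= quorum_size(cluster_size)


def tolerated_failures(cluster_size):
    """How many nodes can be lost while writes can still commit."""
    return cluster_size - quorum_size(cluster_size)


# Three-node cluster: the leader alone is not enough, leader plus one is.
print(is_committed({"n1"}, 3))
print(is_committed({"n1", "n2"}, 3))

# Going from 3 to 4 nodes does not buy any extra failure tolerance.
print([tolerated_failures(n) for n in (3, 4, 5)])
```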

The more revealing detail is the storage model. dqlite configures SQLite to use a custom VFS that keeps file images in regular process memory, then persists the Raft log rather than ordinary SQLite files as the durable record.[6] In WAL mode, SQLite still expresses its transactional changes through WAL frames and commit markers, but dqlite intercepts that behavior inside its own VFS and replication layer.[6] That makes dqlite feel less like "run this database beside my app" and more like "build this consensus-backed SQL component into my clustered system."

That is a strong fit when the database is part of the application platform itself and leader-aware embedding is desirable. It is a weaker fit when teams want the operational legibility of a simpler single binary or a plain local SQLite file. In other words, dqlite is powerful precisely because it is less invisible than Litestream and less file-local than LiteFS.

5. Decision map by failure model

If you only have time for one architecture conversation, ask which failure you are actually paying to survive.

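Choose Litestream first when one write node is acceptable and the failure you are paying to survive is losing that host's data: you want continuous off-site recovery without changing how the application talks to SQLite.[4][8]

Choose LiteFS first when you need low-latency local reads on several nodes, can route every write to one leased primary, and run on a platform that will not restart or move nodes aggressively.[2][3]

Choose rqlite first when you want explicit consensus semantics behind a network API and can accept that the database is no longer local to the application process.[5]

Choose dqlite first when the database is part of the application platform itself and you want quorum-backed writes embedded in the cluster.[6]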

6. One falsifier and one boundary check

A good comparison should say when not to use any of the options. SQLite's own guidance remains useful here: if your workload naturally wants many concurrent writers, a remote client/server database, or a scale profile far beyond one machine's comfortable envelope, then stretching SQLite can become an argument with the workload instead of a solution to it.[7] In that case the clean answer is often to choose Postgres or another server database directly and stop paying translation cost across these layers.

The boundary check is simpler: decide whether the truth of the system should live in one file, one leased primary, or one replicated log. That answer usually determines the tool before feature grids do.
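
That boundary check is small enough to write down as a lookup. The sketch below is only a restatement of this article's mapping, with hypothetical names; it is a starting shortlist, not a selection algorithm.

```python
from enum import Enum


class Authority(Enum):
    ONE_FILE = "one file"
    LEASED_PRIMARY = "one leased primary"
    REPLICATED_LOG = "one replicated log"


CANDIDATES = {
    Authority.ONE_FILE: ["Litestream"],
    Authority.LEASED_PRIMARY: ["LiteFS"],
    Authority.REPLICATED_LOG: ["rqlite", "dqlite"],
}


def shortlist(authority, embedded=None):
    """Return the tools whose write-authority model matches the answer.

    For the replicated-log case, embedded=True narrows to the in-application
    option and embedded=False to the network service; None keeps both.
    """
    tools = list(CANDIDATES[authority])
    if authority is Authority.REPLICATED_LOG and embedded is not None:
        return ["dqlite"] if embedded else ["rqlite"]
    return tools


print(shortlist(Authority.ONE_FILE))
print(shortlist(Authority.REPLICATED_LOG, embedded=True))
```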

Bottom line

The SQLite ecosystem beyond one host is healthy precisely because the projects are not trying to solve the same problem. Litestream protects one local authority. LiteFS preserves local files on many nodes but appoints one writer. rqlite promotes the replicated log to the top of the stack. dqlite embeds that log-centered model deeper into the cluster itself.[2][3][4][5][6]

If you remember only one rule, make it this one: pick the system whose write-authority story already matches your failure model. That is where the real architectural cost lives.

Sources

  1. SQLite, "Write-Ahead Logging" - WAL reader snapshots, single-writer behavior, and the same-host shared-memory boundary.
  2. Fly Docs, "LiteFS - Distributed SQLite" - project overview, production-status note, and the warning about autostop/autostart causing stale-lease rollback risk.
  3. Fly Docs, "How LiteFS Works" - LTX transaction files, replication position, lease-based primaries, asynchronous replication, and split-brain recovery.
  4. Litestream docs, "How it works" - separate background process model, continuous WAL copying, and disaster-recovery framing.
  5. rqlite docs, "Design and implementation" - authoritative Raft log, WAL-mode SQLite execution, periodic fsync, and clustering mechanics.
  6. Canonical dqlite docs, "Replication" - leader gateway model, quorum-backed commit path, custom VFS, and Raft-log durability.
  7. SQLite, "Appropriate Uses For SQLite" - fit boundaries between local SQLite deployments and heavier client/server database needs.
  8. Simon Willison, "Why I Built Litestream" - independent note on Litestream's streamed WAL replication and why it expands SQLite's practical use cases.
  9. Wikimedia Commons, "File:Server Rack (54126210834).jpg" - source page for the photographic cover image used in this article.