It’s quite rare for software to offer both a real local file format and also substantial cloud/SaaS behavior (ie other than syncing). Usually it’s one or the other—if the cloud has active behavior, apps are thin clients which don’t expose an accessible on-disk format (eg Notion).
The troubles are well articulated here: inkandswitch.com/local-first.ht
I recently finished re-architecting to expose an on-disk format, thought I’d write a bit about what I learned.
It’s hard to pull this off, as I&S describe. You need to design a file format which syncs efficiently and reliably, which is already tough. Gets tougher when your SaaS is an active replica with its own public API, and when you layer on indexing/querying, migration, sparsity.
There are enterprise solutions which solve these problems (Realm, CouchDB), but one consistent limitation is that they “own” the file format, and often the server component as well. I don’t feel good exposing my on-disk format to local clients if it’s tied to a specific backend.
CRDTs are a useful piece of the puzzle, but they don’t solve the problem as fully as people often imagine. The I&S folks are working diligently on many of these. But one problem I don’t know how to solve is the complexity of the resulting opaque file formats.
In an ideal world, everything’s just plaintext, so you can modify it with whatever tool you like. No libraries needed. Problems with this include semantic sync, indexing, structured data, etc. SQLite is almost as good, if the schema is well-defined: sqlite.org/appfileformat.
Maybe someday automerge’s serialization format will be as durable and universal as SQLite’s, but in the meantime, it’s certainly not amenable to casual concurrent local access in some user script.
Of course, SQLite leaves you to solve syncing. The typical approach is an append-only log of events (CRDT mutations or otherwise) which can be replayed across replicas. Snapshots of entity states are computed from queries across these events.
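A minimal sketch of that event-log pattern, using Python's built-in `sqlite3` module. The `events` table, its columns, and the function names are assumptions made for illustration, not Orbit's actual schema. Events are only ever appended, and an entity's snapshot is computed by replaying its events in order.

```python
import json
import sqlite3

def open_log(path=":memory:"):
    # Hypothetical schema: one row per immutable event.
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS events (
               id TEXT PRIMARY KEY,        -- globally unique event ID
               entity_id TEXT NOT NULL,    -- entity the event applies to
               timestamp INTEGER NOT NULL, -- client-assigned time, used for ordering
               data TEXT NOT NULL          -- JSON object of changed fields
           )"""
    )
    return db

def append_event(db, event_id, entity_id, timestamp, data):
    # INSERT OR IGNORE makes re-delivery of a known event a no-op.
    db.execute(
        "INSERT OR IGNORE INTO events VALUES (?, ?, ?, ?)",
        (event_id, entity_id, timestamp, json.dumps(data)),
    )
    db.commit()

def snapshot(db, entity_id):
    # Replay the entity's events in a deterministic order;
    # later events overwrite fields set by earlier ones.
    state = {}
    rows = db.execute(
        "SELECT data FROM events WHERE entity_id = ? ORDER BY timestamp, id",
        (entity_id,),
    )
    for (data,) in rows:
        state.update(json.loads(data))
    return state

db = open_log()
append_event(db, "e1", "card-1", 100, {"question": "2+2?", "answer": "4"})
append_event(db, "e2", "card-1", 200, {"answer": "four"})
print(snapshot(db, "card-1"))
```

Because rows are never updated in place, the log itself stays trivially replayable; the cost is that every read of an entity is a query over its history unless snapshots are cached.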
That’s the approach Orbit takes: simple event structures (simpler than CRDTs, sacrificing some of their key guarantees to reduce complexity), with well-defined merge operations, in a SQLite db. Users can write scripts to insert their own events if they like.
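To make "well-defined merge operations" concrete, here is a rough sketch of one possible merge: replicas exchange events so that each ends up with the union of both logs, and the current value is chosen by last-writer-wins on timestamp, with ties broken by event ID. The schema, the function names, and the last-writer-wins rule are all assumptions for illustration; the thread does not say which merge rules Orbit actually uses.

```python
import sqlite3

# Hypothetical event table, not Orbit's real schema.
SCHEMA = """CREATE TABLE IF NOT EXISTS events (
    id TEXT PRIMARY KEY, entity_id TEXT NOT NULL,
    timestamp INTEGER NOT NULL, data TEXT NOT NULL)"""

def new_replica():
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    return db

def sync(a, b):
    """Exchange events so both replicas hold the union of the two logs."""
    for src, dst in ((a, b), (b, a)):
        rows = src.execute(
            "SELECT id, entity_id, timestamp, data FROM events"
        ).fetchall()
        # Events already present (same primary key) are skipped.
        dst.executemany("INSERT OR IGNORE INTO events VALUES (?, ?, ?, ?)", rows)
        dst.commit()

def latest_value(db, entity_id):
    # Last-writer-wins: highest timestamp, ties broken by event ID,
    # so every replica with the same events picks the same winner.
    row = db.execute(
        "SELECT data FROM events WHERE entity_id = ? "
        "ORDER BY timestamp DESC, id DESC LIMIT 1",
        (entity_id,),
    ).fetchone()
    return row[0] if row else None

laptop, phone = new_replica(), new_replica()
# A user script can insert its own events with plain SQL.
laptop.execute("INSERT INTO events VALUES ('e1', 'card-1', 100, 'easy')")
phone.execute("INSERT INTO events VALUES ('e2', 'card-1', 200, 'hard')")
sync(laptop, phone)
print(latest_value(laptop, "card-1"), latest_value(phone, "card-1"))
```

Syncing twice changes nothing, since the union of two identical logs is the same log. This is far weaker than a full CRDT (clock skew can reorder writes, for instance), which is the kind of trade-off the tweet above describes.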
I’m still not satisfied with this. Reading/writing data yourself requires using Orbit’s libraries or a SQLite library and an understanding of the schema. I’d like to just expose the data as plaintext, but I’m not yet sure how to achieve that practically.
One approach is to define Git-style “plumbing” CLI primitives which offer plaintext-like I/O to your data structures. Another is to define a FUSE filesystem layer.
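A sketch of what such plumbing primitives might look like, assuming a hypothetical `events` table and a JSON-lines text format (neither is Orbit's real interface). The two functions work on ordinary text streams, so a thin CLI wrapper could connect them to stdin and stdout and let users pipe data through tools like `grep` or `jq`.

```python
import io
import json
import sqlite3

# Hypothetical event table, for illustration only.
SCHEMA = """CREATE TABLE IF NOT EXISTS events (
    id TEXT PRIMARY KEY, entity_id TEXT NOT NULL,
    timestamp INTEGER NOT NULL, data TEXT NOT NULL)"""

def dump_events(db, out):
    """Write every event as one JSON object per line."""
    cur = db.execute(
        "SELECT id, entity_id, timestamp, data FROM events ORDER BY timestamp, id"
    )
    for event_id, entity_id, timestamp, data in cur:
        record = {
            "id": event_id,
            "entity_id": entity_id,
            "timestamp": timestamp,
            "data": json.loads(data),
        }
        out.write(json.dumps(record) + "\n")

def load_events(db, inp):
    """Append events read as JSON lines; already-known IDs are skipped."""
    for line in inp:
        if not line.strip():
            continue  # tolerate blank lines in hand-edited input
        e = json.loads(line)
        db.execute(
            "INSERT OR IGNORE INTO events VALUES (?, ?, ?, ?)",
            (e["id"], e["entity_id"], e["timestamp"], json.dumps(e["data"])),
        )
    db.commit()

db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
text_in = io.StringIO(
    '{"id": "e1", "entity_id": "card-1", "timestamp": 100, "data": {"answer": "4"}}\n'
)
load_events(db, text_in)
text_out = io.StringIO()
dump_events(db, text_out)
print(text_out.getvalue(), end="")
```

The user never touches the SQLite schema directly here: the text format becomes the stable contract, and the database stays an implementation detail behind it.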
Switching gears: another hard thing about implementing this type of data format is that web browsers demand their own implementation. An IDB-based (IndexedDB) Orbit backend was kindly contributed. Excited about absurd-sql here:
Orbit’s cloud server has its own (third, ugh) backend implementation, which it uses to offer APIs, aggregate analytics (for my research), and services like study reminder notifications.
I see now why people don’t generally do this. It was a huge amount of work. I’m honestly pretty frustrated with myself for shaving this yak: too much time polishing my telescope, too little time looking at stars.
As it happens, Orbit’s implementations are quite general (i.e. they have almost no specific knowledge of Orbit’s structures), so perhaps they’ll be of use to others. See store-*, sync, and backend packages here:
