Progressive Operations
How storage backends evolve as query complexity grows. Each tier unlocks new operations, but requires structural commitments the previous tier doesn’t.
How structured a backend is comes down to which operations it supports. Schemas and checks are the means: enforcing them is what makes new operations available. The core thesis follows: many knowledge systems start as filesystems and progressively acquire database-like structure. The progression isn’t arbitrary; each tier is driven by a class of operations that can’t be satisfied at the previous level.
Tier 1, Filesystem
Structural commitment: none beyond path conventions
Operations unlocked:
- Read/write by path
- List/enumerate (directory traversal)
- Full-text search (grep-style, substring match)
- Vector/semantic search: operates on raw content; no schema required
Limitations:
- Queries are global scans (no index)
- No structured fields to filter on
- No relationships between files
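The Tier 1 operations can be sketched with nothing but the standard library. This is a minimal illustration, assuming notes are stored as `.md` files; the vault layout, file names, and the `grep_notes` helper are hypothetical. Every query walks the whole tree, which is the "global scan" limitation listed above.

```python
import tempfile
from pathlib import Path


def grep_notes(root: Path, needle: str) -> list[Path]:
    """Return every .md file under root whose text contains needle (case-insensitive)."""
    hits = []
    for path in sorted(root.rglob("*.md")):      # list/enumerate: directory traversal
        text = path.read_text(encoding="utf-8")  # read by path
        if needle.lower() in text.lower():       # substring match, no index
            hits.append(path)
    return hits


def build_demo_vault() -> Path:
    """Create a throwaway vault with hypothetical notes (illustration only)."""
    root = Path(tempfile.mkdtemp())
    (root / "people").mkdir()
    (root / "meetings").mkdir()
    (root / "people" / "ada.md").write_text("Ada leads the fundraising round.", encoding="utf-8")
    (root / "people" / "grace.md").write_text("Grace maintains the compiler.", encoding="utf-8")
    (root / "meetings" / "kickoff.md").write_text("Kickoff: fundraising plan with Ada.", encoding="utf-8")
    return root


vault = build_demo_vault()
matches = [p.relative_to(vault).as_posix() for p in grep_notes(vault, "fundraising")]
print(matches)  # ['meetings/kickoff.md', 'people/ada.md']
```

The search can find the word "fundraising", but it cannot answer "which people are involved in fundraising": nothing in the files distinguishes a person note from a meeting note except the path convention.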
Tier 2, Document Store
Structural commitment: optional schemas (e.g. frontmatter conventions). Not enforced, but consistently applied.
Operations unlocked:
- Query by structured fields across documents (“all people where closeness: close”)
- Faceted search: filter + sort by frontmatter fields
- Vector search becomes schema-aware (can filter semantic results by field values)
- Field-level updates (change one field without rewriting the whole file)
Limitations:
- No enforced referential integrity: relationships are naming conventions, not constraints
- Aggregations are fragile (depend on field consistency)
- Many-to-many relationships require awkward denormalization
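A field query over frontmatter might look like the sketch below. It assumes documents carry a `---`-delimited block of `key: value` lines; `parse_frontmatter` is a deliberately simplified stand-in for a real YAML parser, and the field names (`type`, `closeness`, `last_contact`) are hypothetical.

```python
def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse a minimal 'key: value' block delimited by '---' lines (not full YAML)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter: the schema is optional at this tier
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def query(docs: dict[str, str], sort_by: str, **filters: str) -> list[str]:
    """Return document names whose fields match all filters, sorted by one field."""
    rows = []
    for name, text in docs.items():
        fields = parse_frontmatter(text)
        if all(fields.get(key) == value for key, value in filters.items()):
            # A missing sort field falls back to "": nothing enforces its presence.
            rows.append((fields.get(sort_by, ""), name))
    return [name for _, name in sorted(rows)]


docs = {
    "ada.md": "---\ntype: person\ncloseness: close\nlast_contact: 2024-03-01\n---\nNotes on Ada.",
    "grace.md": "---\ntype: person\ncloseness: close\nlast_contact: 2024-01-15\n---\nNotes on Grace.",
    "linus.md": "---\ntype: person\ncloseness: distant\n---\nNotes on Linus.",
    "kickoff.md": "No frontmatter here.",
}

close_people = query(docs, sort_by="last_contact", type="person", closeness="close")
print(close_people)  # ['grace.md', 'ada.md']
```

The query works only as long as every author spells `closeness` and its values the same way. A document that writes `Closeness: Close` silently drops out of the result, which is the fragility noted in the limitations.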
Tier 3, Relational
Structural commitment: schemas required, foreign keys, typed fields
Operations unlocked:
- Relational queries (“meetings attended by this person”, “all open action items from meetings this month”)
- Foreign key constraints: referential integrity is enforced
- Aggregations (“intros sent per quarter, by status”)
- Time series: just a table with a timestamp column; no special tier needed
Limitations:
- Many-to-many relationships require join tables (that’s Tier 4)
- Schema migrations have real cost
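As a rough illustration of the relational tier, the sketch below uses Python's built-in `sqlite3` module. The `person` and `intro` tables and their columns are hypothetical; the point is that the foreign key rejects an orphaned row, and that the "intros sent per quarter, by status" aggregation is an ordinary `GROUP BY` over a date column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE person (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE intro (
    id        INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES person(id),
    sent_on   TEXT NOT NULL,  -- ISO date; the time series is just this column
    status    TEXT NOT NULL CHECK (status IN ('sent', 'accepted', 'declined'))
);
""")

conn.executemany("INSERT INTO person (id, name) VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
conn.executemany(
    "INSERT INTO intro (person_id, sent_on, status) VALUES (?, ?, ?)",
    [
        (1, "2024-01-10", "sent"),
        (2, "2024-02-20", "accepted"),
        (1, "2024-03-05", "sent"),
        (2, "2024-05-02", "sent"),
    ],
)

# Referential integrity: an intro pointing at a nonexistent person is rejected.
try:
    conn.execute("INSERT INTO intro (person_id, sent_on, status) VALUES (99, '2024-06-01', 'sent')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Aggregation: intros per quarter, by status.
per_quarter = conn.execute("""
    SELECT strftime('%Y', sent_on) || '-Q' ||
           ((CAST(strftime('%m', sent_on) AS INTEGER) + 2) / 3) AS quarter,
           status,
           COUNT(*) AS n
    FROM intro
    GROUP BY quarter, status
    ORDER BY quarter, status
""").fetchall()

print(orphan_rejected)  # True
print(per_quarter)      # [('2024-Q1', 'accepted', 1), ('2024-Q1', 'sent', 2), ('2024-Q2', 'sent', 1)]
```

Unlike the frontmatter case, a misspelled status never reaches the table: the `CHECK` constraint fails the insert, so the aggregation can trust its input.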
Tier 4, Join Tables
Structural commitment: intersection tables for many-to-many relationships
Operations unlocked:
- True many-to-many queries (“all people who attended meetings tagged #fundraising”)
- Proper intersection entities (a meeting_attendee row can carry its own fields: role, spoke_time, etc.)
- More complex relational queries without denormalization
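The many-to-many case can be sketched with two intersection tables, again using `sqlite3`. The schema below is an assumption for illustration: `meeting_attendee` follows the example named above and carries its own `role` field, while `meeting_tag` and the sample rows are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")
db.executescript("""
CREATE TABLE person  (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE meeting (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE tag     (id INTEGER PRIMARY KEY, label TEXT NOT NULL UNIQUE);

-- Intersection entity: one row per (meeting, person), with its own field.
CREATE TABLE meeting_attendee (
    meeting_id INTEGER NOT NULL REFERENCES meeting(id),
    person_id  INTEGER NOT NULL REFERENCES person(id),
    role       TEXT NOT NULL,
    PRIMARY KEY (meeting_id, person_id)
);
CREATE TABLE meeting_tag (
    meeting_id INTEGER NOT NULL REFERENCES meeting(id),
    tag_id     INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (meeting_id, tag_id)
);
""")

db.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Ada"), (2, "Grace"), (3, "Linus")])
db.executemany("INSERT INTO meeting VALUES (?, ?)", [(1, "Seed round prep"), (2, "Compiler sync")])
db.executemany("INSERT INTO tag VALUES (?, ?)", [(1, "fundraising"), (2, "engineering")])
db.executemany(
    "INSERT INTO meeting_attendee VALUES (?, ?, ?)",
    [(1, 1, "host"), (1, 2, "advisor"), (2, 2, "host"), (2, 3, "guest")],
)
db.executemany("INSERT INTO meeting_tag VALUES (?, ?)", [(1, 1), (2, 2)])

# "All people who attended meetings tagged #fundraising", with each person's role.
attendees = db.execute("""
    SELECT p.name, ma.role
    FROM person p
    JOIN meeting_attendee ma ON ma.person_id = p.id
    JOIN meeting_tag mt      ON mt.meeting_id = ma.meeting_id
    JOIN tag t               ON t.id = mt.tag_id
    WHERE t.label = ?
    ORDER BY p.name
""", ("fundraising",)).fetchall()

print(attendees)  # [('Ada', 'host'), ('Grace', 'advisor')]
```

Grace appears in both meetings with a different role in each, which is exactly what a denormalized attendee list on the meeting row cannot express cleanly.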
Tier 5, Graph
Structural commitment: relationships are first-class entities with their own attributes and types
Operations unlocked:
- Multi-hop traversal (“who introduced me to someone who works at [firm]?”)
- Relationship-typed queries (“what projects is this person a collaborator on vs. a contact for?”)
- Path queries (“how am I connected to X?”)
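The three graph operations can be sketched over a plain list of typed edges. This is an in-memory illustration rather than a graph database: the edge data, the relationship names (`introduced_me_to`, `works_at`, and so on), and the helper functions are all hypothetical, and edges here carry only a type, where a real graph tier would also attach attributes to each relationship.

```python
from collections import deque

# (source, relationship type, target); hypothetical sample data
EDGES = [
    ("me", "knows", "Ada"),
    ("Ada", "introduced_me_to", "Grace"),
    ("Grace", "works_at", "Acme"),
    ("Ada", "collaborator_on", "Atlas"),
    ("Ada", "contact_for", "Borealis"),
]


def introducers_to_firm(edges, firm):
    """Multi-hop: who introduced me to someone who works at `firm`?"""
    employees = {src for src, rel, dst in edges if rel == "works_at" and dst == firm}
    return sorted({src for src, rel, dst in edges
                   if rel == "introduced_me_to" and dst in employees})


def projects_by_role(edges, person):
    """Relationship-typed: group a person's projects by the kind of relationship."""
    roles = {}
    for src, rel, dst in edges:
        if src == person and rel in ("collaborator_on", "contact_for"):
            roles.setdefault(rel, []).append(dst)
    return roles


def shortest_path(edges, start, goal):
    """Path query: breadth-first search along directed edges, keeping edge types."""
    adjacency = {}
    for src, rel, dst in edges:
        adjacency.setdefault(src, []).append((rel, dst))
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [f"-[{rel}]->", nxt]))
    return None  # no connection found


print(introducers_to_firm(EDGES, "Acme"))  # ['Ada']
print(projects_by_role(EDGES, "Ada"))      # {'collaborator_on': ['Atlas'], 'contact_for': ['Borealis']}
print(" ".join(shortest_path(EDGES, "me", "Acme")))
# me -[knows]-> Ada -[introduced_me_to]-> Grace -[works_at]-> Acme
```

In a relational schema each hop would be another self-join with the depth fixed in advance; here the traversal depth is decided at query time.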