Architecture

A wide museum-exhibit visualization on a clean white background. Along the bottom, a horizontal dock of recognizable app icons — Gmail, WhatsApp, Slack, Teams, Outlook, Drive, Calendar, Photos, Voice memos — labelled DATA SOURCES, with delicate floating data fragments rising up into the centre. In the upper centre, a compressed 6-layer pipeline (L0 raw to L5 memory) reading bottom-up. On the left and right, two distinct storage worlds — one labelled file store, one labelled database — both fed by the pipeline.

Your apps already produce the data. Two storage worlds. One pipeline. One library. No migration, no re-platform — read what you already have.

Files canonical. The database is just the index.

Two databases. Five stages. Everything ground-truth on disk.

Drop the database. Rebuild the office from the files on disk in seconds. That is what filesystem-canonical means. The DB is derived state. The files are real.
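A minimal sketch of what "drop the database, rebuild from disk" could look like. It uses Python's built-in sqlite3 as a stand-in for Postgres so the example runs anywhere; the table name files_index, its columns, and the function name rebuild_index are assumptions for illustration, not the product's actual schema.

```python
import hashlib
import sqlite3
from pathlib import Path


def rebuild_index(root: Path, conn: sqlite3.Connection) -> int:
    """Drop the derived index and rebuild it by scanning every file under root."""
    # The index is derived state, so it is safe to throw away.
    conn.execute("DROP TABLE IF EXISTS files_index")
    conn.execute(
        "CREATE TABLE files_index (path TEXT PRIMARY KEY, sha256 TEXT, size INTEGER)"
    )
    count = 0
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()  # read only; the originals are never rewritten
        conn.execute(
            "INSERT INTO files_index VALUES (?, ?, ?)",
            (
                path.relative_to(root).as_posix(),
                hashlib.sha256(data).hexdigest(),
                len(data),
            ),
        )
        count += 1
    conn.commit()
    return count
```

Calling rebuild_index twice gives the same result, which is the property that makes the index disposable.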

Two databases — one for the worker, one for the library

DB A · RUNTIME

The Agent DB

Who is doing what, right now.

agents
routines
issues
wakeups
approvals
channels

DB B · LIBRARY

The Office Library

What the office collectively knows.

emails & threads
conversation turns
files index
post-its
casefiles
entities & mentions

Files on disk stay where they live: inbox, documents, photos, chat logs. Both databases are indexes over these files, and the originals are never rewritten.

Five stages — from raw evidence to a final story

Pull

Gmail · Outlook · WhatsApp · ERP · bank. Raw bytes land on disk; an index row points to them.

Convert

PDF · DOCX · XLSX · photo · audio → readable .md sidecar that points back to the raw byte.

Post-it

Many perspectives read the same file. Each writes one short post-it from its lens.

Story

Post-its cluster by casefile. A story is rewritten once the number of new post-its in its casefile crosses a threshold.

Final story

All active stories woven into one narrative. Rewritten daily. Loaded at every cold-start.
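To make the Convert stage above concrete, here is a hedged sketch of writing a readable .md sidecar that points back to its raw file. The sidecar naming (raw file name plus ".md"), the header fields, and the function name write_sidecar are assumptions; the text extraction itself is out of scope and is passed in as a string.

```python
import hashlib
from pathlib import Path


def write_sidecar(raw_path: Path, readable_text: str) -> Path:
    """Write <raw name>.md next to the raw file, with a header pointing back to it."""
    raw_bytes = raw_path.read_bytes()  # the raw file is only read, never modified
    sidecar = raw_path.with_name(raw_path.name + ".md")
    header = (
        "---\n"
        f"source: {raw_path.name}\n"
        f"source_sha256: {hashlib.sha256(raw_bytes).hexdigest()}\n"
        f"source_bytes: {len(raw_bytes)}\n"
        "---\n\n"
    )
    sidecar.write_text(header + readable_text, encoding="utf-8")
    return sidecar
```

Storing the hash of the raw bytes in the sidecar lets a later reader check that the sidecar still describes the same original.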

Where the work happens

RUNTIME WORK

DB A drives the pipeline

Every stage above is an agent row in DB A. Each fires on a routine or an event.

pull agents wake on schedule
converter wakes on a new evidence row
perspectives wake on a new readable file
story-builder wakes on a post-it cluster threshold
final-story wakes on a daily routine
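The wake rules listed above can be sketched as a small dispatcher. This is an illustration only: the event names, the Dispatcher class, and the default threshold of 3 are hypothetical, and a real runtime would read agents and routines from DB A rather than from a dictionary.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional

# Event name -> agent that wakes on it (event names are assumed for this sketch).
WAKE_ON = {
    "schedule": "pull",
    "new_evidence_row": "converter",
    "new_readable_file": "perspectives",
    "daily_routine": "final-story",
}


@dataclass
class Dispatcher:
    threshold: int = 3  # hypothetical post-it count that triggers a story rewrite
    pending: Counter = field(default_factory=Counter)

    def on_event(self, event: str, casefile: Optional[str] = None) -> List[str]:
        """Return the agents that should wake for this event."""
        if event == "new_post_it":
            # Count new post-its per casefile; wake the story-builder at the threshold.
            self.pending[casefile] += 1
            if self.pending[casefile] >= self.threshold:
                self.pending[casefile] = 0
                return ["story-builder"]
            return []
        agent = WAKE_ON.get(event)
        return [agent] if agent else []
```

Counting per casefile means one busy case does not force rewrites of unrelated stories.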

LIBRARY WRITES

DB B receives everything

Each stage writes rows to DB B that point to the files on disk.

stage 1 → files / emails rows
stage 2 → body_md_path set on the row
stage 3 → one post-it per perspective
stage 4 → story.md on disk, hash in casefiles
stage 5 → final-story.md on disk, hash in casefiles
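As an illustration of "story.md on disk, hash in casefiles", the sketch below writes the story file first and then records its path and SHA-256 in an index row. It uses sqlite3 as a stand-in for Postgres; the directory layout, the column names, and the function name publish_story are assumptions rather than the real schema.

```python
import hashlib
import sqlite3
from pathlib import Path


def publish_story(
    conn: sqlite3.Connection, root: Path, casefile_id: str, story_md: str
) -> str:
    """Write story.md to disk, then record its path and hash in the casefiles index."""
    story_path = root / "casefiles" / casefile_id / "story.md"
    story_path.parent.mkdir(parents=True, exist_ok=True)
    data = story_md.encode("utf-8")
    story_path.write_bytes(data)  # the file on disk is the ground truth
    digest = hashlib.sha256(data).hexdigest()
    conn.execute(
        "CREATE TABLE IF NOT EXISTS casefiles "
        "(id TEXT PRIMARY KEY, story_path TEXT, story_sha256 TEXT)"
    )
    # The row holds only a pointer and a hash, not the story content.
    conn.execute(
        "INSERT OR REPLACE INTO casefiles VALUES (?, ?, ?)",
        (casefile_id, story_path.relative_to(root).as_posix(), digest),
    )
    conn.commit()
    return digest
```

Comparing the stored hash with a fresh hash of the file is a cheap way to detect that the index has drifted from disk.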

Why files-on-disk, not rows in a database

Tools just work

grep · cat · git · markdown viewers · editors — all native.

Offline-readable

No connection. No service. The file is right there on disk.

Survives DB disaster

Drop the index. Re-scan disk. The library is whole.

Provenance native

Every claim resolves to a byte address — file · line · byte.

Easy to share

Attach to email · drop in Slack · open in any viewer.

Lightweight DB

Postgres holds the index, not the content. Stays small. Stays fast.
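The "file · line · byte" address mentioned under Provenance native could be resolved along these lines. The exact address format is not specified here, so this sketch assumes a 1-based line number, a 0-based byte offset within that line, and an extra length field; ByteAddress and resolve are hypothetical names.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ByteAddress:
    file: str    # path relative to the filestore root
    line: int    # 1-based line number (assumed convention)
    byte: int    # 0-based byte offset within that line (assumed convention)
    length: int  # number of bytes quoted (extra field added for this sketch)


def resolve(root: Path, addr: ByteAddress) -> str:
    """Return the exact text a claim points to, read straight from the file."""
    lines = (root / addr.file).read_bytes().splitlines(keepends=True)
    raw_line = lines[addr.line - 1]
    return raw_line[addr.byte : addr.byte + addr.length].decode("utf-8")
```

Because the lookup needs only the file, a claim can be checked with no database connection at all.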

Where this office runs

Ordinary infrastructure. Azure, AWS, GCP, or your own data centre. Filestore for the documents. Postgres for the two indexes. Brains reached via APIs. Nothing magical about the hardware — the sophistication is in how the parts fit together.

Filestore · Postgres · pullers · converters · filesystem-canonical · DB-as-index

How the library is built

A pipeline of agents — pullers, converters, perspective readers, story-builders — turns raw evidence into a structured library. Each stage writes back to disk, with provenance carried at every step.

A wide museum-exhibit visualization on a clean white background showing the library-building pipeline: raw evidence on the left feeding through pullers, converters, perspective readers, and story-builders, with files and database indexes on the right.

Stage by stage, perspective by perspective — how the library is built from what your apps already produce.