Design decisions

The big calls behind imprnt: spend the AI model rarely, search with plain math, keep your data complete, and let nothing run on its own. Most take a lesson from PAI, the personal-AI system imprnt grew out of, and steer the other way. Each card opens into the reasoning behind one call, and each keeps its flips if line in plain view: the one change that would undo the decision.

Ration the model

Pay the model once per note, at filing. Search runs on free local math.

The model is the expensive part, so imprnt spends it where one pass pays off forever: reading, sorting, tagging, and summarizing each item once, at filing time. Search is the opposite case. You run it all day, so it stays on cheap local code with no model in the loop. How it works walks the split end to end.

Filing a note once per item

the model

Searching thousands of times a day

plain local code

The measuring stick recorded here: “the dumbest thing that works” is measured against the model. By that stick, plain ranking math (the BM25 formula, next call down) is the default ranker, never an opt-in.

BM25, not embeddings

Search ranks with decades-old search-engine arithmetic. No AI touches a query.

recall ranks with BM25, a formula search engines have used for decades: plain arithmetic over how often your query’s words appear in each note, and how rare those words are across the vault. Most AI tools reach for vector search instead, where a model turns every note into a long list of numbers (an embedding) and compares distances. recall runs no rerank, no embeddings, no vector store, and no protocol layer over the vault. The mechanics live in How it works. Two calls are specific to this page.

Today recall reads every note on every query, and that holds to about 10k notes. Past that, the path stays local: a fast text pre-filter in front of the ranking carries it to about 100k, and a word index kept on disk (what search engines call an inverted index) carries it past that. Scaling adds a local index, never a vector store or a model.

The other call is the trade itself. Vector search sounds smarter but costs more, re-indexes on every change, and hides why a result matched. BM25 over plain text is free, clear, and good enough once notes are tagged well on the way in.

BM25 over plain text imprnt picks this

Vector search what most tools ship

Cost

Free local arithmetic

A model embeds every note and every query

On a change

Scores the new text on the spot

Re-indexes the changed notes

Why it matched

Plain: the shared terms

Hidden inside the vectors

The data is the knowledge

A note carries the source's full payload. The summary comes on top, never instead.

A note must carry the source’s real payload, whole: the tables, the numbers, the rows, the exact wording. Vault layout states the rule and the test that checks it.

The rule earned its teeth from a real failure. During a vault rebuild, the prose summaries were kept while the tables stayed behind in raw/, the folder of untouched originals that search never reads. Nothing errored, and the rows were silently deleted from everything search could find. The fix is a standing rule, filing adds and never removes, plus a check that can fail, the lookup test: could you answer a specific question from the vault note alone?

the rebuild that failed

Search reads vault notes only, so every row turned invisible.

the rule now

The summary sits on top of the data, never in its place.

Domain-first folders

Folders follow your life: health, money, work. Search never looks at them.

An earlier layout sorted notes by type, and it produced a junk drawer: a single typeless notes/ bucket grew to nearly half the vault, because most of what a person keeps is reference material about some area of life, with no obvious type to file under. The fix is to sort by life-area, the way you actually browse your own knowledge. Folders are browse-drawers only. Search ignores them entirely, so no layout choice ever costs a result.

sorted by type, before

Most knowledge is reference about some area of life, with no type to file under. It all fell into one typeless bucket.

sorted by life-area, now

Notes file where you would browse for them, and every person, organization, and holding keeps one linkable home.

One piece of the type-first idea survived. People, organizations, and holdings (owned things with state worth tracking, like an account or a policy) each keep one cross-cutting home, because an entity is referenced from every area of life and needs a single canonical note. That linkable entity graph is the one real gain over PAI, which kept knowledge as prose files with no typed layer for people and things.

Beyond the vault and three commands, everything is a plugin the core cannot see.

Core is the vault plus three commands: ingest, recall, and check. Everything else is a plugin. The rule, and the test that keeps it honest:

the one rule, and its test

You can add or remove any plugin with zero edits to the core.

This is the most important call on the page, and a direct lesson from PAI. A core bloats when the dependency arrows point from core to module: the core imports, validates, and loads every subsystem, so every new capability grows it. PAI was built that way, and its core carried every feature as always-on overhead, paid in tokens and misfires whether you used the feature or not.

imprnt points the arrows the other way. A plugin depends on exactly two things, your notes and its own folder, and the core stays blind to both.

PAI A fixed set of subsystems, taken whole, each one always-on overhead.

imprnt A small core plus plugins you compose, each with a real off-switch.

Plugins copy the dozen lines they need instead of importing them.

A plugin needs roughly twelve lines of code to read a note’s header. Those lines are copied into every plugin, and the core publishes no code for plugins to import. The reason is reversibility, and the two ways this can go wrong are nowhere near symmetric:

if copying goes wrong Merge the copies into one shared file Five minutes, breaks nobody.

if sharing goes wrong Un-publish code plugins already import A breaking change, plus a magnet for "can it also do X?" creep.

Copy is the move you can undo.

Commands, never a daemon

Nothing runs in the background just by being installed.

A daemon is a program that runs in the background on its own. imprnt ships none. The core and every plugin ship commands, and scheduling them through your machine’s own scheduler (cron, launchd) is your opt-in. A tool that works without you ever thinking about it is also a tool you cannot audit, cannot stop, and do not fully own.

PAI Background machinery running by default, quietly billing you in tokens and misfires.

imprnt Commands that sit still until you schedule them, so every run is one you chose.

Invariants are tests

A design promise counts only if a check can catch it failing.

The most expensive lesson, and another one from PAI, which carried seventeen principles written as values (be private, ration the model). A value cannot be checked against reality, so it rots into recitation. In imprnt, a principle earns its place only as a check that can fail, one that points at a specific breakage it would have caught.

a test "On the real corpus, recall tops the right note" Catches a regression on day one.

a value "Deterministic-first" Gets quoted in a meeting.

Private by being private

The vault is private by never leaving your machine, with no redaction machinery on top.

The whole vault lives on your machine, readable by your account alone. There is no sensitivity label, no redaction pass, no fencing of secrets, because the vault is meant to hold everything, including medical, financial, and personal records.

the only rule

The vault never goes near a public repo.

Publishing a subset, say an article drawn from your notes, is an export-time filter you run consciously, once. Filing stays free of that tax.

imp vs imprnt

One package, two commands: imp for you, imprnt for the machinery.

The package ships two commands from one dispatcher, split by audience. imp is yours: typed in any directory, it opens your assistant’s session there with imprnt riding along, Claude by default (imp --gemini or imprnt agent swaps the backend), and imp lair opens the session inside the vault itself. imprnt is for the machinery: agents and scripts call imprnt recall, imprnt check, and the rest.

one package, one dispatcher

for you imp

Typed in any directory. Opens your assistant's session there with imprnt riding along, Claude by default.

imp lair opens the session inside the vault

imp --gemini swaps the backend

for machinery imprnt

Called by agents and scripts, the machine-facing half of the package.

imprnt recall search the vault

imprnt check audit notes, rebuild the index

Two earlier models lost this fight. Wiring imprnt into the assistant’s global config pays tokens in every session forever, the always-on cost again. A launcher-only design left the vault unreachable from the coding repos where most of the day happens. What won is consent per keystroke: typing imp instead of claude is the entire opt-in, and plain claude stays plain.

Context loads on demand

A session starts with a small pointer. The full filing rules load only when you write.

A session’s upfront load follows the same frequency axis as everything else. Always loaded: your enabled behavior plugins, plus a small pointer that says the vault exists, when to search it, and where to write. The full filing contract loads on demand through imprnt context, so the rules are paid for only by the sessions that actually write, at the moment they write.

every session always loaded, small

your behavior plugins a vault pointer

The pointer says the vault exists, when to search it, and where to write.

only when you write printed on demand

imprnt context prints the full filing contract