How it works

imprnt is plain markdown files on your computer. No database, no cloud, no app to log into. You talk to your assistant, and it drives imprnt underneath, filing what you tell it and recalling what you ask.

A model costs money every time it thinks, and it forgets everything between sessions. So imprnt gives it exactly one job, the one that needs real understanding, and keeps it out of the mechanical work of parsing, filing, and searching. Which side a step lands on comes down to one thing: how often the step runs.

Write path: spend the model once. Runs once per note.

You: hand over a source (a transcript, a doc, a stray fact).
Model (paid): it reads and sorts (summary, tags, links, folder).
Result: clean notes (one source can fan out into several).

Read path: search with cheap local code. Runs thousands of times.

You: ask in plain words (any question, all day long).
Code (free): code ranks every note (counting, no model, no server).
Model: it reads the top hits (a handful of notes), then answers.

That is the whole trick: pay the model for the rare hard job, keep it out of the job you do a hundred times a day.

When you file: the model works

Hand over a transcript or a document, and imprnt runs one pass of four steps. Only one of them spends the model.

1. Copy (code)

The source goes into raw/ untouched, hashed, with the obvious structure pulled out. Free.

2. Understand (model, the only paid step)

For each thing in the source: pick its type and folder, write a one-line summary, pull out the decisions and actions, and tag it with the words you will search for later, even if the source never used them. A dense source fans out into several small notes.

3. File (code)

Look up people by name and alias, flag anything unresolved for review, and write the note without ever overwriting an existing one. Merging a renamed person into their old note is the model's call, made back in step 2.

4. Tidy (code)

check rebuilds the index and flags broken links, topic and event notes that link to no person, organization, or thing, and any note with no tags. It never blocks and never silently drops anything.
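The four steps above can be sketched as one function in which the model is an injected callback, so it is easy to see that only one step spends it. This is a minimal sketch under stated assumptions, not imprnt's code: the `Draft`, `Model`, and `Vault` shapes and every function name are hypothetical, the vault is an in-memory map, hashing is left out, and reporting a path collision instead of resolving it is an assumption.

```typescript
// Hypothetical shapes and names, for illustration only.
type Draft = { id: string; folder: string; summary: string; tags: string[] };
type Model = (source: string) => Draft[];
type Vault = { raw: Map<string, string>; notes: Map<string, Draft> };

// Step 1 (code): keep the source untouched under raw/.
function copy(vault: Vault, name: string, source: string): void {
  vault.raw.set(`raw/${name}`, source);
}

// Step 3 (code): write each draft, never overwriting an existing note.
// Assumption: a path collision is reported for review rather than resolved here.
function file(vault: Vault, drafts: Draft[]): string[] {
  const skipped: string[] = [];
  for (const draft of drafts) {
    const path = `${draft.folder}/${draft.id}.md`;
    if (vault.notes.has(path)) skipped.push(path);
    else vault.notes.set(path, draft);
  }
  return skipped;
}

// Step 4 (code): report problems without blocking or dropping anything.
function tidy(vault: Vault): string[] {
  const warnings: string[] = [];
  vault.notes.forEach((note, path) => {
    if (note.tags.length === 0) warnings.push(`${path}: no tags`);
  });
  return warnings;
}

// One pass over one source. The model runs exactly once, in step 2.
function ingest(vault: Vault, name: string, source: string, model: Model) {
  copy(vault, name, source);
  const drafts = model(source); // step 2 (model): the only paid step
  const skipped = file(vault, drafts);
  return { skipped, warnings: tidy(vault) };
}
```

Passing the model in as a plain function keeps the three code steps testable without spending anything: a test can hand `ingest` a stub that returns fixed drafts.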

The rules for where a note lands are in Vault layout. Handing the model a source you cannot place yourself is exactly what step 2 is for.

Edits are cheap, originals are forever

Renamed: Carl, Engineer becomes Boris, Director, right everywhere. One edit, old name kept as an alias.

Contradicted: the old date becomes the new date, the stale one still traceable. Old line stamped superseded, never overwritten.

One edit fixes a fact everywhere it appears. Say you filed a colleague as Carl, an Engineer, and a later meeting reveals he is Boris, a Director. Filing looks him up by alias, finds the existing note, corrects that one note, and keeps "Carl" in the aliases so old references still resolve. Every other note points at him by his permanent file ID, so the single fix shows the right name and role across every meeting and project.
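The rename can be sketched as follows. The `Person` shape, the `p-001` ID format, and both function names are hypothetical and stand in for whatever imprnt actually stores; the point is only that other notes hold the permanent ID, so one corrected record changes what every note displays.

```typescript
// Hypothetical shapes: the field names are illustrative, not imprnt's schema.
type Person = { id: string; name: string; role: string; aliases: string[] };

// Find a person by current name or any alias, ignoring case.
function resolve(people: Person[], query: string): Person | undefined {
  const q = query.trim().toLowerCase();
  return people.find(
    (p) => p.name.toLowerCase() === q || p.aliases.some((a) => a.toLowerCase() === q),
  );
}

// Correct the one note in place; the old name survives as an alias.
function rename(person: Person, name: string, role: string): void {
  if (person.name !== name && !person.aliases.includes(person.name)) {
    person.aliases.push(person.name);
  }
  person.name = name;
  person.role = role;
}

// Other notes store the permanent ID, never the display name.
const people: Person[] = [{ id: "p-001", name: "Carl", role: "Engineer", aliases: [] }];
const meeting = { title: "Kickoff", attendees: ["p-001"] };

const carl = resolve(people, "carl");
if (carl) rename(carl, "Boris", "Director");

// The meeting note was never edited, yet it now shows the corrected name.
const shown = meeting.attendees.map((id) => people.find((p) => p.id === id)?.name);
console.log(shown);
```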

A contradiction is the special case. If a new meeting says a date moved, filing updates the note and stamps the old line as superseded, so search can tell the live fact from the stale one. Old information is marked, never quietly overwritten.
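A minimal sketch of that stamping, assuming a simple list of fact records. The `Fact` shape, the function names, and the dates are invented for illustration; imprnt's on-disk format is not shown here.

```typescript
// Hypothetical fact record, for illustration only.
type Fact = { key: string; value: string; recorded: string; superseded?: string };

// Record a value for a key. An older live value is stamped, never removed.
function record(facts: Fact[], key: string, value: string, today: string): Fact[] {
  const stamped = facts.map((f) =>
    f.key === key && !f.superseded && f.value !== value ? { ...f, superseded: today } : f,
  );
  // If the same value is already live, there is nothing new to add.
  const alreadyLive = stamped.some((f) => f.key === key && !f.superseded);
  return alreadyLive ? stamped : [...stamped, { key, value, recorded: today }];
}

// Search wants the live fact; the stale one stays in the list, traceable.
function live(facts: Fact[], key: string): Fact | undefined {
  return facts.find((f) => f.key === key && !f.superseded);
}

let facts: Fact[] = [];
facts = record(facts, "launch date", "2025-03-01", "2025-01-10");
facts = record(facts, "launch date", "2025-04-15", "2025-02-02");
console.log(live(facts, "launch date")?.value, facts.length);
```

Recording the same value twice changes nothing, so re-running filing over a source does not pile up duplicate lines.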

Your raw sources are kept forever, one untouched snapshot per source under raw/. To change how notes are structured, re-run filing over the originals and get the new layout for free. You are never stuck in an old format, and any claim in a note traces back to its snapshot.

When you read: plain code works

Ranking is arithmetic, so no model touches it. recall scores your notes with BM25, a ranking formula from the 1990s that runs locally with no server. Three things set a note's score:

How often: repeats count. A note that says your word 4 times outscores one that says it once. Your search words are counted in every note.

How rare: rare words weigh more. "osteopath" outweighs "meeting". A rare word narrows the search more than a common one.

Where it sits: title beats body. The same word counts more in a note's title than in its text.
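The three factors above map onto a small BM25 scorer. This is a sketch, not recall's implementation: `K1` and `B` are commonly used BM25 defaults, `TITLE_WEIGHT` is an assumed value, and counting each title word as several body words is one simple way to make titles count more, chosen here for brevity.

```typescript
// Illustrative constants: K1 and B are common BM25 defaults; TITLE_WEIGHT is an assumption.
const K1 = 1.2;
const B = 0.75;
const TITLE_WEIGHT = 3;

type Note = { title: string; body: string };

const tokenize = (text: string): string[] => text.toLowerCase().match(/[a-z0-9]+/g) ?? [];

// Where it sits: a title occurrence counts TITLE_WEIGHT times a body occurrence.
function termCounts(note: Note): Map<string, number> {
  const counts = new Map<string, number>();
  for (const t of tokenize(note.title)) counts.set(t, (counts.get(t) ?? 0) + TITLE_WEIGHT);
  for (const t of tokenize(note.body)) counts.set(t, (counts.get(t) ?? 0) + 1);
  return counts;
}

function rank(notes: Note[], query: string): { note: Note; score: number }[] {
  const docs = notes.map(termCounts);
  const lengths = docs.map((d) => [...d.values()].reduce((a, c) => a + c, 0));
  const avgLen = lengths.reduce((a, c) => a + c, 0) / Math.max(notes.length, 1);

  return notes
    .map((note, i) => {
      let score = 0;
      for (const term of new Set(tokenize(query))) {
        const tf = docs[i].get(term) ?? 0;
        if (tf === 0) continue;
        // How rare: the fewer notes contain the term, the larger its weight.
        const n = docs.filter((d) => d.has(term)).length;
        const idf = Math.log(1 + (notes.length - n + 0.5) / (n + 0.5));
        // How often: repeats help, with diminishing returns and length normalization.
        const norm = tf + K1 * (1 - B + (B * lengths[i]) / avgLen);
        score += (idf * tf * (K1 + 1)) / norm;
      }
      return { note, score };
    })
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score);
}
```

Calling `rank(notes, "osteopath")` returns only the notes that contain the word, best first. Notes that match nothing are dropped, which is why the result is a short list and not the whole vault.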

The model only turns your question into search words at the start and reads the few best notes at the end. It never does the ranking, and nothing AI-made sits under the search: no embeddings, no vectors (numeric codes a model would have to compute for every note). Because a rare matched word floats to the top on its own, you get a short, sharp list of hits, not a dump of the whole vault. We put a number on how well that holds up.

9 in 10: right note ranked first. ~97%: right note in the top five the assistant reads.

Measured across the two example vaults and 39 everyday questions. A small test, and the number will move. Run bun run eval to check it yourself.
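The two figures are hit rates at rank 1 and rank 5. A sketch of how such a number can be computed follows; the `Case` shape and `hitRate` are hypothetical, the file names are toy data, and none of this is taken from the actual harness or its results.

```typescript
// Hypothetical case shape: a question, the note that should answer it, the ranked results.
type Case = { question: string; expected: string; ranked: string[] };

// Share of cases where the expected note appears within the first k results.
function hitRate(cases: Case[], k: number): number {
  if (cases.length === 0) return 0;
  const hits = cases.filter((c) => c.ranked.slice(0, k).includes(c.expected)).length;
  return hits / cases.length;
}

// Toy data only, not the measured vaults.
const cases: Case[] = [
  { question: "who runs billing?", expected: "a.md", ranked: ["a.md", "b.md"] },
  { question: "when is the launch?", expected: "c.md", ranked: ["b.md", "c.md"] },
  { question: "what did we decide?", expected: "d.md", ranked: ["d.md"] },
  { question: "where is the spec?", expected: "e.md", ranked: ["a.md", "b.md"] },
];

console.log(hitRate(cases, 1), hitRate(cases, 5));
```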

The eval/ folder holds the test harness.

The whole loop, step by step

Who does each step, and why:

Step	Who	Why
Copy the source, hash it, log it	code	mechanical, exact, free
Read messy prose to find its shape	model	nothing to parse, it takes reading
Pick the type, write the summary, pull decisions	model	needs real understanding, the conscious work
Assign tags, set the kind, wire the links	model	judgment about meaning, paid once
File the note in its folder	code	once the type is decided, writing is mechanical
Rebuild the index from every summary	code	a plain read over the note headers
Rank notes for a search	code	fast, free, clear, over thousands of notes
Turn a question into search words, read the top hits	model	it is the interface, with the question in hand

Why no search server

Every tool call the model makes costs tokens, the units AI work is billed in, whatever transport carries the call. The two levers are payload size and caching. So imprnt does the heavy scan in code, hands the model a tight result, and caches locally to skip the re-fetch.

Payload size

Live server: the model re-fetches and re-reads on every read.
imprnt: code does the heavy scan, the model gets a tight result.

Caching

Live server: a round-trip on every read, nothing cached.
imprnt: one batched sync caches locally, every read after is free.

A live server answering search queries breaks both levers, which is why there is no query layer over the vault. A plugin’s sync is fine: one batched call that caches locally, and everything after it reads the cache.
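The sync-then-read pattern can be sketched like this. `createCache` and `fetchAll` are hypothetical names, and `fetchAll` stands in for whatever single batched call a plugin makes. A real sync would likely be asynchronous and write to disk; this version is synchronous and in memory to keep the sketch short.

```typescript
// Hypothetical plugin cache: one batched fetch, then local reads only.
function createCache<T>(fetchAll: () => Record<string, T>) {
  let items: Record<string, T> = {};
  return {
    // One batched call that replaces the local copy.
    sync(): void {
      items = { ...fetchAll() };
    },
    // Reads never leave the machine.
    read(key: string): T | undefined {
      return items[key];
    },
  };
}

// Count remote calls to show that reads cost nothing after the sync.
let remoteCalls = 0;
const cache = createCache(() => {
  remoteCalls += 1;
  return { "inbox/1": "first message", "inbox/2": "second message" };
});

cache.sync();
for (let i = 0; i < 100; i++) cache.read("inbox/1");
console.log(remoteCalls);
```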

Core plus plugins

The only thing always present is the core: the vault plus ingest, recall, and check. Everything else is an optional plugin you add or delete with one command, under one rule that keeps the core small: the core never depends on a plugin. The how is in How plugins work, the why in Design decisions.