What a good provenance badge looks like
UX writing about the transparency contract: what goes inside the badge, what gets omitted, and why the re-run affordance lives next to it. With ASCII mockups of the patterns we use in the review screen, search results, and cluster cards.
By Dipankar · Last updated
If a provenance badge is a promise that “this specific output was produced by this specific engine,” the badge has to be readable without training. It has to answer three questions at a glance: what model, what version, and can I try another? It has to do that in a row of search results without ballooning the row. And it has to mean the same thing whether it sits beside an extracted paragraph, a cluster title, or a vector search hit.
We have iterated on the badge several times during M0. These notes describe where it landed and why.
Anatomy
A badge is three fields rendered as one pill:
┌─────────────────────────────────────────┐
│ qwen3-vl:235b-cloud · v2025.03 · 02:14 │
└─────────────────────────────────────────┘
- Engine id. The left-most field is the stable identifier used across the catalogue. Tesseract reads as tesseract; a VLM reads as its full Ollama tag. We never shorten qwen3-vl:235b-cloud to qwen. Abbreviation was one of the first temptations and one of the first rejections, because “qwen” alone is not a citable reference.
- Version. For binary engines (Tesseract) this is the upstream semver. For cloud-backed VLMs this is a date tag that we reconcile nightly against the provider. If the provider does not expose a version, we surface the date we first observed that model id in our engine catalogue.
- Timestamp. HH:MM of when this specific block was produced. Not the full ISO-8601 (which clutters), but enough to disambiguate the pre-review output from a re-run.
A badge never carries confidence scores. Confidence is useful in the review pane and on the advanced panel of a cluster card, but folding it into the badge would pressure readers to treat it as the headline number, and the headline of a provenance badge is who produced this, not how sure they were.
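The anatomy above can be sketched as a small data structure. This is a minimal illustration, not the product's code: the names `Badge` and `display_version`, the `v%Y.%m` date-tag format (inferred from the mockup), and the sample dates are all assumptions.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass(frozen=True)
class Badge:
    engine_id: str         # full catalogue id, never abbreviated
    version: str           # semver, provider date tag, or first-observed date
    produced_at: datetime  # when this specific block was produced

    def render(self) -> str:
        # Three fields, one pill. HH:MM only, and deliberately no confidence.
        return " · ".join(
            [self.engine_id, self.version, self.produced_at.strftime("%H:%M")]
        )


def display_version(upstream: Optional[str], first_observed: date) -> str:
    """Prefer the version the upstream exposes; otherwise fall back to the
    date the model id was first observed in the catalogue.
    The "vYYYY.MM" output format is an assumption taken from the mockup."""
    if upstream:
        return upstream
    return first_observed.strftime("v%Y.%m")


# Hypothetical sample values, chosen to reproduce the mockup pill.
badge = Badge(
    engine_id="qwen3-vl:235b-cloud",
    version=display_version(None, date(2025, 3, 1)),
    produced_at=datetime(2025, 3, 9, 2, 14),
)
print(badge.render())
```

Leaving confidence out of the type, rather than hiding it at render time, is one way to make the "never carries confidence" rule hard to break by accident.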
Where badges appear
In the review pane
┌─ Region 14 ──────────────────────────────────────────────────────┐
│ "reported from the wharves of Galata that the Russian │
│ steamer..." │
│ │
│ [ tesseract · 5.3.0 · 02:14 ] [ accept ] [ re-run with ⌄ ] │
└──────────────────────────────────────────────────────────────────┘
The badge sits on the same row as the accept and re-run controls because those three things compose one decision: I have seen what produced this, I know my options, I choose to accept or rework. If the badge were in a tooltip, the action would lose the attribution that justifies it.
In search results
#42 p=0.812 Document 117, p.3
"…reported from the wharves of Galata that the Russian steamer…"
[ tesseract · 5.3.0 ] [ embed: bge-small-en-v1.5 ]
Search results have two badges: the engine that extracted the text, and the model that embedded the chunk. We show both because a user comparing two search results can form a legitimate hypothesis like “the MiniLM rows rank differently from the BGE rows” only if both badges are visible side by side.
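A search row with both badges might be modelled roughly as follows. The sketch assumes the compact badge form from the mockup (no timestamp); `CompactBadge`, `SearchHit`, `ranks_by_embedder`, and the second embedding label are hypothetical names used only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CompactBadge:
    label: str  # e.g. "tesseract · 5.3.0" or "embed: <model id>"

    def render(self) -> str:
        return f"[ {self.label} ]"


@dataclass(frozen=True)
class SearchHit:
    rank: int
    score: float
    source: str
    snippet: str
    extracted_by: CompactBadge  # engine that extracted the text
    embedded_by: CompactBadge   # model that embedded the chunk

    def render(self) -> str:
        header = f"#{self.rank} p={self.score:.3f} {self.source}"
        badges = f"{self.extracted_by.render()} {self.embedded_by.render()}"
        return "\n".join([header, f'"{self.snippet}"', badges])


def ranks_by_embedder(hits: List[SearchHit]) -> Dict[str, List[int]]:
    """Group ranks by embedding badge, so rows from different embedding
    models can be compared side by side."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for hit in hits:
        groups[hit.embedded_by.label].append(hit.rank)
    return dict(groups)


hit = SearchHit(
    rank=42,
    score=0.812,
    source="Document 117, p.3",
    snippet="…reported from the wharves of Galata that the Russian steamer…",
    extracted_by=CompactBadge("tesseract · 5.3.0"),
    embedded_by=CompactBadge("embed: bge-small-en-v1.5"),
)
# Hypothetical second row; the embedding label is a placeholder, not a real id.
other = SearchHit(43, 0.790, "Document 204, p.1", "steamers inward bound…",
                  CompactBadge("tesseract · 5.3.0"),
                  CompactBadge("embed: minilm-example"))
print(hit.render())
print(ranks_by_embedder([hit, other]))
```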
In cluster cards
┌─ Migrations across the Bosphorus ────────────────────────────────┐
│ Fourteen fragments, mostly port reporting from 1897–1901. │
│ — theme_llm: gemma3:27b-cloud │
│ │
│ "the wharves of Galata..." — Doc 117, p.3 (tesseract) │
│ "steamers inward bound..." — Doc 204, p.1 (tesseract) │
│ "lo riferiva il console..." — Doc 91, p.2 (qwen3-vl) │
└──────────────────────────────────────────────────────────────────┘
On a cluster card, the theme-writing model is badged at the top of the card and each exemplar carries its extraction engine. The rule is that every human-readable string the product did not author with a keyboard has a badge somewhere within a one-glance radius.
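One way to keep that rule honest is a check that flags any machine-produced string on a card that lacks an attribution. The sketch below is illustrative only; `Attributed`, `ClusterCard`, and `unbadged_strings` are hypothetical names, and treating the theme summary as a single attributed string is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Attributed:
    text: str
    engine_id: Optional[str]  # None or "" means no badge is attached


@dataclass(frozen=True)
class ClusterCard:
    theme: Attributed            # written by the theme model
    exemplars: List[Attributed]  # each carries its extraction engine


def unbadged_strings(card: ClusterCard) -> List[str]:
    """Return every machine-produced string on the card without a badge."""
    items = [card.theme, *card.exemplars]
    return [item.text for item in items if not item.engine_id]


card = ClusterCard(
    theme=Attributed(
        "Fourteen fragments, mostly port reporting from 1897–1901.",
        "gemma3:27b-cloud",
    ),
    exemplars=[
        Attributed("the wharves of Galata...", "tesseract"),
        Attributed("steamers inward bound...", "tesseract"),
        Attributed("lo riferiva il console...", "qwen3-vl"),
    ],
)
print(unbadged_strings(card))  # an empty list means the card satisfies the rule
```

A check like this could run in a component test, so a new card layout that drops an attribution fails before it ships.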
What we decided not to do
- No “AI generated” disclaimer. A badge that says qwen3-vl:235b-cloud is a piece of scholarly apparatus. A banner that says “generated by AI” is a legal posture. We made the mistake in an early prototype of bolting both on; readers ignored the banner entirely and dismissed the badge as redundant. We kept the badge.
- No colour-coded risk. We tried a green/amber/red scheme where high-confidence extractions got a muted badge and low-confidence ones got a warn tint. Reviewers read the colour as a judgement on the engine rather than the region, and argued with it. We moved confidence to the region tint instead, where it belongs.
- No vendor logos. A badge is text. Logos turn provenance into branding, and the moment a researcher sees a logo they stop treating the badge as information and start treating it as an endorsement.
The re-run affordance
The badge is paired with a “re-run with…” trigger that opens a popover. The popover is split
into two tabs (OCR, VLM) with the current engine pre-selected and greyed out. Re-running
produces a new row in the region’s history; the badge swaps to the new engine id but the
previous row is still available from the row-history disclosure on the left edge of the
region card.
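The history behaviour described here amounts to an append-only list per region, with the badge reading from the newest row. The following is a minimal sketch under that assumption; `Row`, `RegionHistory`, the sample texts, and the 02:31 re-run time are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Row:
    engine_id: str
    version: str
    produced_at: str  # HH:MM, as shown in the badge
    text: str


@dataclass
class RegionHistory:
    rows: List[Row] = field(default_factory=list)

    def record(self, row: Row) -> None:
        # Re-runs append; earlier rows are never overwritten.
        self.rows.append(row)

    @property
    def current(self) -> Row:
        # The badge reflects the most recent row (assumes at least one row).
        return self.rows[-1]

    @property
    def previous(self) -> List[Row]:
        # What the row-history disclosure would list.
        return self.rows[:-1]

    def badge(self) -> str:
        row = self.current
        return f"[ {row.engine_id} · {row.version} · {row.produced_at} ]"


history = RegionHistory()
history.record(Row("tesseract", "5.3.0", "02:14", "first extraction"))
history.record(Row("qwen3-vl:235b-cloud", "v2025.03", "02:31", "re-run output"))
print(history.badge())
print([row.engine_id for row in history.previous])
```

Because rows are frozen and only ever appended, the pre-review output stays distinguishable from the re-run by engine id and timestamp.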
The re-run button is never the default. In the review pane, Accept is the large button;
“re-run with…” is a secondary control. In search results and cluster cards, the badge is purely
informational and the re-run affordance is gated behind clicking through to the review
pane. We resisted every design iteration where a researcher could re-run a region from a
search result, because the cost of mis-clicking a re-run in a scanning view is two minutes
of compute and a brief jitter in their own mental model of the dataset.
The transparency contract, stated plainly
What the badge promises:
- Every textual output in the product carries, on the same screen, an attribution to the engine that produced it.
- Engine ids are stable: what appears in one snapshot resolves to the same model identity in every future snapshot.
- Every output paired with a badge has a re-run path that is one or two clicks away.
What the badge does not promise:
- That the engine is correct.
- That the engine’s weights will remain available upstream.
- That we have any editorial opinion about the engine’s output.
The badge is a pointer, not an endorsement. That is the whole shape of the transparency contract, and it is the reason we obsess about the pixels.