Archeglyph
· transparency · ux · design

What a good provenance badge looks like

UX writing about the transparency contract: what goes inside the badge, what gets omitted, and why the re-run affordance lives next to it. With ASCII mockups of the patterns we use in the review screen, search results, and cluster cards.

By Dipankar

On this page
  1. Anatomy
  2. Where badges appear
  3. In the review pane
  4. In search results
  5. In cluster cards
  6. What we decided not to do
  7. The re-run affordance
  8. The transparency contract, stated plainly

If a provenance badge is a promise that “this specific output was produced by this specific engine,” the badge has to be readable without training. It has to answer three questions at a glance: what model, what version, and can I try another? It has to do that in a row of search results without ballooning the row. And it has to mean the same thing whether it sits beside an extracted paragraph, a cluster title, or a vector search hit.

We have iterated on the badge several times during M0. These notes describe where it landed and why.

Anatomy

A badge is three fields rendered as one pill:

┌─────────────────────────────────────────┐
│ qwen3-vl:235b-cloud · v2025.03 · 02:14 │
└─────────────────────────────────────────┘
  • Engine id. The left-most field is the stable identifier used across the catalogue. Tesseract reads as tesseract, a VLM reads as its full Ollama tag. We never shorten qwen3-vl:235b-cloud to qwen — abbreviation was one of the first temptations and one of the first rejections, because “qwen” alone is not a citable reference.
  • Version. For binary engines (Tesseract) this is the upstream semver. For cloud-backed VLMs this is a date tag that we reconcile nightly against the provider. If the provider does not expose a version, we surface the date we first observed that model id in our engine catalogue.
  • Timestamp. The HH:MM at which this specific block was produced. Not the full ISO-8601 timestamp (which would clutter the pill), but enough to disambiguate the pre-review output from a re-run.

A badge never carries confidence scores. Confidence is useful in the review pane and on the advanced panel of a cluster card, but folding it into the badge would pressure readers to treat it as the headline number, and the headline of a provenance badge is who produced this, not how sure they were.
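
The anatomy above can be sketched as a tiny value type. This is an illustrative sketch, not our actual implementation; the names `Badge` and `render` are assumptions, but the three fields and the deliberate absence of a confidence field match the rules described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Badge:
    """The three fields of a provenance badge, rendered as one pill."""
    engine_id: str   # full catalogue id, never abbreviated
    version: str     # upstream semver, or a date tag for cloud VLMs
    hhmm: str        # HH:MM of production, enough to tell a re-run apart

    def render(self) -> str:
        # Confidence is deliberately not a field here: it belongs to the
        # region tint and the review pane, not to the attribution pill.
        return f"[ {self.engine_id} · {self.version} · {self.hhmm} ]"


badge = Badge("qwen3-vl:235b-cloud", "v2025.03", "02:14")
print(badge.render())  # → [ qwen3-vl:235b-cloud · v2025.03 · 02:14 ]
```

Making the type frozen mirrors the contract: a badge describes what happened, so nothing should mutate it after the fact.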

Where badges appear

In the review pane

┌─ Region 14 ──────────────────────────────────────────────────────┐
│ "reported from the wharves of Galata that the Russian           │
│  steamer..."                                                    │
│                                                                 │
│ [ tesseract · 5.3.0 · 02:14 ]  [ accept ]  [ re-run with ⌄ ]    │
└──────────────────────────────────────────────────────────────────┘

The badge sits on the same row as the accept and re-run controls because those three things compose one decision: I have seen what produced this, I know my options, I choose to accept or rework. If the badge were in a tooltip, the action would lose the attribution that justifies it.

In search results

 #42  p=0.812   Document 117, p.3                                 
 "…reported from the wharves of Galata that the Russian steamer…" 
 [ tesseract · 5.3.0 ]  [ embed: bge-small-en-v1.5 ]              

Search results have two badges: the engine that extracted the text, and the model that embedded the chunk. We show both because a user comparing two search results can form a legitimate hypothesis like “the MiniLM rows rank differently from the BGE rows” only if both badges are visible side by side.
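
The two-badge rule means a result row carries both attributions as data, not as an afterthought of the template. A minimal sketch, with hypothetical names (`SearchHit`, `badges`):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchHit:
    """One search result row: the text plus both models that touched it."""
    snippet: str
    extract_engine: str   # engine that produced the text, e.g. tesseract
    embed_model: str      # model that embedded the chunk for retrieval

    def badges(self) -> str:
        # Both badges render side by side, so rows retrieved via
        # different embedding models can be compared at a glance.
        return f"[ {self.extract_engine} ]  [ embed: {self.embed_model} ]"


hit = SearchHit("…the wharves of Galata…", "tesseract", "bge-small-en-v1.5")
print(hit.badges())  # → [ tesseract ]  [ embed: bge-small-en-v1.5 ]
```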

In cluster cards

┌─ Migrations across the Bosphorus ────────────────────────────────┐
│ Fourteen fragments, mostly port reporting from 1897–1901.       │
│ — theme_llm: gemma3:27b-cloud                                   │
│                                                                 │
│ "the wharves of Galata..."         — Doc 117, p.3  (tesseract)  │
│ "steamers inward bound..."         — Doc 204, p.1  (tesseract)  │
│ "lo riferiva il console..."        — Doc 91,  p.2  (qwen3-vl)   │
└──────────────────────────────────────────────────────────────────┘

On a cluster card, the theme-writing model is badged at the top of the card and each exemplar carries its extraction engine. The rule is that every human-readable string the product did not author with a keyboard has a badge somewhere within a one-glance radius.

What we decided not to do

  • No “AI generated” disclaimer. A badge that says qwen3-vl:235b-cloud is a piece of scholarly apparatus. A banner that says “generated by AI” is a legal posture. In an early prototype we made the mistake of bolting both on; readers ignored the banner entirely and dismissed the badge as redundant. We kept the badge.
  • No colour-coded risk. We tried a green/amber/red scheme where high-confidence extractions got a muted badge and low-confidence ones got a warn tint. Reviewers read the colour as a judgement on the engine rather than the region, and argued with it. We moved confidence to the region tint instead, where it belongs.
  • No vendor logos. A badge is text. Logos turn provenance into branding, and the moment a researcher sees a logo they stop treating the badge as information and start treating it as an endorsement.

The re-run affordance

The badge is paired with a re-run with… trigger that opens a popover. The popover is split into two tabs (OCR, VLM) with the current engine pre-selected and greyed out. Re-running produces a new row in the region’s history; the badge swaps to the new engine id but the previous row is still available from the row-history disclosure on the left edge of the region card.
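
The append-only history behind that behaviour can be sketched as follows. This is a simplified model, assuming Python and invented names (`Row`, `Region`, `rerun`); the 02:31 timestamp on the second row is illustrative.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Row:
    engine_id: str
    version: str
    hhmm: str
    text: str


@dataclass
class Region:
    """A region keeps every extraction it has seen. The badge always
    reflects the newest row; older rows stay reachable in history."""
    history: List[Row] = field(default_factory=list)

    @property
    def current(self) -> Row:
        return self.history[-1]

    def rerun(self, row: Row) -> None:
        # A re-run appends; it never overwrites. The previous row stays
        # available from the row-history disclosure on the region card.
        self.history.append(row)


region = Region([Row("tesseract", "5.3.0", "02:14", "reported from the wharves...")])
region.rerun(Row("qwen3-vl:235b-cloud", "v2025.03", "02:31", "reported from the wharves of Galata..."))
assert region.current.engine_id == "qwen3-vl:235b-cloud"
assert len(region.history) == 2  # the tesseract row is still there
```

Keeping history append-only is what lets the badge swap to the new engine id without the old attribution ever being lost.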

The re-run button is never the default. In the review pane, Accept is the large button; re-run with… is a secondary. In search results and cluster cards, the badge is purely informational and the re-run affordance is gated behind clicking through to the review pane. We resisted every design iteration where a researcher could re-run a region from a search result, because the cost of mis-clicking a re-run in a scanning view is two minutes of compute and a brief jitter in their own mental model of the dataset.

The transparency contract, stated plainly

What the badge promises:

  1. Every textual output in the product carries, on the same screen, an attribution to the engine that produced it.
  2. Engine ids are stable: what appears in one snapshot resolves to the same model identity in every future snapshot.
  3. Every output paired with a badge has a re-run path that is one or two clicks away.

What the badge does not promise:

  1. That the engine is correct.
  2. That the engine’s weights will remain available upstream.
  3. That we have any editorial opinion about the engine’s output.

The badge is a pointer, not an endorsement. That is the whole shape of the transparency contract, and it is the reason we obsess about the pixels.