Guides
Short, practical walkthroughs. Aimed at the working researcher, not the implementer — if you want the engineering details, see the articles.
- export · archiving · snapshots · how-to
Exporting and archiving a dataset
A forward-looking but grounded walkthrough of Archeglyph's dataset snapshot: what goes into the tarball, how to open it without the product, and how to cite a snapshot in a paper.
- review · ocr · vlm · how-to
Reviewing a noisy scan
A walkthrough of the review screen on a low-quality scan: what to look for, how to read the confidence tint, and when to re-run a region — or the whole page — with a VLM instead.
- extraction · ocr · vlm · decision
OCR vs VLM: a practical chooser
A short, decision-oriented guide to picking the right extraction engine for your corpus: when Tesseract is the right default, when a VLM is worth the cost, and how to test the choice cheaply.
- getting-started · tutorial
Your first dataset
End-to-end walkthrough: sign in, create a dataset, upload pages, watch the pipeline run, review a document, and run your first search.
- pipeline · overview
The pipeline
A plain-language tour of the four stages a document passes through in Archeglyph: upload, assess, extract, and analyse. Written for the person using the product, not the person building it.