Privacy
Last updated:
This page explains, in plain language, what Archeglyph stores on your behalf, how long we keep it, and who else sees it. It is not legal boilerplate; if you spot a gap, please write to [email protected] and we will fix it.
What we store
- Your email address. Used to send magic-link sign-in emails and, if you opt in, occasional product notices. We never sell it or share it with advertisers.
- Session cookies. A single httpOnly cookie named ag_sess identifies your browser to the API. It expires when you sign out or after 30 days of inactivity. No third-party trackers, no analytics cookies.
- Uploaded files. PDFs and images you upload to a dataset are stored in object storage under keys that only your account can read.
- Extracted text. The text our pipeline extracts from those files, the per-region bounding boxes, and the name of the engine that produced each block.
- Derived embeddings and clusters. Vector embeddings of your text, a per-dataset lexical index, and the clusters we fit over them. These live inside your dataset's snapshot bundle.
- Audit trail. A short log of dataset mutations (who created, uploaded, or re-ran what) so you can reconstruct how a dataset got to its current state.
What we don't store
- Request logs that include IP addresses, for longer than 30 days.
- Any payment information (we do not charge for the product yet).
- Analytics from third-party trackers. There are none on the public site or in the app.
Retention
- Files, extracted text, embeddings, and clusters are retained for as long as the parent dataset exists. Deleting a dataset deletes all of them, including the snapshot bundle, within 30 days.
- Deleting your account removes all datasets you own and your email from our database within 30 days.
- Session records are retained for 90 days after expiry for security forensics.
Third-party processors
A small number of services see your data in the course of running the product. Each is listed with the category of data it sees:
- Ollama Cloud — when you pick a hosted VLM or text model for layout assessment, extraction, or cluster labels, the relevant page image or text chunk is sent to Ollama Cloud for that request. Ollama's privacy terms apply to that transit.
- Email relay (the SMTP provider we route outbound mail through, configured per deployment) — sees your email address and the magic-link message body.
- Hetzner — our servers live in Hetzner's EU data centres. Your files, extracted text, and database rows sit on Hetzner disks.
- MinIO — self-hosted on Hetzner; holds original files, page images, and dataset snapshot bundles. Not a separate processor; listed for completeness.
Your rights
You can download, rename, or delete any dataset at any time. You can export your account's datasets as snapshot bundles and delete your account from the settings page. For GDPR data access, rectification, or erasure requests outside of those flows, email [email protected].
Contact
Privacy questions: [email protected]. We aim to respond within two working days.