Archēglyph

Embedding

A list of numbers, usually 384 to 1024 of them (one per dimension), that represents the meaning of a chunk of text.


An embedding is a vector of numbers that represents the meaning of a chunk of text. Two passages with similar meaning end up with vectors that lie close to each other in the high-dimensional space the vectors share. Embeddings underpin semantic search, clustering, and finding neighbouring passages.
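To make "close to each other" concrete, here is a minimal sketch using cosine similarity, one common way to compare embeddings. The four-dimensional vectors and their names are invented for illustration; real embeddings have hundreds of dimensions and come from an embedding model, and the entry does not say which similarity measure Archēglyph uses.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors (assumption: made up, not produced by any model).
harvest_a = [0.9, 0.1, 0.0, 0.2]  # a passage about a grain harvest
harvest_b = [0.8, 0.2, 0.1, 0.3]  # a differently worded passage, same topic
shipping = [0.0, 0.9, 0.8, 0.1]   # a passage about an unrelated topic

sim_close = cosine_similarity(harvest_a, harvest_b)
sim_far = cosine_similarity(harvest_a, shipping)

print(f"similar topic:   {sim_close:.3f}")
print(f"unrelated topic: {sim_far:.3f}")
```

A score near 1 means the two vectors point in almost the same direction; a score near 0 means they are nearly orthogonal, which is read as "unrelated".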

Why it matters for your research. Embeddings are what make “find things similar to this paragraph” possible across a corpus, regardless of whether the query’s words appear in the match. An embedding is a measurement, not an answer; it places passages in a geometry so that other tools (search, clustering) can act on them.
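The "find things similar to this paragraph" idea can be sketched as a nearest-neighbour lookup: embed the query, score every stored vector against it, and keep the best few. The passage ids, the three-dimensional vectors, and the function `nearest_passages` below are hypothetical, and a brute-force scan like this is only an illustration of the principle, not a description of how any particular tool implements search.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_passages(query_vector, corpus, k=2):
    """Return the k (passage_id, score) pairs most similar to the query."""
    scored = [
        (passage_id, cosine_similarity(query_vector, vector))
        for passage_id, vector in corpus.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical corpus: passage id -> embedding (toy values, assumption).
corpus = {
    "p1": [0.9, 0.1, 0.0],
    "p2": [0.1, 0.9, 0.1],
    "p3": [0.7, 0.3, 0.1],
}

# Hypothetical embedding of the query paragraph.
query = [0.8, 0.2, 0.0]

top = nearest_passages(query, corpus, k=2)
for passage_id, score in top:
    print(passage_id, round(score, 3))
```

Nothing in the lookup inspects the words of the passages; only the geometry of the vectors is used, which is why a match need not share vocabulary with the query.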

In Archēglyph. Every chunk in your bundle is embedded by a named embedding model, and the model id is recorded. Clusters, neighbours, and semantic search all operate on these vectors.
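One reason recording the model id is useful: vectors from different embedding models live in different spaces, so distances between them are not meaningful. The sketch below shows one way a record and a guard could look. The class `EmbeddedChunk`, its field names, the model ids, and `check_comparable` are all hypothetical and are not Archēglyph's actual bundle format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddedChunk:
    """Hypothetical record: a chunk's vector plus the model that produced it."""
    chunk_id: str
    model_id: str
    vector: tuple


def check_comparable(a, b):
    """Raise if two embedded chunks should not be compared with each other."""
    if a.model_id != b.model_id:
        raise ValueError(
            f"{a.chunk_id} and {b.chunk_id} were embedded by different models "
            f"({a.model_id!r} vs {b.model_id!r})"
        )
    if len(a.vector) != len(b.vector):
        raise ValueError("vectors have different numbers of dimensions")


# Made-up model ids and toy vectors, for illustration only.
first = EmbeddedChunk("c1", "example-model-v1", (0.2, 0.7, 0.1))
second = EmbeddedChunk("c2", "example-model-v1", (0.3, 0.6, 0.2))
other = EmbeddedChunk("c3", "example-model-v2", (0.3, 0.6, 0.2))

check_comparable(first, second)  # same model: passes silently

try:
    check_comparable(first, other)
except ValueError as error:
    print("refused:", error)
```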

Not to be confused with. An embedding is not a database row or a summary of the text. It is a point in space; distances between points encode similarity.
