Training vs inference
Training is how a model learns — done once, expensively. Inference is how the model is used — each prediction is a forward pass over frozen weights.
Training is how a model learns: it happens once (or a handful of times), costs a lot of compute, and produces the set of weights that define the model. Inference is how the model is used: each prediction is a forward pass over those frozen weights. A single inference call is far cheaper than a training run, but a busy service pays inference costs continuously.
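To make the distinction concrete, here is a minimal sketch using a toy one-variable linear model rather than a real neural network. The names `train`, `predict`, `DATA`, and `WEIGHTS` are illustrative, not taken from any library. Training loops over the data many times and updates the weights; inference is a single forward pass that only reads them.

```python
# Toy model: y = w * x + b, fitted by gradient descent on mean squared error.

def predict(weights, x):
    """Inference: one forward pass. Reads the weights, never changes them."""
    w, b = weights
    return w * x + b


def train(data, epochs=1000, lr=0.05):
    """Training: many forward passes, each round followed by a weight update."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            error = predict((w, b), x) - y  # forward pass on one example
            grad_w += 2 * error * x / n     # gradient of MSE with respect to w
            grad_b += 2 * error / n         # gradient of MSE with respect to b
        w -= lr * grad_w                    # weight update: happens only in training
        b -= lr * grad_b
    return (w, b)  # returned as a tuple, so the weights are effectively frozen


# Training: done once, costing epochs * len(data) forward passes plus updates.
DATA = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # samples of y = 2x + 1
WEIGHTS = train(DATA)

# Inference: each call is one cheap forward pass over the frozen weights.
print(predict(WEIGHTS, 10.0))  # approximately 21.0, since the data follow y = 2x + 1
```

In this sketch training performs 4,000 forward passes plus the gradient work, while each later prediction performs one. A real model has the same shape at a vastly larger scale.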
Why it matters for your research. When you read “AI is expensive”, check which phase is meant. Training a frontier LLM costs tens of millions of dollars or more and happens at only a handful of labs. Using one typically costs cents or less per query. Privacy also lives mostly at inference time: “your data went to OpenAI” means their inference servers processed it, not necessarily that a model was retrained on it.
In Archēglyph. Inference only. We never retrain models on your corpus; the corpus stays in your bundle. Inference happens either locally (sentence-transformers for embeddings) or via a cloud provider with our API key, and the model id is recorded either way.
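The idea of recording the model id on every inference call, whether local or cloud, can be sketched as follows. This is not Archēglyph's actual code: `EmbeddingResult`, `run_inference`, `fake_embed`, and the model id `example-model-v1` are all hypothetical names chosen for illustration. In a real local setup, the `embed` argument might wrap a sentence-transformers model's `encode` method.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Any function that turns texts into vectors counts as an embedder here.
Embedder = Callable[[Sequence[str]], List[List[float]]]


@dataclass(frozen=True)
class EmbeddingResult:
    """Hypothetical record: the vectors plus provenance for how they were made."""
    model_id: str               # which model produced the vectors
    backend: str                # "local" or "cloud"
    vectors: List[List[float]]


def run_inference(texts: Sequence[str], embed: Embedder,
                  model_id: str, backend: str) -> EmbeddingResult:
    """Inference only: call an existing model and record which one it was."""
    if backend not in ("local", "cloud"):
        raise ValueError(f"unknown backend: {backend!r}")
    return EmbeddingResult(model_id=model_id, backend=backend, vectors=embed(texts))


def fake_embed(texts: Sequence[str]) -> List[List[float]]:
    """Stand-in for a real model: maps each text to a two-number vector."""
    return [[float(len(t)), float(t.count(" "))] for t in texts]


result = run_inference(["alpha beta", "gamma"], fake_embed,
                       model_id="example-model-v1", backend="local")
print(result.model_id, result.backend, len(result.vectors))
```

Nothing in this path updates any weights. The model is only called, and the record makes it possible to tell later which model produced a given set of vectors.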
Not to be confused with. Fine-tuning continues training an already-trained model on a smaller, targeted dataset. Because it changes the weights, it is a slice of training, not inference.