Next-Token Prediction

NLP

/dictionary/next-token-prediction

Definition

The training objective of every modern LLM: given a sequence of tokens so far, predict the most likely next token. Run this in a loop and you get ChatGPT.

Posts that use this term

The local-LLM vocabulary
Parameters, B, dense vs MoE, base vs instruct, tokens, context windows, chat templates, GGUF, and quant suffixes. Read it once and any HuggingFace model card stops being scary.
From models to LLMs
An LLM is one kind of ML model, trained on text, predicts the next token. That single trick at scale gets you ChatGPT, and also explains where it breaks.