Next-Token Prediction

NLP

Definition

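The core training objective of modern LLMs: given the sequence of tokens so far, predict the next token, in practice by assigning a probability to every candidate token. Run this in a loop, appending each predicted token to the input, and you get the text generation behind ChatGPT.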
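
To make the loop concrete, here is a minimal sketch in Python. It is a toy, not how a real LLM works internally: a bigram counter stands in for the neural network, it looks only at the last token instead of the whole context, and it always picks the single most frequent follower rather than sampling from a probability distribution. The names `train_bigram`, `predict_next`, and `generate`, and the tiny corpus, are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, context):
    """Return the most frequent follower of the last token in the context.

    Assumption: this toy model ignores everything except the last token.
    Returns None if the last token was never seen during training.
    """
    followers = counts.get(context[-1])
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, prompt, max_new_tokens):
    """Run next-token prediction in a loop, feeding each output back in."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        nxt = predict_next(counts, tokens)
        if nxt is None:
            break  # nothing learned about this token, so stop early
        tokens.append(nxt)
    return tokens


# Toy "training data", already split into tokens (here: whole words).
corpus = "the cat sat on the mat . the cat sat on the rug .".split()
model = train_bigram(corpus)

print(generate(model, ["the"], 4))
# → ['the', 'cat', 'sat', 'on', 'the']
```

The shape of `generate` is the part that carries over to real systems: predict one token, append it, and predict again with the longer sequence.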

Posts that use this term

  • The local-LLM vocabulary

    Parameters, B, dense vs MoE, base vs instruct, tokens, context window, chat template, GGUF, quantization suffixes. After this post you can read any HuggingFace model card.

  • From models to LLMs

    An LLM is one kind of ML model: it is trained on text and predicts the next token. That single trick at scale gets you ChatGPT, and also explains where it breaks.