← Blog

RLHF

ML

/dictionary/rlhf

Definition

The training stage where a pre-trained LLM is tuned with human preferences (people rank the models outputs, the model learns to produce the ones humans prefer). Turns a raw text predictor into a useful assistant.

Posts that use this term

  • From models to LLMs

    An LLM is one kind of ML model — trained on text, predicts the next token. That single trick at scale gets you ChatGPT, and also explains where it breaks.