← Blog
RLHF
ML/dictionary/rlhf
Definition
The training stage where a pre-trained LLM is tuned with human preferences (people rank the models outputs, the model learns to produce the ones humans prefer). Turns a raw text predictor into a useful assistant.
Posts that use this term
- From models to LLMs
An LLM is one kind of ML model — trained on text, predicts the next token. That single trick at scale gets you ChatGPT, and also explains where it breaks.