
Reinforcement Learning

ML


Definition

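A branch of machine learning in which an agent takes actions in an environment, receives rewards, and learns a policy (a rule for choosing actions) that maximises long-run cumulative reward. It is the approach behind AlphaGo and a key part of how modern LLMs are tuned to be helpful, via reinforcement learning from human feedback (RLHF).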
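To make the agent / action / reward / policy loop concrete, here is a minimal sketch using tabular Q-learning, one of many RL algorithms and not the one used by AlphaGo or RLHF. The five-state corridor environment, the `step` function, and the values of `GAMMA`, `ALPHA`, and `EPSILON` are all made up for illustration.

```python
import random

N_STATES = 5        # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (-1, +1)  # action index 0 = move left, 1 = move right
GAMMA = 0.9         # discount factor: how much future reward counts
ALPHA = 0.5         # learning rate
EPSILON = 0.2       # probability of taking a random (exploratory) action


def step(state, action):
    """Toy environment: move along the corridor, reward 1 on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, seed=0):
    """Learn action values q[state][action] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < EPSILON:
                a = rng.randrange(2)  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: nudge the estimate toward
            # reward + discounted value of the best next action.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """The learned policy: best action index in each non-terminal state."""
    return [max(range(2), key=lambda i: q[s][i]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setup the learned policy should choose "right" (index 1) in every non-terminal state, since that is the shortest path to the only reward. Real systems replace the table with a neural network and the corridor with a far richer environment, but the loop of act, observe reward, and update is the same idea.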

Posts that use this term