Richard S. Sutton and Andrew G. Barto
Summary: Reinforcement Learning: An Introduction (2nd Edition, 2018)
Richard S. Sutton and Andrew G. Barto's foundational text provides a comprehensive introduction to reinforcement learning (RL), where agents learn optimal behavior through trial-and-error interaction with their environment to maximize cumulative rewards.
Core Framework
- Agent-Environment Interaction: A repeating loop in which the agent observes a state, selects an action, and receives a reward and the next state
- Markov Decision Processes (MDPs): Mathematical framework with states, actions, transition probabilities, and reward functions
- Exploration vs. Exploitation: Fundamental challenge of trying new actions versus using known good ones
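The exploration-exploitation trade-off can be sketched with an epsilon-greedy agent on a small multi-armed bandit. This is a minimal illustration, not code from the book: the function name `run_bandit`, the arm means, the noise level, and the value of epsilon are all assumptions chosen for the example.

```python
import random

def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy action selection with sample-average value estimates."""
    rng = random.Random(seed)
    k = len(true_means)
    q = [0.0] * k   # estimated value of each action
    n = [0] * k     # number of times each action was taken
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(k)                    # explore: random action
        else:
            a = max(range(k), key=lambda i: q[i])   # exploit: current best estimate
        # Hypothetical environment: Gaussian reward around the arm's true mean.
        reward = rng.gauss(true_means[a], 0.1)
        n[a] += 1
        q[a] += (reward - q[a]) / n[a]              # incremental sample average
    return q, n

q, n = run_bandit([0.1, 0.5, 0.9])
best = max(range(len(q)), key=lambda i: q[i])
print("estimates:", [round(v, 2) for v in q], "pulls:", n, "best arm:", best)
```

With epsilon set to 0 the agent would commit to whichever arm first looks good. The small exploration rate is what lets it discover the higher-paying arm.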
Key Solution Methods
- Dynamic Programming: Computes optimal policies when a complete model of the environment is known (policy iteration, value iteration)
- Monte Carlo Methods: Model-free learning from complete episodes by averaging returns
- Temporal-Difference Learning: Online learning that bootstraps from current estimates (SARSA, Q-learning)
- Eligibility Traces: Unify TD and MC methods and speed up credit assignment
- Policy Gradient Methods: Direct optimization of parameterized policies (REINFORCE, actor-critic)
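To make the contrast between model-based and model-free methods concrete, the sketch below runs value iteration and tabular Q-learning on the same toy problem. The five-state chain, its reward of +1 for reaching the right end, and all hyperparameters are assumptions made up for this illustration. Value iteration queries the model (`step`) for every state and action, while Q-learning only sees sampled transitions.

```python
import random

N_STATES = 5              # states 0..4; state 4 is terminal
TERMINAL = N_STATES - 1
ACTIONS = (0, 1)          # 0 = left, 1 = right
GAMMA = 0.9

def step(s, a):
    """Deterministic toy model: returns (next_state, reward)."""
    s2 = max(0, s - 1) if a == 0 else s + 1
    return s2, (1.0 if s2 == TERMINAL else 0.0)

def value_iteration(tol=1e-10):
    """Dynamic programming: sweeps all states using the known model."""
    v = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(TERMINAL):
            outcomes = [step(s, a) for a in ACTIONS]
            best = max(r + GAMMA * v[s2] for s2, r in outcomes)
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < tol:
            return v

def q_learning(episodes=500, alpha=0.5, epsilon=0.2, seed=0):
    """Model-free TD control: learns from sampled transitions only."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        while s != TERMINAL:
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                best = max(q[s])  # greedy, breaking ties at random
                a = rng.choice([b for b in ACTIONS if q[s][b] == best])
            s2, r = step(s, a)
            # Bootstrapped target: reward plus discounted best next-state estimate.
            q[s][a] += alpha * (r + GAMMA * max(q[s2]) - q[s][a])
            s = s2
    return q

v_star = value_iteration()
q_hat = q_learning()
for s in range(TERMINAL):
    print(s, round(v_star[s], 3), round(max(q_hat[s]), 3))
```

On this deterministic chain the two columns should agree closely, since the greedy value under the learned Q-table approaches the optimal state value computed by dynamic programming.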
Advanced Topics
- Function Approximation: Using parameterized approximators (linear methods, tile coding, neural networks) to generalize across large or continuous state spaces
- Deep Reinforcement Learning: Combining RL with deep learning for complex tasks like game-playing and robotics
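Function approximation can be illustrated without a neural network. The sketch below uses semi-gradient TD(0) with a two-weight linear approximator to evaluate a random policy on a five-state random walk. The feature map, step size, and episode count are assumptions chosen for the example, and a linear model is deliberately swapped in for a deep network to keep the code short.

```python
import random

N = 5  # non-terminal states 1..5; states 0 and N + 1 are terminal

def features(s):
    """Hypothetical feature vector: a bias term and the scaled state index."""
    return (1.0, s / (N + 1))

def value(w, s):
    """Linear value estimate; terminal states are worth zero by definition."""
    if s == 0 or s == N + 1:
        return 0.0
    x = features(s)
    return w[0] * x[0] + w[1] * x[1]

def semi_gradient_td0(episodes=5000, alpha=0.01, gamma=1.0, seed=0):
    rng = random.Random(seed)
    w = [0.0, 0.0]
    for _ in range(episodes):
        s = (N + 1) // 2                       # start in the middle state
        while 0 < s < N + 1:
            s2 = s + rng.choice((-1, 1))       # random policy: left or right
            r = 1.0 if s2 == N + 1 else 0.0    # reward only at the right exit
            delta = r + gamma * value(w, s2) - value(w, s)   # TD error
            x = features(s)
            # Semi-gradient step: the gradient of a linear value is the feature vector.
            w[0] += alpha * delta * x[0]
            w[1] += alpha * delta * x[1]
            s = s2
    return w

w = semi_gradient_td0()
print([round(value(w, s), 2) for s in range(1, N + 1)])
```

Two weights stand in for five table entries here. The same update applies unchanged when the state space is too large to enumerate, which is the setting where deep networks replace the linear model.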
The book argues that well-designed reward signals can drive the emergence of intelligent behavior, which makes RL a general framework for sequential decision-making problems in artificial intelligence and beyond.