Reinforcement Learning: An Introduction

Richard S. Sutton and Andrew G. Barto

Summary: Reinforcement Learning: An Introduction (2nd Edition, 2018)

Richard S. Sutton and Andrew G. Barto's foundational text provides a comprehensive introduction to reinforcement learning (RL), in which an agent learns how to behave through trial-and-error interaction with its environment so as to maximize cumulative reward.

Core Framework

  • Agent-Environment Interaction: A repeated loop in which the agent observes a state, takes an action, and receives a reward and the next state
  • Markov Decision Processes (MDPs): Mathematical framework defined by states, actions, transition probabilities, and a reward function
  • Exploration vs. Exploitation: Fundamental trade-off between trying new actions to learn about them and choosing actions already known to pay off
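
The exploration-exploitation trade-off can be illustrated with a minimal sketch of epsilon-greedy action selection on a multi-armed bandit. Everything named here is an assumption made for illustration: the function run_bandit, the arm means, the noise level, and the parameter defaults do not come from the summary. With probability epsilon the agent explores a random arm; otherwise it exploits the arm with the highest current estimate, and each estimate is kept as an incremental sample average.

```python
import random


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy on a k-armed bandit with sample-average estimates.

    true_means, steps, epsilon, and seed are illustrative choices.
    """
    rng = random.Random(seed)
    k = len(true_means)
    estimates = [0.0] * k  # current estimate of each arm's mean reward
    counts = [0] * k       # number of times each arm was pulled

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            action = rng.randrange(k)
        else:
            # Exploit: pick the arm with the highest estimate so far.
            action = max(range(k), key=lambda a: estimates[a])

        # Assumed reward model: Gaussian noise around the arm's true mean.
        reward = rng.gauss(true_means[action], 0.1)

        # Incremental sample-average update.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_bandit([0.1, 0.5, 0.9])
    print("estimates:", estimates)
    print("pull counts:", counts)
```

Setting epsilon to 0 in this sketch makes the agent purely greedy, so it can lock onto the first arm that returns a positive reward and never discover a better one.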

Key Solution Methods

  • Dynamic Programming: Computes optimal policies when a complete model of the environment is known (policy iteration, value iteration)
  • Monte Carlo Methods: Model-free learning from complete episodes by averaging sampled returns
  • Temporal-Difference Learning: Online, model-free learning that bootstraps from current value estimates (SARSA, Q-learning)
  • Eligibility Traces: Unify TD and Monte Carlo methods and speed up credit assignment
  • Policy Gradient Methods: Direct optimization of parameterized policies (REINFORCE, actor-critic)
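
As one concrete instance of temporal-difference learning, here is a minimal sketch of tabular Q-learning. The environment is an assumption made up for illustration: a five-cell corridor where the agent starts in cell 0 and receives a reward of 1 on reaching the terminal cell 4. The names step and q_learning, and all parameter values, are likewise hypothetical. The update moves Q(s, a) toward the bootstrapped target r + gamma * max over a' of Q(s', a').

```python
import random

N_STATES = 5      # corridor cells 0..4; cell 4 is terminal (assumed toy task)
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic corridor dynamics: reward 1.0 only on reaching the goal."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]

    for _episode in range(episodes):
        state = 0
        for _t in range(1000):  # safety cap on episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # Exploit, breaking ties at random so the untrained
                # agent does not always walk left.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])

            next_state, reward, done = step(state, action)

            # Off-policy TD target: bootstrap from the best next action,
            # with no bootstrapping from the terminal state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])

            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    for s, (left, right) in enumerate(q_learning()):
        print(f"state {s}: Q(left)={left:.3f}  Q(right)={right:.3f}")
```

Replacing max(q[next_state]) in the target with the value of the action actually taken next would turn this sketch into SARSA, the on-policy counterpart.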

Advanced Topics

  • Function Approximation: Using parameterized approximators (linear methods, tile coding, neural networks) to generalize across large or continuous state spaces
  • Deep Reinforcement Learning: Combining RL with deep learning for complex tasks like game-playing and robotics
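
Function approximation replaces the value table with a parameterized function. The simplest case is a linear one, sketched below with semi-gradient TD(0) policy evaluation. This is an illustration under stated assumptions, not the book's own example: the five-cell corridor task, the fixed "always move right" policy, the two hand-picked features, and the names features, value, and semi_gradient_td0 are all hypothetical. A neural network would take the place of the linear value function in a deep RL setting.

```python
GAMMA = 0.9
N_STATES = 5  # corridor cells 0..4; cell 4 is terminal (assumed toy task)


def features(state):
    """Two hand-picked features: a bias term and the normalized position."""
    return [1.0, state / (N_STATES - 2)]


def value(weights, state):
    """Linear value estimate; the terminal state has value zero."""
    if state == N_STATES - 1:
        return 0.0
    return sum(w * x for w, x in zip(weights, features(state)))


def semi_gradient_td0(episodes=2000, alpha=0.05):
    """Evaluate the fixed 'always move right' policy with linear TD(0)."""
    weights = [0.0, 0.0]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            next_state = state + 1
            reward = 1.0 if next_state == N_STATES - 1 else 0.0

            # TD error uses the current estimate of the next state's value.
            td_error = (reward + GAMMA * value(weights, next_state)
                        - value(weights, state))

            # Semi-gradient step: for a linear approximator the gradient
            # of the estimate with respect to the weights is the feature vector.
            x = features(state)
            weights = [w + alpha * td_error * xi for w, xi in zip(weights, x)]

            state = next_state
    return weights


if __name__ == "__main__":
    w = semi_gradient_td0()
    for s in range(N_STATES - 1):
        print(f"state {s}: estimated value {value(w, s):.3f}")
```

Because two weights must cover four non-terminal states, the estimates are only an approximation of the true values; that loss of exactness is the price paid for generalizing across states.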

The book argues that well-designed reward signals can drive the emergence of intelligent behavior, which makes RL a general framework for sequential decision-making problems in artificial intelligence and beyond.
