Videos

02456week6 1 1 reinforcement learning

02456week6 1 2 reinforcement learning approaches

02456week6 2 1 AlphaGo policy and value networks

02456week6 2 2 AlphaGo steps 1 to 4

02456week6 3 policy gradients

02456week6 4 a few last words

Keywords

State

Action

Agent

Policy

Reward

Policy Gradients (PGs) vs Q-Learning (Deep Q-Networks, DQN)

Monte Carlo Tree Search

Deep Neural Network

Markov Decision Process (MDP)

Policy Network
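
To show how several of the keywords above fit together, here is a minimal sketch of a policy-gradient (REINFORCE-style) agent on a tiny Markov decision process. Everything in it is an illustrative assumption rather than course material: the four-state corridor environment, the names `step`, `policy`, `run_episode` and `train`, and the hyperparameters. The policy is a plain softmax table rather than a deep policy network, to keep the example dependency-free.

```python
import math
import random

N_STATES = 4        # states 0..3; state 3 is the terminal goal (hypothetical toy MDP)
ACTIONS = (0, 1)    # 0 = move left, 1 = move right
GAMMA = 0.9         # discount factor (assumed value)


def step(state, action):
    """MDP dynamics: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def policy(theta, state):
    """Softmax policy: action probabilities from the preferences theta[state]."""
    m = max(theta[state])
    exps = [math.exp(p - m) for p in theta[state]]
    total = sum(exps)
    return [e / total for e in exps]


def run_episode(theta, rng, max_steps=50):
    """The agent follows the policy from state 0 and records (state, action, reward)."""
    state, trajectory = 0, []
    for _ in range(max_steps):
        action = rng.choices(ACTIONS, weights=policy(theta, state))[0]
        next_state, reward, done = step(state, action)
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    return trajectory


def train(episodes=3000, lr=0.1, seed=0):
    """Policy-gradient training: push up log-probabilities of actions, weighted by return."""
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        trajectory = run_episode(theta, rng)
        old = [row[:] for row in theta]   # evaluate gradients at the pre-update policy
        g = 0.0
        for state, action, reward in reversed(trajectory):
            g = reward + GAMMA * g        # discounted return from this step onward
            probs = policy(old, state)
            for a in ACTIONS:
                # Gradient of log softmax w.r.t. theta[state][a]
                grad = (1.0 if a == action else 0.0) - probs[a]
                theta[state][a] += lr * g * grad
    return theta


if __name__ == "__main__":
    learned = train()
    for s in range(N_STATES - 1):
        print(s, policy(learned, s))
```

In this sketch the state is the corridor position, the action is a left or right move, the reward is 1.0 on reaching the goal, and the agent is the loop in `run_episode` that samples from the policy. A Q-learning approach would instead learn action values and act greedily on them, rather than adjusting action probabilities directly.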