Monte Carlo methods solve the reinforcement learning problem by averaging sample returns.

Generalized policy iteration (GPI)

Monte Carlo prediction

First-visit MC method
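
A minimal sketch of first-visit MC prediction, not a definitive implementation. It assumes each episode is a list of (state, reward) pairs, where the reward is the one received after leaving that state. The function name `first_visit_mc` and the toy episodes are made up for illustration.

```python
from collections import defaultdict


def first_visit_mc(episodes, gamma=1.0):
    """Estimate V(s) as the average return following the first visit to s."""
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for episode in episodes:
        # Time step at which each state first appears in this episode.
        first_index = {}
        for t, (state, _) in enumerate(episode):
            first_index.setdefault(state, t)
        g = 0.0
        # Walk backwards so g is the discounted return from step t onwards.
        for t in range(len(episode) - 1, -1, -1):
            state, reward = episode[t]
            g = gamma * g + reward
            # Only the first visit to a state contributes a sample.
            if first_index[state] == t:
                returns_sum[state] += g
                returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


# Hypothetical toy data: two short episodes over states "A" and "B".
episodes = [
    [("A", 1.0), ("B", 0.0), ("A", 2.0)],
    [("B", 1.0), ("A", 0.0)],
]
values = first_visit_mc(episodes)
```

Each state contributes at most one return per episode, so the samples being averaged come from different episodes.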

Every-visit MC method
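
A minimal sketch of every-visit MC prediction under the same assumed episode format: a list of (state, reward) pairs, with the reward received after leaving the state. The name `every_visit_mc` and the toy episodes are illustrative only.

```python
from collections import defaultdict


def every_visit_mc(episodes, gamma=1.0):
    """Estimate V(s) as the average return following every visit to s."""
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for episode in episodes:
        g = 0.0
        # Walk backwards so g is the discounted return from the current step.
        for state, reward in reversed(episode):
            g = gamma * g + reward
            # Every occurrence of the state contributes a sample.
            returns_sum[state] += g
            returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


# Hypothetical toy data: two short episodes over states "A" and "B".
episodes = [
    [("A", 1.0), ("B", 0.0), ("A", 2.0)],
    [("B", 1.0), ("A", 0.0)],
]
values = every_visit_mc(episodes)
```

Here state "A" is visited twice in the first episode, so it contributes two returns from that episode instead of one.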

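The discount factor gamma is fixed, but the weight gamma^k on a reward k steps ahead shrinks as k grows, so later rewards count for less in the return.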
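
A small sketch of how discounting weights rewards, assuming a constant discount factor and a finite list of rewards. The helper `discounted_return` and the numbers are made up for illustration.

```python
def discounted_return(rewards, gamma):
    """Return r_0 + gamma * r_1 + gamma^2 * r_2 + ... for a finite reward list."""
    g = 0.0
    # Accumulate from the last reward backwards: G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        g = reward + gamma * g
    return g


gamma = 0.5
rewards = [1.0, 1.0, 1.0, 1.0]

# Weight applied to the reward k steps ahead; it halves at every step here.
weights = [gamma ** k for k in range(len(rewards))]
total = discounted_return(rewards, gamma)
```

With gamma = 0.5 the weights are 1, 0.5, 0.25, 0.125: gamma stays the same while its powers decay.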

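Every-visit MC is biased: returns from repeated visits within one episode are not independent. The bias shrinks to zero as the number of episodes grows, whereas first-visit MC is unbiased.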