TD(λ) Points from summary

$\lambda$ refers to the use of an eligibility trace; its value, between 0 and 1, sets how quickly the trace decays.

Eligibility traces provide a way of choosing between MC and TD methods, including everything in between.

Use eligibility traces if a task is partially non-Markov but you still want to use TD for its advantages.

By adjusting $\lambda$, a method using eligibility traces can be placed anywhere along a continuum from MC ($\lambda = 1$) to one-step TD ($\lambda = 0$).
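One way to see this continuum is through the $\lambda$-return, which averages the $n$-step returns with weights $(1-\lambda)\lambda^{n-1}$ and puts the remaining weight on the full MC return. The sketch below is only an illustration under assumed conventions that are not from these notes: `rewards[k]` is the reward for leaving state `k`, `values[k]` is the current estimate for state `k`, the terminal state has value 0, and the function names are made up for this example.

```python
def n_step_return(rewards, values, t, n, gamma):
    """n-step return from time t: up to n rewards, then bootstrap from values."""
    T = len(rewards)                      # number of transitions in the episode
    end = min(t + n, T)
    g = sum(gamma ** (k - t) * rewards[k] for k in range(t, end))
    if end < T:                           # episode not over yet: bootstrap
        g += gamma ** n * values[end]
    return g


def lambda_return(rewards, values, t, gamma, lam):
    """Lambda-return from time t for an episodic task (assumed conventions)."""
    T = len(rewards)
    g = 0.0
    # n-step returns that still bootstrap, weighted by (1 - lam) * lam^(n-1)
    for n in range(1, T - t):
        g += (1 - lam) * lam ** (n - 1) * n_step_return(rewards, values, t, n, gamma)
    # all remaining weight goes to the full Monte Carlo return
    g += lam ** (T - t - 1) * n_step_return(rewards, values, t, T - t, gamma)
    return g


rewards = [0.0, 0.0, 1.0]
values = [0.5, 0.5, 0.5]
one_step = lambda_return(rewards, values, t=0, gamma=0.9, lam=0.0)
monte_carlo = lambda_return(rewards, values, t=0, gamma=0.9, lam=1.0)
in_between = lambda_return(rewards, values, t=0, gamma=0.9, lam=0.5)
```

With $\lambda = 0$ the result is the one-step TD target, with $\lambda = 1$ it is the MC return, and intermediate values blend the two.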

Methods using eligibility traces require more computation than one-step methods, but in return they offer faster learning, particularly when rewards are delayed by many steps.

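The extra computation makes sense when data is scarce and cannot be processed repeatedly, as is often the case in online applications.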

The eligibility trace is a fairly simple mechanism: it keeps a decaying record of which states the learning method has visited recently, so that each new TD error can be credited back to them.
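A minimal sketch of this mechanism, assuming the tabular case with accumulating traces: every visit bumps the state's trace by 1, and all traces decay by $\gamma\lambda$ after each step. The function name, the `(state, reward, next_state, done)` transition format, and the toy three-state episode are assumptions made for illustration, not something from these notes.

```python
def td_lambda_episode(V, transitions, alpha, gamma, lam):
    """One episode of tabular TD(lambda) with accumulating traces; updates V in place."""
    e = {s: 0.0 for s in V}               # one trace per state, reset each episode
    for state, reward, next_state, done in transitions:
        # one-step TD error; the terminal state is taken to have value 0
        target = reward + (0.0 if done else gamma * V[next_state])
        delta = target - V[state]
        e[state] += 1.0                   # the current state becomes more eligible
        for s in V:
            V[s] += alpha * delta * e[s]  # credit states in proportion to their trace
            e[s] *= gamma * lam           # traces fade as time passes
    return V


# Hypothetical episode A -> B -> C -> terminal, with reward 1 on the last step.
episode = [("A", 0.0, "B", False), ("B", 0.0, "C", False), ("C", 1.0, None, True)]
V = td_lambda_episode({"A": 0.0, "B": 0.0, "C": 0.0}, episode,
                      alpha=0.5, gamma=1.0, lam=0.5)
```

In this toy run the final reward updates `C` the most and `A` the least, because `A` was visited longest ago and its trace has decayed furthest. With `lam=0.0` only `C` would change, which is one-step TD.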

Backups
