Exploration and bandits 2025

Bellman equations and exact planning 2025

Monte Carlo methods and TD learning 2025

Model-Free Control with tabular and linear methods 2025

Eligibility traces 2025

Deep Q-Learning 2025

Multi-armed bandit problem

Policy and value iteration/Markov

Monte Carlo Methods

Temporal Difference learning

Model-free control, on-policy prediction/control

Project 3 notes

Eligibility traces and value function approximation

Q-learning and deep Q-learning