Reinforcement Learning Short Course
reinforcement-learning
q-learning
ridesharing
policy-gradient
dynamic-programming
deep-q-network
markov-decision-processes
policy-iteration
value-iteration
monte-carlo-methods
temporal-differencing-learning
model-based-rl
policy-based-method
fitted-q-iteration
off-policy-evaluation
offline-rl
order-dispatch-recommendation
Updated Nov 4, 2024 - Jupyter Notebook