Course Information

Course Name: CS6700 : Reinforcement Learning

Description: The Reinforcement Learning problem: evaluative feedback, non-associative learning, rewards and returns, Markov Decision Processes, value functions, optimality and approximation. Dynamic programming: value iteration, policy iteration, asynchronous DP, generalized policy iteration. Monte Carlo methods: policy evaluation, rollouts, on-policy and off-policy learning, importance sampling. Temporal Difference learning: TD prediction, optimality of TD(0), SARSA, Q-learning, R-learning, games and afterstates. Eligibility traces: n-step TD prediction, TD(lambda), forward and backward views, Q(lambda), SARSA(lambda), replacing traces and accumulating traces. Function approximation: value prediction, gradient descent methods, linear function approximation, ANN-based function approximation, lazy learning, instability issues. Policy gradient methods: non-associative learning, REINFORCE algorithm, exact gradient methods, estimating gradients, approximate policy gradient algorithms, actor-critic methods.

Slot: R

RoomNo: CS26

Instructor: Ravindran B

Period: JUL-NOV 2013

This page was created on: Thursday 19th of September 2013 09:41:21 PM