区域: / modulnr。:部门数学 / CIT413036课程结构:讲座:2H练习:2H内容:课程概述了增强学习的数学基础,包括对马克夫决策过程的介绍和表图形的增强性增强学习方法(Monte Carlo,Monte Carlo,时间差异,SARSA,SARSA,SARSA,Q-LEAL,Q-LEARNINGNING,...)。这些主题是通过对随机近似理论的影响来补充的,以对算法进行收敛分析。Prerequisite: MA0001 Analysis 1, MA0002 Analysis 2, MA0004 Linear Algebra 1, MA0009 Introduction to Probability Theory and Statistics, MA2409 Probability Theory Literature : Sutton, Barto (2018): Reinforcement Learning: An Introduction, MIT Press Puterman (1994): Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley Kushner, Yin (2010): Stochastic近似和递归算法和应用,施普林格证书:请参阅Tumonline位置/讲座/练习:请参阅Tumonline
主要关键词