ECE 586: Markov Decision Processes and Reinforcement Learning (Spring 2019)
Scribing Template
Use this .tex file for scribing. Output should look like this.
Additional Reading
Title | Author | File | Final Exam Inclusion |
Overview | Dimitrios Katselis | .pdf | included |
Stochastic Approximation | Dimitrios Katselis | .pdf | not included |
ODE Appendix | Dimitrios Katselis | .pdf | not included |
HMMs, POMDPs and Linear Quadratic Regulation | Dimitrios Katselis | .pdf | not included |
Lecture Notes
Note: Files with status "not checked" are course notes exactly as they were submitted by the scribing team. These files
will be gradually corrected (if necessary) by me and Joseph, after which the status will change to "checked". Any later
corrections to your scribed files will not affect the grade that you received for scribing.
Final Exam Material: The final exam covers the material up to and including the Law of Large Numbers. Some of the remaining files may be marked "checked"
only after the end of the semester.
Title | File | Scribing | Status |
Markov Chains 1 | lec1 | Zeyu Zhou, Lucas Buccafusca | checked |
Markov Chains 2 | lec2 | Andrew Chen, Zih-Siou Hung | checked |
Steepest and Gradient Descent 1 | lec3 | Duc Phan, Alireza Moradzadeh | checked |
Steepest and Gradient Descent 2 | lec4 | William Wei, Aditya Deshmukh | checked |
Gradient Projection and Stochastic Gradient Descent 1 | lec5 | Xingyu Bai, Tiancheng Zhao | checked |
Stochastic Gradient Descent 2 | lec6 | Kunhao Li, Zhikai Guo | checked |
Neural Networks | lec7 | Alireza Moradzadeh, Cathy Shih | not checked |
Multi-armed bandits | lec8 | Brando Miranda, Hanwen Hu | checked |
Discounted Cost MDPs, Value and Policy Iteration, Monotone Policies | lec9 | Tianhao Wu, Xiaoyang Bai, Junchi Yang, Shen Yan | checked |
Q-Learning, Function Approximation, Temporal Difference Learning | lec10 | Zeyu Zhou, Lucas Buccafusca, William Wei, Cathy Shih, Kunhao Li, Zhikai Guo, Duc Phan | checked |
Law of Large Numbers | lec11 | Andrew Chen, Zih-Siou Hung | not checked |
Average Cost MDPs | lec12 | Xingyu Bai, Tiancheng Zhao | not checked |
Average Cost Q-Learning, Linear Programming and Constrained MDPs | - | - | not scribed |
Policy Gradient Part I | lec14 | Xiaoyang Bai, Junchi Yang | not checked |
Useful Resources
The following links provide useful material (papers, books, implementations of algorithms) for Reinforcement Learning.
Awesome Reinforcement Learning
Spinning Up in Deep RL