2MMS50 – Stochastic Decision Theory

This page was last updated for the Academic Year 2020-2021.

Aims / Summary

The goal of the course is to familiarize students with the mathematical concepts and computational techniques for stochastic decision and optimization problems, and to illustrate the application of these methods in various scenarios. The methodological framework of Markov decision processes and stochastic dynamic programming models will play a central role. Students are expected to gain knowledge of the main problem formulations and to be able to apply the main computational approaches in that domain to stylized problems.
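As a small illustration of the kind of computational approach meant here (not taken from the course material), the sketch below runs value iteration on a toy two-state, two-action Markov decision process. The transition probabilities, rewards, and discount factor are made up for the example.

```python
import numpy as np

# Hypothetical toy MDP: P[s, a, s'] = transition probability, R[s, a] = reward.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # transitions from state 0 under actions 0, 1
    [[0.5, 0.5], [0.1, 0.9]],   # transitions from state 1 under actions 0, 1
])
R = np.array([
    [1.0, 0.0],                 # rewards in state 0 for actions 0, 1
    [0.0, 2.0],                 # rewards in state 1 for actions 0, 1
])
gamma = 0.9                     # discount factor (assumed)

# Value iteration: repeatedly apply the Bellman optimality operator
# until the value function stops changing.
V = np.zeros(2)
for _ in range(10_000):
    Q = R + gamma * (P @ V)     # Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] V[s']
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=1)       # greedy policy with respect to the converged values
print("V* =", V, "policy =", policy)
```

The `P @ V` contraction relies on NumPy's stacked-matrix broadcasting: for each state–action pair it computes the expected next-state value in one call.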

You will learn about state-of-the-art approaches in:

  1. Markov Decision Theory
  2. Multi-Armed Bandits
  3. Reinforcement Learning
  4. Stochastic / Distributed / Robust Optimization
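To give a flavor of the second topic, here is a minimal sketch of an epsilon-greedy strategy for a multi-armed bandit with Bernoulli rewards. It is an illustrative example, not part of the course material; the arm means, horizon, and exploration rate are all hypothetical.

```python
import random

def epsilon_greedy_bandit(true_means, n_rounds=10_000, eps=0.1, seed=0):
    """Play a Bernoulli multi-armed bandit with the epsilon-greedy rule:
    with probability eps pull a uniformly random arm (explore), otherwise
    pull the arm with the highest empirical mean reward (exploit)."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k        # number of pulls per arm
    means = [0.0] * k       # running empirical mean reward per arm
    total = 0.0
    for _ in range(n_rounds):
        if rng.random() < eps:
            arm = rng.randrange(k)                       # explore
        else:
            arm = max(range(k), key=lambda a: means[a])  # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental average
        total += reward
    return total / n_rounds, counts

# Hypothetical three-armed instance; arm 2 has the highest mean reward.
avg, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8])
```

After enough rounds the best arm dominates the pull counts, which is exactly the exploration–exploitation trade-off the bandit literature (Thompson, Robbins, Gittins) studies.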

Lecturers / Contact

This course will be taught by prof. dr. Bert Zwart (TU/e, CWI) and dr. Jaron Sanders (TU/e). Mike van Santvoort is our instructor.

If you have a question, then you can contact us in the following ways:

  1. Ask questions during the live video lectures on Microsoft Teams.
  2. Post your question in the Discussions forum. This will benefit other students too.
  3. Send us an e-mail if it’s low priority.

Course material

We will be using the following material throughout the course:

If you want additional background, we recommend these books:

We also encourage you to examine these seminal papers, whose contributions will come up during the course:

  • William R. Thompson, On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, Biometrika 25(3-4):285-294, 1933
  • Herbert Robbins, Some aspects of the sequential design of experiments, Bulletin of the American Mathematical Society 58(5):527-535, 1952
  • John C. Gittins, Bandit Processes and Dynamic Allocation Indices, Journal of the Royal Statistical Society, Series B 41(2), 1979

Exam / Grading

To pass this course, you must hand in four homework assignments, working in pairs. Each assignment covers one of the topics of the course; we will hand one out after every third lecture, and you will then have two weeks to complete it.

Each of the four assignments will be graded and counts for 25% of your final grade.

Online lectures

All lectures and instructions will be taught digitally, using Microsoft Teams. We have created a team for 2MMS50. We will organize live sessions in the General channel. 

We will predominantly use PowerPoint slides and may at times use Microsoft Whiteboard. Be on time, be digitally present, and participate. We strongly encourage you to enable your camera and to have a working microphone for interactivity.

Course overview

Here is a week-by-week breakdown of the course:

| Week | Topic | Professor | Activity | Date |
|------|-------|-----------|----------|------|
| 16 | Markov Decision Theory | Bert Zwart | two lectures | April 19, 22 |
| 17 | Markov Decision Theory | Bert Zwart | one lecture | April 29 |
| 18 | Markov Decision Theory (new assignment: MDT, May 3rd) | Bert Zwart | one instruction | May 3 |
| 18 | Multi-Armed Bandits | Jaron Sanders | mixed lecture, instruction | May 6 |
| 19 | Multi-Armed Bandits | Jaron Sanders | mixed lecture, instruction | May 10 |
| 20 | Multi-Armed Bandits (deadline: MDT, May 19th; new assignment: MABs, May 20th) | Jaron Sanders | mixed lectures, instructions | May 17, 20 |
| 21 | Reinforcement Learning | Jaron Sanders | mixed lectures, instructions | May 24, 27 |
| 22 | Reinforcement Learning (deadline: MABs, June 2nd; new assignment: RL, June 3rd) | Jaron Sanders | mixed lectures, instructions | May 31, June 3 |
| 23 | Stochastic / Distributed / Robust Optimization | Bert Zwart | two lectures | June 7, 10 |
| 24 | Stochastic / Distributed / Robust Optimization (deadline: RL, June 16th; new assignment: SDRO, June 17th) | Bert Zwart | one lecture, one instruction | June 14, 17 |
| 25 | Deadline: SDRO, June 30th | | | |

Presentations’ slides

Here are my presentation slides from academic year 2019-2020:

Multi-Armed Bandits


Reinforcement Learning
