Reinforcement Learning Fundamentals

Course Code: MSF-0057

Enrollment in this course is by invitation only

About this course

Reinforcement learning (RL) is an area of machine learning in which an agent learns to achieve a goal by interacting with its environment.

In this course, you will be introduced to the world of reinforcement learning. You will learn how to frame reinforcement learning problems and start tackling classic examples like news recommendation, learning to navigate in a grid-world, and balancing a cart-pole.

You will explore the basic algorithms, from multi-armed bandits and dynamic programming to temporal difference (TD) learning, and then progress to larger state spaces using function approximation, in particular deep learning. You will also learn about algorithms that search directly for the best policy: policy gradient and actor-critic methods. Along the way, you will be introduced to Project Malmo, a platform for artificial intelligence experimentation and research built on top of the Minecraft game.
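To give a flavor of the first topic, here is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit. It is illustrative only and is not taken from the course materials: the function name `run_epsilon_greedy`, the Bernoulli reward model, and all parameter values are assumptions chosen for the example.

```python
import random


def run_epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy on a Bernoulli bandit (hypothetical example setup).

    true_means: success probability of each arm, unknown to the agent.
    Returns (estimates, counts, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of times each arm was pulled
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # The environment returns a reward of 1 or 0.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("estimated values:", [round(e, 2) for e in estimates])
    print("pull counts:", counts)
```

The loop shows the basic pattern of reinforcement learning in miniature: the agent acts, the environment returns a reward, and the agent updates its value estimates. The `epsilon` parameter controls how often it explores instead of exploiting its current best guess.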

What you'll learn

  • Reinforcement Learning Problem
  • Markov Decision Process
  • Bandits
  • Dynamic Programming
  • Temporal Difference Learning
  • Approximate Solution Methods
  • Policy Gradient and Actor Critic
  • RL that Works

Meet the instructors

Jonathan Sanito

Senior Content Developer
Microsoft

Jonathan works as a content developer and project manager for Microsoft, focusing on Data and Analytics online training. He has worked on training for developer and IT pro audiences, covering topics from Microsoft Dynamics NAV to Windows Active Directory. Before coming to Microsoft, Jonathan worked as a consultant for a Microsoft partner, implementing Microsoft Dynamics NAV solutions.

Roland Fernandez

Senior Researcher and AI School Instructor, Deep Learning Technology Center
Microsoft Research AI

Roland works as a researcher and AI School instructor in the Deep Learning Technology Center of Microsoft Research AI. His interests include reinforcement learning, autonomous multitask learning, symbolic representation, AI education, information visualization, and HCI. Before coming to the DLTC, Roland worked in the VIBE group of MSR doing visualization and HCI projects, most notably the SandDance project. Before MSR, Roland worked (at Microsoft and other companies) in the areas of Natural User Interfaces, Activity Based Computing, Advanced Prototyping, Programmer Tools, Operating Systems, and Databases.

Adith Swaminathan

Researcher
Microsoft Research AI

Adith is a researcher at the Deep Learning Technology Center at Microsoft Research. He studies principles and algorithms that can improve human-centered systems using machine learning. Adith spent the 2015-16 academic year visiting the Information and Language Processing Systems group at the University of Amsterdam. He interned with the Machine Learning group at Microsoft Research NYC during the summer of 2015, the Computer Human Interactive Learning group (now called the Machine Teaching Group) at Microsoft Research Redmond during the summer of 2013, and Search Labs at Microsoft Research during the summer of 2012. He also worked as a strategist with Tower Research Capital for 14 months, from June 2010 to July 2011.

Kenneth Tran

Principal Research Engineer
Microsoft Research AI

Kenneth is a Principal Research Engineer at the Deep Learning Technology Center. He has wide-ranging interests in machine learning, from optimization algorithms to distributed systems. His main research focus is deep reinforcement learning, with an emphasis on off-policy learning and sample-efficient methods, safe exploration, reverse reinforcement learning, and real-world optimal control applications such as drone control, data center energy optimization, and indoor farming optimization.

Katja Hofmann

Researcher
Microsoft Research AI

Katja is a researcher in the Machine Intelligence and Perception group at Microsoft Research Cambridge. She is the research lead of Project Malmo, which uses the popular game Minecraft as an experimentation platform for developing intelligent technology. Her long-term goal is to develop AI systems that learn to collaborate with people, empower their users, and help solve complex real-world problems. Outside of Project Malmo, Katja works on online evaluation and interactive learning for information retrieval, which means understanding how machine learning and artificial intelligence can be applied to develop more intelligent search and recommendation systems.

Matthew Hausknecht

Researcher
Microsoft Research AI

Matthew is a researcher at Microsoft Research. His interests involve expanding the capabilities of intelligent agents. His main research is at the intersection of Reinforcement Learning and Deep Learning. Matthew received his PhD from the University of Texas at Austin under the supervision of Peter Stone.


LUNA TECHNOLOGIES INC. 2022. All rights reserved.