SEMINAR

Faster saddle-point optimization for solving large-scale Markov decision processes

Speaker

Joan Bas Serrano

Working
Universitat Pompeu Fabra
Timeline
Fri, Feb 21 2020 - 10:00 am (GMT + 7)
About Speaker

Joan Bas Serrano is a PhD student and teaching assistant in the Artificial Intelligence and Machine Learning group at Universitat Pompeu Fabra (UPF). Under the supervision of Gergely Neu, he’s currently working on theoretical reinforcement learning. He is mainly interested in the development of reinforcement learning algorithms with a solid mathematical basis and strong theoretical guarantees. His current research is based on the linear programming formulation of MDPs and the saddle-point optimization theory.

He obtained his bachelor’s in Physics at Universitat Autònoma de Barcelona in 2016 and his M.Sc. at UPF in 2017, cursing the program “Master in Intelligent and Interactive Systems”. He did his master thesis titled “ A generative model of user activity in the Integrated Learning Design Environment“ under the advice of Vicenç Gomez and Davinia Hernandez-Leo.

Abstract

We consider the problem of computing optimal policies in average-reward Markov decision processes. This classical problem can be formulated as a linear program directly amenable to saddle-point optimization methods, albeit with a number of variables that is linear in the number of states. To address this issue, recent work has considered a linearly relaxed version of the resulting saddle-point problem. Our work aims at achieving a better understanding of this relaxed optimization problem by characterizing the conditions necessary for convergence to the optimal policy, and designing an optimization algorithm enjoying fast convergence rates that are independent of the size of the state space. Notably, our characterization points out some potential issues with previous work.

Related seminars

Tim Baldwin

MBZUAI, The University of Melbourne

Safe, open, locally-aligned language models
Mon, Dec 16 2024 - 02:00 pm (GMT + 7)

Alessio Del Bue

Italian Institute of Technology (IIT)

From Spatial AI to Embodied AI: The Path to Autonomous Systems
Mon, Dec 16 2024 - 10:00 am (GMT + 7)

Dr. Xiaoming Liu

Michigan State University

Person Recognition at a Distance
Mon, Dec 9 2024 - 10:00 am (GMT + 7)

Dr Lan Du

Monash University

Uncertainty Estimation for Multi-view/Multimodal Data
Fri, Dec 6 2024 - 10:00 am (GMT + 7)