SEMINAR

Adversarially robust stochastic multi-armed bandits

Speaker

Julian Zimmert

Working
University of Copenhagen
Timeline
Thu, Jan 16 2020 - 10:00 am (GMT + 7)
About Speaker

Julian Zimmert received his Master's degree in Mathematics from the Humboldt University of Berlin and is now a final-year PhD student at the University of Copenhagen, working under the supervision of Yevgeny Seldin. His main area of research is robust algorithms for ranges of environments, in particular algorithms for multi-armed bandits in adversarial and stochastic settings. Recently, he did an internship at DeepMind in the Foundations group of Csaba Szepesvari, working on a connection between mirror descent and the information-theoretic analysis of Thompson sampling.

Abstract

Multi-armed bandits are a popular framework for optimal experimental design, with applications in digital advertising and website optimisation. Traditionally, the bandit literature distinguishes between two forms of environments. The stochastic setting assumes that the data is generated by an i.i.d. process, which allows specialised algorithms to learn quickly. At the other extreme, the adversarial setting assumes only boundedness. This makes learning extremely robust, but comes at the cost of significantly slower convergence to the optimal solution. Real-world applications typically lie somewhere in between. While it might be reasonable to assume the data is close to i.i.d., the distribution might be influenced by hidden confounders or undergo unforeseen changes. In practice, this means that stochastic bandit algorithms might fail even to approach a good solution. This poses a serious dilemma for practitioners: should one prioritise fast or robust learning? But why not both? This talk presents a recent breakthrough in practical all-purpose algorithms.
