Wednesday, October 9th, 12:30pm in CS conference room
Abstract: Bandit optimisation problems are the most fundamental instances of sequential decision problems with an exploration-exploitation trade-off. They arise naturally in many contemporary applications in communication networks, e-commerce and recommendation systems. In this lecture, we present recent results on bandit optimisation problems with large strategy sets. For such problems, the number of possible strategies may not be negligible compared to the time horizon. However, the average rewards obtained using the various strategies exhibit some structure that may be exploited to speed up the learning process and, in turn, improve performance. We propose simple algorithms that optimally leverage this structural property, and discuss their applications in communication networks and recommendation systems.
Bio: Alexandre Proutiere is an associate professor of Electrical Engineering at KTH, the Royal Institute of Technology. He received an engineering degree from Ecole Nationale Superieure des Telecoms (Paris) and then, from 1998 to 2000, he worked in the radio communication department at the Ministry of Foreign Affairs in Paris. He received his PhD in Applied Mathematics from Ecole Polytechnique, Palaiseau, France in 2003 under the supervision of James Roberts. Following his PhD, he worked as a researcher at Microsoft Research in Cambridge (UK) before joining KTH as an associate professor. Proutiere's research focuses on the design and performance evaluation of computer networks, with a specific interest in resource allocation and control in wireless systems. His work has had significant impact both in the development of theoretical tools and in their practical application. This is highlighted by the "Best Paper" awards he has received from top publication venues such as ACM SIGMETRICS and ACM MobiHoc. Additionally, his work has been recognized by the ACM SIGMETRICS Rising Star award, given for outstanding contributions to computer/communication performance evaluation by a researcher not more than 7 years from their PhD.