CNI Seminar Series

When Privacy meets Partial Information: Privacy-Utility Trade-offs in Bandits

Prof. Debabrota Basu, Starting Faculty, Inria, France

#258

Abstract

Bandits are an archetypal model of sequential learning: one has limited information about the utilities of a set of decisions and can learn more about a decision's utility only by choosing it. The goal of a bandit algorithm is either (a) to maximise the total utility accumulated over a given number of interactions, or (b) to find the decision with maximal utility in a minimal number of interactions. As bandits are increasingly deployed in data-sensitive applications, such as adaptive clinical trials, hyper-parameter tuning, and recommender systems, it is imperative to ensure that these algorithms preserve data privacy. Motivated by this concern, we study the impact of preserving Differential Privacy in bandits with both goals (a) and (b). We answer three questions:
i. How should Differential Privacy be defined in bandits, where both the input and the output are generated progressively through past data-driven interactions?
ii. How does the fundamental hardness of bandit problems (both (a) and (b)) change if we ensure ε-Differential Privacy?
iii. How can existing bandit algorithms (both (a) and (b)) be modified to simultaneously ensure ε-Differential Privacy and achieve optimal performance?
Our study yields new information-theoretic quantities and a generic algorithm demonstrating that, in most cases, ε-Differential Privacy can be achieved almost for free in bandits. The talk is based on the works: https://arxiv.org/abs/2209.02570, https://arxiv.org/abs/2309.02202, and https://arxiv.org/abs/2505.05613.
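To make the setting concrete, the sketch below shows one standard way to combine a bandit index policy with ε-Differential Privacy: a UCB-style rule where each arm's empirical reward sum is perturbed with Laplace(1/ε) noise before the index is computed. This is an illustrative toy only and not the algorithm from the talk or the cited papers; the function name `private_ucb`, the deterministic rewards, and the per-round noising are simplifying assumptions (real private bandit algorithms manage the privacy budget over all rounds more carefully, e.g. via tree-based aggregation).

```python
import math
import random

def private_ucb(means, horizon, eps, seed=0):
    """Toy UCB with Laplace-noised reward sums (illustrative sketch only).

    means   : true mean reward of each arm (rewards are deterministic here
              for simplicity, so only the privacy noise is random)
    horizon : number of interactions T
    eps     : privacy parameter of the Laplace noise added to each sum
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # accumulated reward per arm
    for t in range(horizon):
        if t < k:
            arm = t       # initialisation: pull each arm once
        else:
            def noisy_index(a):
                # Difference of two Exp(eps) draws is Laplace(0, 1/eps)
                lap = rng.expovariate(eps) - rng.expovariate(eps)
                noisy_mean = (sums[a] + lap) / counts[a]
                bonus = math.sqrt(2 * math.log(t) / counts[a])
                return noisy_mean + bonus
            arm = max(range(k), key=noisy_index)
        sums[arm] += means[arm]   # deterministic reward for simplicity
        counts[arm] += 1
    return counts
```

With a reasonable gap between the arms, the noise shrinks relative to the gap as pull counts grow, so the better arm still dominates: `private_ucb([0.3, 0.7], 3000, eps=1.0)` pulls arm 1 far more often than arm 0, illustrating the "almost for free" phenomenon at a toy scale.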


Bio
Prof. Debabrota Basu, Starting Faculty, Inria, France

Debabrota Basu holds an Inria Starting Faculty Position in the Scool team (previously SequeL) of the Inria Centre at Université de Lille and CNRS, France. He completed his PhD in Computer Science at the National University of Singapore (NUS) and École Normale Supérieure (ENS), Paris. Before arriving in Lille, he was a postdoctoral researcher at Chalmers University of Technology, Sweden, and a visiting researcher at Harvard University. His research interest is in developing algorithms and analyses leading to theoretically grounded responsible AI systems. Specifically, he studies how to develop robust, private, fair, and explainable algorithms for online learning, bandits, and reinforcement learning problems. In 2022, ANR awarded him the Young Researcher (JCJC) grant for his work on responsible AI. In 2024, he was elected a scholar of the European Laboratory for Learning and Intelligent Systems (ELLIS). His work on collective meritocracy in college admissions received the Best Paper with Student Presenter Award at ACM EAAMO 2022. At IJCAI 2023, he presented a tutorial on "Auditing Bias of Machine Learning Algorithms: Tools and Overview".