Average Reward Markov Decision Process

# 199


In this talk, we study the infinite-horizon average-reward Markov Decision Process (MDP). Unlike prior work in this area, we establish regret guarantees for policy gradient-based algorithms under a general policy parameterization. We explain the key insights behind our gradient estimation approach, which yields a regret bound of O(T^0.75). We then propose an efficient momentum-based approach that achieves a regret bound of O(T^0.5). Finally, we propose a technique to alleviate the dependence on mixing time for this problem.

Vaneet Aggarwal, Purdue University.

Vaneet Aggarwal received the B.Tech. degree in 2005 from the Indian Institute of Technology, Kanpur, India, and the M.A. and Ph.D. degrees in 2007 and 2010, respectively, from Princeton University, Princeton, NJ, USA, all in Electrical Engineering. He is currently a University Faculty Scholar and Professor in the School of Industrial Engineering, the School of Electrical and Computer Engineering, and the Department of Computer Science (by courtesy) at Purdue University, where he has been since Jan 2015. Prior to this, he worked as a researcher at AT&T Labs-Research, Florham Park, NJ for four and a half years. He was Adjunct Assistant Professor at Columbia University (EE, 2013-2014), VAJRA Adjunct Professor at IISc Bangalore (ECE, 2018-2019), Adjunct Professor at IIIT Delhi (CS, 2022-2023), and Visiting Professor at KAUST, Saudi Arabia (CS, 2022-2023). His research interests are in Reinforcement Learning; Generative AI; Quantum Machine Learning; and Applications of ML in Networking, Transportation, Robotics, Manufacturing, Healthcare, and Biomedical. Dr. Aggarwal was the recipient of Princeton University's Porter Ogden Jacobus Honorific Fellowship in 2009 and Purdue University’s Most Impactful Faculty Innovator Award in 2020. He received the 2017 Jack Neubauer Memorial Award recognizing the Best Systems Paper published in the IEEE Transactions on Vehicular Technology and the 2024 IEEE William R. Bennett Prize recognizing the Best Paper published in the IEEE/ACM Transactions on Networking. Further, he received the 2018 IEEE INFOCOM Workshop Best Paper Award and the 2021 NeurIPS Workshop Best Paper Award. He was on the Editorial Board of the IEEE Transactions on Green Communications and Networking and the IEEE Transactions on Communications. He is currently serving on the Editorial Board of the IEEE/ACM Transactions on Networking and is co-Editor-in-Chief of the ACM Journal on Transportation Systems. He is PC co-chair for GAMENETS 2024. He is also a 2024-2025 IEEE ComSoc Distinguished Lecturer.