Multi-Objective Reinforcement Learning with Concave Utilities

Most engineering applications have multiple design objectives. In this talk, we will consider the problem of building a Reinforcement Learning (RL) framework for jointly optimizing multiple objectives, which can be used in multiple scheduling applications. An example is maximization of fairness among multiple agents, which requires balancing the cumulative rewards received by individual agents, with an optimization objective that is often non-linear across the agents. With such objective functions, Bellman Optimality no longer holds. Thus, existing RL algorithms aiming at optimizing the (discounted) cumulative reward of all agents fail to address this issue. We formalize the problem of optimizing a non-linear function of multiple long-term average rewards, to explicitly ensure multi-objective optimization in RL algorithms. We then propose model-based and model-free algorithms to learn the optimal policy and discuss regret guarantees. Further, we will discuss the implementation of our algorithms on scheduling problems and demonstrate that the proposed RL framework can enable multi-objective optimization in these applications with significant improvement as compared to standard RL algorithms. Finally, we will discuss the impact of constraints in multi-objective reinforcement learning.
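To make the fairness example concrete, here is a minimal Python sketch of a toy scheduler. It is not one of the algorithms presented in the talk. It assumes a stateless setting with two agents, hypothetical per-step rewards in `RATES`, and a policy that serves agent 0 with probability `p`. It compares the linear sum-of-rewards objective with a concave proportional-fairness (sum of logarithms) utility applied to the long-term average rewards.

```python
import math

# Hypothetical per-step reward each agent earns when it is the one served.
RATES = (2.0, 1.0)


def average_rewards(p):
    """Long-term average reward per agent when agent 0 is served with probability p."""
    return (p * RATES[0], (1.0 - p) * RATES[1])


def sum_utility(avg):
    """Linear objective: total average reward across agents."""
    return sum(avg)


def log_utility(avg):
    """Concave proportional-fairness objective; -inf if any agent is starved."""
    return sum(math.log(x) if x > 0 else float("-inf") for x in avg)


def best_p(utility, grid_size=101):
    """Grid search over the serving probability p in [0, 1]."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda p: utility(average_rewards(p)))


p_sum = best_p(sum_utility)    # linear objective: always serve the higher-rate agent
p_fair = best_p(log_utility)   # concave objective: split service between the agents

print("sum-of-rewards optimum:  p =", p_sum, "->", average_rewards(p_sum))
print("log-utility optimum:     p =", p_fair, "->", average_rewards(p_fair))
```

In this toy, the linear objective is maximized by a deterministic policy that starves one agent, while the concave utility is maximized only by a randomized policy. This illustrates why a utility applied to the average rewards cannot simply be folded into a per-step reward.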

Vaneet Aggarwal received the B.Tech. degree from the Indian Institute of Technology, Kanpur, India in 2005, and the M.A. and Ph.D. degrees in 2007 and 2010, respectively, from Princeton University, Princeton, NJ, USA, all in Electrical Engineering. He is currently an Associate Professor at Purdue University, West Lafayette, IN, where he has been since January 2015. He was a Senior Member of Technical Staff Research at AT&T Labs-Research, NJ (2010-2014), Adjunct Assistant Professor at Columbia University, NY (2013-2014), and VAJRA Adjunct Professor at IISc Bangalore (2018-2019). His current research interests are in machine learning and networking areas. Dr. Aggarwal received Princeton University's Porter Ogden Jacobus Honorific Fellowship in 2009, the AT&T Vice President Excellence Award in 2012, the AT&T Key Contributor Award in 2013, the AT&T Senior Vice President Excellence Award in 2014, and the Purdue Most Impactful Faculty Innovator in 2020. He received the 2017 Jack Neubauer Memorial Award recognizing the Best Systems Paper published in the IEEE Transactions on Vehicular Technology, and the 2018 Infocom Workshop HotPOST Best Paper Award. He was on the Editorial Board of IEEE Transactions on Green Communications and Networking, and is currently on the Editorial Board of the IEEE Transactions on Communications and the IEEE/ACM Transactions on Networking.

Vaneet Aggarwal, Purdue University