Learning while bidding in Vickrey Auctions with acquisition rate and targeting constraints

# 180






Abstract

Ad placement in web-browsing and wireless mobiles is an increasingly important component of the advertisement market. The market size is over 100 billion dollar and counting. The mechanism is as follows; when a user opens a webpage or mobile ap a message is sent to an exchange where multiple bidders have the possibility of placing an ad that would target the right user, e.g. age, sex, location, etc. The ad that is displayed corresponds to the bidder who bids the highest while the cost is calculated according to a first or second price. Typically, bidders are DSP (Demand Side Platforms) that aggregate bids on behalf of clients who wish to run a campaign for a given length of time with certain targeting criteria. The goal is to minimize the total cost of obtaining the required number of impressions (ads that meet targeting criteria) over the duration of a contract. The real time constraint is that bidding must be done within 100ms. In this talk I will present the theory developed for computing the least cost bids in the second price or Vickrey auction context. This involves the notion of an information state for the problem. There is a very rich primal-dual theory that emerges, one in the so called impressions space and the other in the contracts space. Computationally and structurally the primal and dual views of the optimization have different properties that can be exploited to come up with fast algorithms. The optimal solutions depend on solving a constrained convex optimization problem when the information state is known. However this is not readily available and thus there is the problem of learning the information state. We show that in the second price case, stochastic approximation (SA) algorithms that operate on censored data (prices are only known by a bidder when the bidder wins) can be devised that solve the constrained optimization problem without learning the information state explicitly and we prove their convergence. Finally I will present the dynamic behaviour through simulations. Joint work with Ryan Kinnear (OpenAI), Amirhossein Boreiri (waterloo) and Peter Marbach (Toronto). We thank Addictive Mobility Inc., a Pelmorex company for having proposed the problem and to Addictive Mobility, Ontario OCE VIP II, and NSERC funding the work.

Ravi Mazumdar, Professor, University of Waterloo

Prof. Mazumder received his B.Tech from the Indian Institute of Technology, Bombay in 1977, MSc, DIC from Imperial College, London in 1978 and obtained his PhD in Control Theory under A. V. Balakrishnan at UCLA in 1983. He is currently a University Research Chair Professor in the Dept. of ECE at the University of Waterloo, Ont., Canada where he has been since September 2004. Prior to this he was Professor of ECE at Purdue University, West Lafayette, USA. Since 2012 he is a D.J. Gandhi Distinguished Visiting Professor at the Indian Institute of Technology, Bombay, India and since May 2019 an Adjunct Professor at the Tata Institute of Fundamental Research (TIFR), Mumbai. He is a Fellow of the IEEE, the Royal Statistical Society, and the Asia-Pacific Artificial Intelligence Association. He is a recipient of the INFOCOM 2006 Best Paper Award, the ITC-27 2015 Best Paper Award, the Performance 2015 Best Paper Award and was runner-up for the Best Paper Award at INFOCOM 1998. His research interests are in stochastic modelling and analysis applied to complex networks and statistical inference.