Machine Learning in an Auction Environment
Venue
Proceedings of the 23rd International Conference on the World Wide Web (WWW) (2014), pp. 7-18
Publication Year
2014
Authors
Patrick Hummel, Preston McAfee
BibTeX
Abstract
We consider a model of repeated online auctions in which an ad with an uncertain
click-through rate faces a random distribution of competing bids in each auction
and there is discounting of payoffs. We formulate the optimal solution to this
explore/exploit problem as a dynamic programming problem and show that efficiency
is maximized by making a bid for each advertiser equal to the advertiser's expected
value for the advertising opportunity plus a term proportional to the variance in
this value divided by the number of impressions the advertiser has received thus
far. We then use this result to illustrate that the value of incorporating active
exploration into a machine learning system in an auction environment is exceedingly
small.
