Umar Syed
I received a Ph.D. in Computer Science from Princeton University, where I was advised by Rob Schapire. I spent two years as a postdoctoral researcher at the University of Pennsylvania, hosted by Ben Taskar and Michael Kearns. My research interests are in machine learning.
Authored Publications
Pricing a low-regret seller
Hoda Heidari
Sadra Yazdanbod
Proceedings of the Thirty-Third International Conference on Machine Learning (ICML 2016)
As the number of ad exchanges has grown, publishers have turned to low-regret learning algorithms to decide which exchange offers the best price for their inventory. This in turn raises the following question for the exchange: how should it set prices to attract as many sellers as possible and maximize revenue? In this work we formulate this precisely as a learning problem, and present algorithms showing that simply knowing that the counterparty is using a low-regret algorithm is enough for the exchange to run its own low-regret learning algorithm and find the optimal price.
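The interaction the abstract describes can be simulated in a few lines. The sketch below is purely illustrative, with made-up numbers and a deliberately simple setup: the seller runs EXP3 (a standard low-regret algorithm) to choose between this exchange and an outside option, while the exchange runs its own EXP3 over a grid of candidate revenue shares.

```python
import math
import random

random.seed(0)

class EXP3:
    """Minimal EXP3 learner over n arms; rewards must lie in [0, 1]."""
    def __init__(self, n_arms, gamma=0.1):
        self.n, self.gamma = n_arms, gamma
        self.w = [1.0] * n_arms

    def probs(self):
        total = sum(self.w)
        return [(1 - self.gamma) * wi / total + self.gamma / self.n
                for wi in self.w]

    def draw(self):
        r, acc = random.random(), 0.0
        for i, pi in enumerate(self.probs()):
            acc += pi
            if r < acc:
                return i
        return self.n - 1

    def update(self, arm, reward):
        p = self.probs()
        est = reward / p[arm]                      # importance-weighted estimate
        self.w[arm] *= math.exp(self.gamma * est / self.n)
        m = max(self.w)                            # renormalize to avoid overflow
        self.w = [wi / m for wi in self.w]

prices = [0.2, 0.4, 0.6, 0.8]   # exchange's candidate revenue shares (invented)
outside = 0.5                    # seller's payoff at a competing exchange (invented)
exchange = EXP3(len(prices))
seller = EXP3(2)                 # arm 0: this exchange, arm 1: outside option

revenue = 0.0
for t in range(5000):
    i = exchange.draw()
    j = seller.draw()
    if j == 0:                               # seller routes inventory here
        seller_payoff = 1.0 - prices[i]      # seller keeps the remainder
        exchange.update(i, prices[i])
        revenue += prices[i]
    else:
        seller_payoff = outside
        exchange.update(i, 0.0)
    seller.update(j, seller_payoff)

print(round(revenue / 5000, 3))
```

With these toy numbers the exchange's optimum is the 0.4 share: it is the largest price that still leaves the seller more than the 0.5 outside option, which is the kind of optimum the abstract says the exchange's own low-regret algorithm can locate.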
Learning mobile phone battery consumptions
Ashish Sharma
Paul Eastham
Workshop on On Device Intelligence (2016)
We introduce a novel, data-driven way of predicting the battery consumption of apps. The state-of-the-art models used to blame battery consumption on apps are based on micro-benchmark experiments. These experiments are carried out in controlled setups where one can measure how much battery is consumed by each internal resource (CPU, Bluetooth, WiFi, ...). The battery blame allocated to an app is simply the sum of the blames of the resources the app consumed. We argue that models of this type do not capture the way phones work "in the wild" and propose instead to train a regression model using data collected from logs. We show that this type of learning is correct in the sense that, under some assumptions, we can recover the true battery discharge rate of each component. We present experimental results in which we consistently make better predictions than a model trained on micro-benchmarks.
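A minimal sketch of the regression idea, with entirely synthetic numbers standing in for logs (the paper's model and features are more involved): each row records per-resource usage over an interval, the target is the observed battery drain, and an ordinary least-squares fit recovers per-resource discharge rates without any controlled micro-benchmark.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "logs": resource-seconds of CPU / WiFi / screen per interval,
# and the resulting battery drain. The rates below are invented.
true_rates = np.array([2.0, 0.5, 1.2])
usage = rng.uniform(0, 1, size=(500, 3))
drain = usage @ true_rates + rng.normal(0, 0.05, size=500)

# Least-squares regression on the logs recovers the per-component rates.
est_rates, *_ = np.linalg.lstsq(usage, drain, rcond=None)
print(np.round(est_rates, 2))
```

This mirrors the abstract's correctness claim in miniature: under a linear-drain assumption, the regression identifies each component's true discharge rate from uncontrolled usage data alone.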
Where to sell: Simulating auctions from learning algorithms
Hamid Nazerzadeh
Proceedings of the Seventeenth ACM Conference on Economics and Computation (EC2016)
Ad exchange platforms connect online publishers and advertisers and facilitate selling billions of impressions every day. We study these environments from the perspective of a publisher who wants to find the profit-maximizing exchange to sell his inventory. Ideally, the publisher would run an auction among exchanges. However, this is not possible due to technological and other practical considerations. The publisher needs to send each impression to one of the exchanges with an asking price. We model the problem as a variation of multi-armed bandits in which exchanges (arms) can behave strategically in order to maximize their own profit. We propose mechanisms that find the best exchange with sub-linear regret and have desirable incentive properties.
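Setting aside the strategic behavior the paper actually addresses, the publisher's baseline problem is a standard stochastic bandit. The sketch below uses plain UCB1 (not the paper's mechanisms) with invented fill rates: each arm is an exchange, pulling it sends an impression at a fixed asking price, and the reward is the price if the exchange fills it.

```python
import math
import random

random.seed(1)

fill_rate = [0.3, 0.6, 0.45]   # probability each exchange fills an impression (invented)
price = 1.0                    # fixed asking price
counts = [0, 0, 0]             # pulls per exchange
means = [0.0, 0.0, 0.0]        # running mean reward per exchange

for t in range(1, 10001):
    if t <= 3:
        arm = t - 1            # play each arm once to initialize
    else:
        # UCB1 index: empirical mean plus an optimism bonus
        arm = max(range(3), key=lambda a: means[a] +
                  math.sqrt(2 * math.log(t) / counts[a]))
    reward = price if random.random() < fill_rate[arm] else 0.0
    counts[arm] += 1
    means[arm] += (reward - means[arm]) / counts[arm]

print(counts)
```

The pull counts concentrate on the exchange with the highest fill rate; the paper's contribution is making a scheme like this robust when the arms themselves act strategically.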
Structural maxent models
Vitaly Kuznetsov
Proceedings of the Thirty-Second International Conference on Machine Learning (ICML 2015)
We present a new class of density estimation models, Structural Maxent models, with feature functions selected from a union of possibly very complex sub-families and yet benefiting from strong learning guarantees. The design of our models is based on a new principle supported by uniform convergence bounds and taking into consideration the complexity of the different sub-families composing the full set of features. We prove new data-dependent learning bounds for our models, expressed in terms of the Rademacher complexities of these sub-families. We also prove a duality theorem, which we use to derive our Structural Maxent algorithm. We give a full description of our algorithm, including the details of its derivation, and report the results of several experiments demonstrating that its performance improves on that of existing L1-norm regularized Maxent algorithms. We further similarly define conditional Structural Maxent models for multi-class classification problems. These are conditional probability models also making use of a union of possibly complex feature sub-families. We prove a duality theorem for these models as well, which reveals their connection with existing binary and multi-class deep boosting algorithms.
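As a point of orientation (this is a paraphrase of the abstract's description in our own notation, not the paper's exact statement), the dual problem has the familiar shape of an L1-regularized maximum-likelihood problem in which each feature sub-family H_k is penalized in proportion to its Rademacher complexity:

```latex
\min_{w}\;
\log\!\Big(\sum_{x} p_0(x)\, e^{\,w\cdot\Phi(x)}\Big)
\;-\; \frac{1}{m}\sum_{i=1}^{m} w\cdot\Phi(x_i)
\;+\; \sum_{k}\big(\lambda\,\mathfrak{R}_m(H_k)+\beta\big)\,\|w_k\|_1
```

Here \(\Phi\) stacks the feature sub-families, \(w_k\) is the block of weights on sub-family \(H_k\), \(p_0\) is a default distribution, and \(\lambda,\beta\ge 0\) are parameters; complex sub-families thus pay a larger regularization price, which is the "structural" part of the model.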
An optimal online algorithm for retrieving heavily perturbed statistical databases in the low-dimensional querying model
Proceedings of the Twenty-Fourth ACM International Conference on Information and Knowledge Management (CIKM 2015)
We give the first O(1/sqrt{T})-error online algorithm for reconstructing noisy statistical databases, where T is the number of (online) sample queries received. The algorithm is optimal up to a poly(log(T)) factor in terms of the error and requires only O(log T) memory. It aims to learn a hidden database-vector w* in R^D in order to accurately answer a stream of queries regarding the hidden database, which arrive in an online fashion from some unknown distribution D. We assume the distribution D is defined on the neighborhood of a low-dimensional manifold. The presented algorithm runs in O(dD) time per query, where d is the dimensionality of the query-space. Contrary to the classical setting, there is no separate training set used by the algorithm to learn the database: the stream on which the algorithm is evaluated must also be used to learn the database-vector. The algorithm only has access to a binary oracle O that answers whether a particular linear function of the database-vector plus random noise is larger than a threshold specified by the algorithm. We note that we allow for a significant O(D) amount of noise, while other works have focused on the low-noise o(sqrt{D}) setting. For a stream of T queries our algorithm achieves an average error of O(1/sqrt{T}) by filtering out random noise, adapting the threshold values given to the oracle based on its previous answers and, as a consequence, recovering with high precision a projection of the database-vector w* onto the manifold defining the query-space. Our algorithm may also be applied in the adversarial machine learning context to compromise machine learning engines, by heavily exploiting the vulnerabilities of systems that output only a binary signal in the presence of significant noise.
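A one-dimensional caricature of the oracle model (the actual algorithm handles a query stream over a low-dimensional manifold, with much stronger guarantees): the oracle reports whether a hidden value plus noise exceeds a threshold we choose, so repeated queries at one threshold estimate the exceedance probability, which crosses 1/2 exactly at the hidden value, and a bisection on the threshold recovers it.

```python
import random

random.seed(0)

v_true = 0.37                       # hidden database value (invented)

def oracle(theta, noise_width=1.0):
    """Binary oracle: is hidden value plus uniform noise above the threshold?"""
    noise = random.uniform(-noise_width, noise_width)
    return v_true + noise > theta

lo, hi = -1.0, 2.0
for _ in range(20):                 # bisection on the threshold
    mid = (lo + hi) / 2
    hits = sum(oracle(mid) for _ in range(2000))
    if hits > 1000:                 # empirical P(yes) > 1/2: value is above mid
        lo = mid
    else:
        hi = mid

print(round((lo + hi) / 2, 2))
```

Note that the noise here is comparable in scale to the signal, echoing the abstract's point that a large amount of noise can be filtered out by adapting the thresholds.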
Learning with Deep Cascades
Proceedings of the Twenty-Sixth International Conference on Algorithmic Learning Theory (ALT 2015)
We introduce a broad learning model formed by cascades of predictors, Deep Cascades, that is structured as general decision trees in which leaf predictors or node questions may be members of rich function families. We present new detailed data-dependent theoretical guarantees for learning with Deep Cascades with complex leaf predictors or node questions, in terms of the Rademacher complexities of the sub-families composing these sets of predictors and the fraction of sample points correctly classified at each leaf. These general guarantees can guide the design of a variety of different algorithms for deep cascade models, and we give a detailed description of two such algorithms. Our second algorithm uses SVM predictors as node and leaf classifiers, and we report the results of experiments comparing its performance with that of SVMs combined with polynomial kernels.
Corporate learning at scale: Lessons from a large online course at Google
Arthur Asuncion
Jac de Haan
Kayur Patel
Lauren Wong
Learning at Scale (2014)
Google Research recently tested a massive online class model for an internal engineering education program, with machine learning as the topic, that blended theoretical concepts and Google-specific software tool tutorials. The goal of this training was to foster engineering capacity to leverage machine learning tools in future products. The course was delivered both synchronously and asynchronously, and students had the choice between studying independently or participating with a group. Since all students are company employees, unlike in most publicly offered MOOCs, we can continue to measure the students' behavioral change long after the course is complete. This paper describes the course, outlines the available data set, and presents directions for analysis.
Motivated by real-time advertising exchanges, we analyze the problem of pricing inventory in a repeated posted-price auction. We consider both the cases of a truthful and a surplus-maximizing buyer, where the former makes decisions myopically on every round, and the latter may strategically react to our algorithm, forgoing short-term surplus in order to trick the algorithm into setting better prices in the future. We further assume a buyer's valuation of a good is a function of a context vector that describes the good being sold. We give the first algorithm attaining sublinear O(T^{2/3}) regret in the contextual setting against a surplus-maximizing buyer. We also extend this result to repeated second-price auctions with multiple buyers.
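For the truthful, myopic buyer the non-contextual pricing problem reduces to a noiseless search, which a few lines make concrete (the value below is invented, and this is only a sketch of the easy case): the seller bisects on the posted price using accept/reject feedback. The surplus-maximizing buyer of the abstract is precisely the one who would reject prices below his value to derail this kind of search.

```python
value = 0.63                       # buyer's private value (invented)
lo, hi = 0.0, 1.0

for _ in range(20):                # bisection on the posted price
    price = (lo + hi) / 2
    accepted = value >= price      # a myopic buyer accepts iff price <= value
    if accepted:
        lo = price                 # raise future prices
    else:
        hi = price                 # lower future prices

print(round(lo, 3))                # converges to the buyer's value
```

After 20 rounds the posted price is within 2^-20 of the value, which is why a strategic buyer has a strong incentive to reject early rounds and keep prices low.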
Deep boosting
Proceedings of the Thirty-First International Conference on Machine Learning (ICML 2014)
We present a new ensemble learning algorithm, DeepBoost, which can use as base classifiers a hypothesis set containing deep decision trees, or members of other rich or complex families, and succeed in achieving high accuracy without overfitting the data. The key to the success of the algorithm is a capacity-conscious criterion for the selection of the hypotheses. We give new data-dependent learning bounds for convex ensembles expressed in terms of the Rademacher complexities of the sub-families composing the base classifier set, and the mixture weight assigned to each sub-family. Our algorithm directly benefits from these guarantees since it seeks to minimize the corresponding learning bound. We give a full description of our algorithm, including the details of its derivation, and report the results of several experiments showing that its performance compares favorably to that of AdaBoost and Logistic Regression and their L1-regularized variants.
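The capacity-conscious selection idea can be illustrated in a few lines. This is a toy sketch, not DeepBoost itself: the data, the candidate stumps, the sub-family complexities, and the penalty weight are all made up. Each candidate base classifier carries a Rademacher-style complexity for the sub-family it comes from, and the round's pick maximizes the weighted edge minus a complexity penalty.

```python
X = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
y = [-1, -1, -1, -1, 1, 1, 1, 1]

# Candidates: (predictor, sub-family complexity r_k). Complexities invented.
candidates = [
    (lambda x: 1 if x > 0.5 else -1, 0.05),    # simple family, perfect split
    (lambda x: 1 if x > 0.45 else -1, 0.20),   # richer family, equally perfect split
    (lambda x: 1 if x > 0.85 else -1, 0.05),   # simple family, poor split
]

w = [1.0 / len(X)] * len(X)    # uniform boosting weights for this round
lam = 0.5                      # complexity penalty weight (invented)

def penalized_edge(h, r):
    """Weighted edge of hypothesis h, minus a penalty for its family's complexity."""
    edge = sum(wi * yi * h(xi) for wi, yi, xi in zip(w, y, X))
    return edge - lam * r

best = max(range(len(candidates)),
           key=lambda i: penalized_edge(*candidates[i]))
print(best)  # → 0
```

The second candidate classifies the sample exactly as well as the first but comes from a richer sub-family, so the penalty makes the simpler stump win; that is the capacity-conscious behavior the abstract describes.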
We present new ensemble learning algorithms for multi-class classification. Our algorithms can use as a base classifier set a family of deep decision trees or other rich or complex families and yet benefit from strong generalization guarantees. We give new data-dependent learning bounds for convex ensembles in the multi-class classification setting expressed in terms of the Rademacher complexities of the sub-families composing the base classifier set, and the mixture weight assigned to each sub-family. These bounds are finer than existing ones both thanks to an improved dependency on the number of classes and, more crucially, by virtue of a more favorable complexity term expressed as an average of the Rademacher complexities based on the ensemble's mixture weights. We introduce and discuss several new multi-class ensemble algorithms benefiting from these guarantees, prove positive results for the H-consistency of several of them, and report the results of experiments showing that their performance compares favorably with that of multi-class versions of AdaBoost and Logistic Regression and their L1-regularized counterparts.
Inspired by real-time ad exchanges for online display advertising, we consider the problem of inferring a buyer's value for a good when the buyer is repeatedly interacting with the seller through a posted-price mechanism. We model the buyer as a strategic agent interested in maximizing her long-term surplus, and we are interested in optimizing seller revenue. We show conditions under which the seller cannot hope to gain an advantage by learning the buyer's value, i.e., the buyer can always manipulate the exchange to hide her value. This result is accompanied by a seller algorithm that achieves no-regret when the buyer is unable to incur the short-term costs of such manipulation.
An $\tilde{O}(\frac{1}{\sqrt{T}})$-error online algorithm for retrieving heavily perturbed statistical databases in the low-dimensional querying model
Graphical models for bandit problems
Kareem Amin
Michael Kearns
Uncertainty in Artificial Intelligence: Proceedings of the Twenty-Seventh Conference (UAI 2011)
Bandits, query learning, and the haystack dimension
Kareem Amin
Michael Kearns
Proceedings of the Twenty-Fourth Annual Conference on Learning Theory (COLT 2011)
There's something about MRAI: Timing diversity can exponentially worsen BGP convergence
Jennifer Rexford
Proceedings of the Thirtieth IEEE International Conference on Computer Communications (INFOCOM 2011)
Semi-supervised learning with adversarially missing label information
Private and third-party randomization in risk-sensitive equilibrium concepts
Mickey Brautbar
Michael Kearns
Proceedings of the Twenty-Fourth Conference on Artificial Intelligence (AAAI 2010)
A reduction from apprenticeship learning to classification
Adapting to the shifting intent of search queries
Aleksandrs Slivkins
Nina Mishra
Advances in Neural Information Processing Systems 23 (NIPS 2009)
Apprenticeship learning using linear programming
Michael Bowling
Robert E. Schapire
Proceedings of the Twenty-Fifth International Conference on Machine Learning (ICML 2008)
Enzyme function prediction with interpretable models
Golan Yona
Methods in Molecular Biology: Computational Systems Biology, Humana Press (2008)
Using automatically transcribed dialogs to learn user models in a spoken dialog system
Jason Williams
Proceedings of the Forty-Sixth Annual Meeting of the Association for Computational Linguistics (ACL 2008)
A game-theoretic approach to apprenticeship learning
Imitation learning with a value-based prior
Robert E. Schapire
Uncertainty in Artificial Intelligence: Proceedings of the Twenty-Third Conference (UAI 2007)
Using a mixture of probabilistic decision trees for direct prediction of protein function
Golan Yona
Proceedings of the Seventh Annual International Conference on Research in Computational Biology (RECOMB 2003)