MapReduce/Bigtable for Distributed Optimization
Abstract: For large data sets, gradient-based optimization can be very time
consuming, for example when minimizing the negative log-likelihood for maximum
entropy models. Distributed methods are therefore appealing, and a number of
distributed gradient optimization strategies have been proposed, including
distributed gradient, asynchronous updates, and iterative parameter mixtures. In
this paper, we evaluate these strategies with regard to their accuracy and speed
over MapReduce/Bigtable and discuss the techniques needed for high performance.
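
As an illustration of the first strategy named above, the distributed gradient can be sketched as a map step that computes a partial gradient on each data shard and a reduce step that sums the partial gradients. This works because the gradient of the negative log-likelihood is a sum over training examples. The sketch below is not the implementation evaluated in this paper: it assumes a binary logistic regression model (a two-class maximum entropy model), runs in a single process rather than on MapReduce/Bigtable, and the function names shard_gradient, sum_gradients, and distributed_gradient are hypothetical.

```python
import math
from functools import reduce


def shard_gradient(weights, shard):
    """Map step (hypothetical name): gradient of the negative
    log-likelihood of binary logistic regression on one shard.
    Each example is a (features, label) pair with label 0 or 1."""
    grad = [0.0] * len(weights)
    for features, label in shard:
        score = sum(w * x for w, x in zip(weights, features))
        prob = 1.0 / (1.0 + math.exp(-score))  # P(label = 1 | features)
        for i, x in enumerate(features):
            grad[i] += (prob - label) * x
    return grad


def sum_gradients(a, b):
    """Reduce step (hypothetical name): element-wise sum of two gradients."""
    return [x + y for x, y in zip(a, b)]


def distributed_gradient(weights, shards):
    """Sum the per-shard gradients. In a real deployment each
    shard_gradient call would run on a separate worker."""
    partials = (shard_gradient(weights, shard) for shard in shards)
    return reduce(sum_gradients, partials)


# Toy data, invented for illustration only.
data = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 1.0], 1), ([1.0, -1.0], 0)]
shards = [data[:2], data[2:]]
weights = [0.0, 0.0]

full = shard_gradient(weights, data)          # single-machine gradient
dist = distributed_gradient(weights, shards)  # sum of shard gradients
```

Up to floating-point rounding, the summed shard gradients equal the gradient computed over the full data set, so this strategy leaves the optimization trajectory unchanged and only distributes the per-iteration work.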