For large data it can be very time-consuming to run gradient-based optimization,
for example to minimize the log-likelihood for maximum entropy
models. Distributed methods are therefore appealing, and a number of distributed
gradient optimization strategies have been proposed, including distributed gradient,
asynchronous updates, and iterative parameter mixtures. In this paper, we
evaluate these various strategies with regard to their accuracy and speed over
MapReduce/Bigtable and discuss the techniques needed for high performance.