Reducing Web Latency: the Virtue of Gentle Aggression
Venue
Proceedings of the ACM Conference of the Special Interest Group on Data Communication (SIGCOMM '13), ACM (2013)
Publication Year
2013
Authors
Tobias Flach, Nandita Dukkipati, Andreas Terzis, Barath Raghavan, Neal Cardwell, Yuchung Cheng, Ankur Jain, Shuai Hao, Ethan Katz-Bassett, Ramesh Govindan
Abstract
To serve users quickly, Web service providers build infrastructure closer to
clients and use multi-stage transport connections. Although these changes reduce
client-perceived round-trip times, TCP's current mechanisms fundamentally limit
latency improvements. We performed a measurement study of a large Web service
provider and found that, while connections with no loss complete close to the ideal
latency of one round-trip time, TCP's timeout-driven recovery causes transfers with
loss to take five times longer on average. In this paper, we present the design of
novel loss recovery mechanisms for TCP that judiciously use redundant transmissions
to minimize timeout-driven recovery. Proactive, Reactive, and Corrective are three
qualitatively different, easily deployable mechanisms that (1) proactively recover
from losses, (2) recover from them as quickly as possible, and (3) reconstruct
packets to mask loss. Crucially, the mechanisms are compatible both with
middleboxes and with TCP's existing congestion control and loss recovery. Our
large-scale experiments on Google's production network, which serves billions of
flows, demonstrate a 23% decrease in mean latency and a 47% decrease in
99th-percentile latency over today's TCP.
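
The idea behind item (3), reconstructing packets to mask loss, can be illustrated with a generic XOR-parity sketch. The abstract does not specify a coding scheme or packet format, so everything here is an assumption made for illustration: the function names `make_parity` and `recover_missing` are hypothetical, one parity packet covers a small group of payloads, and the code runs at application level rather than inside TCP.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: list[bytes]) -> bytes:
    """Build one parity packet as the XOR of all payloads.

    Shorter payloads are zero-padded to the longest one (an assumption
    of this sketch, not something the abstract describes).
    """
    size = max(len(p) for p in packets)
    padded = [p.ljust(size, b"\x00") for p in packets]
    return reduce(xor_bytes, padded)


def recover_missing(received: dict[int, bytes], parity: bytes,
                    count: int) -> tuple[int, bytes]:
    """Rebuild the single missing payload in a group of `count` packets.

    `received` maps packet index to payload. The recovered payload is
    returned at the padded length, so trailing zero bytes may appear
    when the original payloads had different lengths.
    """
    missing = [i for i in range(count) if i not in received]
    if len(missing) != 1:
        raise ValueError("XOR parity can repair exactly one missing packet")
    size = len(parity)
    result = parity
    for payload in received.values():
        # XOR-ing out every received payload leaves only the missing one.
        result = xor_bytes(result, payload.ljust(size, b"\x00"))
    return missing[0], result


if __name__ == "__main__":
    group = [b"GET ", b"/ind", b"ex.h"]
    parity = make_parity(group)
    # Simulate losing the packet at index 1.
    arrived = {0: group[0], 2: group[2]}
    print(recover_missing(arrived, parity, len(group)))
```

With a single parity packet the receiver can repair one loss per group without waiting for a retransmission or a timeout; two or more losses in the same group still require ordinary recovery, which is why the sketch raises an error in that case.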
