TIMELY: RTT-based Congestion Control for the Datacenter
Venue
SIGCOMM '15, Google Inc (2015)
Publication Year
2015
Authors
Radhika Mittal, Terry Lam, Nandita Dukkipati, Emily Blem, Hassan Wassel, Monia Ghobadi, Amin Vahdat, Yaogong Wang, David Wetherall, David Zats
Abstract
Datacenter transports aim to deliver low-latency messaging together with high
throughput. We show that simple packet delay, measured as round-trip times at
hosts, is an effective congestion signal without the need for switch feedback.
First, we show that advances in NIC hardware have made RTT measurement possible
with microsecond accuracy, and that these RTTs are sufficient to estimate switch
queueing. Then we describe how TIMELY can adjust transmission rates using RTT
gradients to keep packet latency low while delivering high bandwidth. We implement
our design in host software running over NICs with OS-bypass capabilities. We show
using experiments with up to hundreds of machines on a Clos network topology that
it provides excellent performance: turning on TIMELY for OS-bypass messaging over a
fabric with PFC lowers 99th-percentile tail latency by 9X while maintaining near
line-rate throughput. Our system also outperforms DCTCP running in an optimized
kernel, reducing tail latency by 13X. To the best of our knowledge, TIMELY is the
first delay-based congestion control protocol for use in the datacenter, and it
achieves its results despite having an order of magnitude fewer RTT signals (due to
NIC offload) than earlier delay-based schemes such as Vegas.
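
To make the idea of adjusting transmission rates from RTT gradients concrete, the following is a minimal sketch of a sender-side controller. It is not the algorithm as specified in the paper, which the abstract does not detail. The class name, the smoothing weight alpha, the decrease factor beta, the additive step, and the line rate are all hypothetical values chosen for illustration. The sketch raises the rate while smoothed RTTs are flat or falling (queues draining) and cuts it in proportion to the normalized gradient while RTTs are rising (queues building).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RttGradientController:
    """Illustrative rate controller driven by RTT gradients (all defaults assumed)."""

    rate_mbps: float                  # current sending rate
    min_rtt_us: float                 # assumed unloaded (propagation) RTT
    alpha: float = 0.5                # EWMA weight for RTT differences (assumed)
    beta: float = 0.8                 # multiplicative decrease factor (assumed)
    step_mbps: float = 10.0           # additive increase step (assumed)
    line_rate_mbps: float = 10_000.0  # upper bound on the rate (assumed)
    prev_rtt_us: Optional[float] = None
    rtt_diff_us: float = 0.0          # smoothed difference between successive RTTs

    def on_rtt_sample(self, rtt_us: float) -> float:
        """Consume one RTT measurement and return the updated rate."""
        if self.prev_rtt_us is None:
            # The first sample only seeds the state; no gradient is available yet.
            self.prev_rtt_us = rtt_us
            return self.rate_mbps

        new_diff = rtt_us - self.prev_rtt_us
        self.prev_rtt_us = rtt_us
        # Smooth the per-sample difference to damp measurement noise.
        self.rtt_diff_us = (1 - self.alpha) * self.rtt_diff_us + self.alpha * new_diff
        # Normalize by the minimum RTT to get a dimensionless gradient.
        gradient = self.rtt_diff_us / self.min_rtt_us

        if gradient <= 0:
            # RTT flat or falling: probe for more bandwidth additively.
            self.rate_mbps += self.step_mbps
        else:
            # RTT rising: back off in proportion to the gradient (capped at 1).
            self.rate_mbps *= 1 - self.beta * min(gradient, 1.0)

        # Keep the rate within [step, line rate].
        self.rate_mbps = min(max(self.rate_mbps, self.step_mbps), self.line_rate_mbps)
        return self.rate_mbps


if __name__ == "__main__":
    ctl = RttGradientController(rate_mbps=1000.0, min_rtt_us=20.0)
    for rtt in (50.0, 50.0, 60.0, 40.0):
        print(rtt, ctl.on_rtt_sample(rtt))
```

Reacting to the gradient rather than to the absolute RTT lets a sender respond as soon as a queue starts to grow, before the delay itself becomes large. A practical design would likely also need guard thresholds on the absolute RTT, which this sketch omits.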
