Moving Beyond End-to-End Path Information to Optimize CDN Performance
Venue
Internet Measurement Conference (IMC), ACM, Chicago, IL (2009), pp. 190-201
Publication Year
2009
Authors
Rupa Krishnan, Harsha V. Madhyastha, Sushant Jain, Sridhar Srinivasan, Arvind Krishnamurthy, Thomas Anderson, Jie Gao
Abstract
Replicating content across a geographically distributed set of servers and
redirecting clients to the closest server in terms of latency has emerged as a
common paradigm for improving client performance. In this paper, we analyze
latencies measured from servers in Google’s content distribution network (CDN) to
clients all across the Internet to study the effectiveness of latency-based server
selection. Our main result is that redirecting every client to the server with
least latency does not suffice to optimize client latencies. First, even though
most clients are served by a geographically nearby CDN node, a sizeable fraction of
clients experience latencies several tens of milliseconds higher than other clients
in the same region. Second, we find that queueing delays often override the
benefits of a client interacting with a nearby server. To help the administrators
of Google’s CDN cope with these problems, we have built a system called WhyHigh.
First, WhyHigh measures client latencies across all nodes in the CDN and correlates
measurements to identify the prefixes affected by inflated latencies. Second, since
clients in several thousand prefixes have poor latencies, WhyHigh prioritizes
problems based on the impact that solving them would have, e.g., by identifying
either an AS path common to several inflated prefixes or a CDN node where path
inflation is widespread. Finally, WhyHigh diagnoses the causes for inflated
latencies using active measurements such as traceroutes and pings, in combination
with datasets such as BGP paths and flow records. Typical causes discovered include
lack of peering, routing misconfigurations, and side-effects of traffic
engineering. We have used WhyHigh to diagnose several instances of inflated
latencies, and our efforts over the course of a year have significantly helped
improve the performance offered to clients by Google’s CDN.
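
The first two WhyHigh steps described in the abstract (flagging prefixes with inflated latencies, then prioritizing by a shared AS path) can be illustrated with a minimal sketch. This is not the authors' implementation: the `PrefixSample` record, the function names, the choice of the lowest RTT in a region as the baseline, and the 50 ms threshold are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical threshold: the abstract only says "several tens of milliseconds".
INFLATION_THRESHOLD_MS = 50.0


@dataclass(frozen=True)
class PrefixSample:
    """One latency measurement for a client prefix (hypothetical record layout)."""
    prefix: str      # client prefix, e.g. "192.0.2.0/24"
    region: str      # coarse geographic region of the prefix
    as_path: tuple   # AS path from the CDN node to the prefix
    rtt_ms: float    # measured round-trip time in milliseconds


def find_inflated(samples, threshold_ms=INFLATION_THRESHOLD_MS):
    """Return samples whose RTT exceeds the lowest RTT in their region by more than threshold_ms."""
    rtts_by_region = defaultdict(list)
    for s in samples:
        rtts_by_region[s.region].append(s.rtt_ms)
    # Assumption: the best-served prefix in a region approximates achievable latency.
    baseline = {region: min(rtts) for region, rtts in rtts_by_region.items()}
    return [s for s in samples if s.rtt_ms - baseline[s.region] > threshold_ms]


def rank_by_as_path(inflated):
    """Group inflated prefixes by AS path; paths shared by more prefixes come first."""
    prefixes_by_path = defaultdict(set)
    for s in inflated:
        prefixes_by_path[s.as_path].add(s.prefix)
    ranked = [(path, len(prefixes)) for path, prefixes in prefixes_by_path.items()]
    # Sort by descending prefix count, breaking ties by the path itself for determinism.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))
```

Calling `rank_by_as_path(find_inflated(samples))` yields AS paths ordered by how many inflated prefixes they carry, which mirrors the idea of fixing the problems with the widest impact first. Grouping by CDN node instead would follow the same pattern with a node field as the key.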
An anonymized dataset is available for download.
