Traffic Anomaly Detection Based on the IP Size Distribution
Venue
INFOCOM International Conference on Computer Communications, Joint Conference of the IEEE Computer and Communications Societies, IEEE (2012), pp. 2005-2013
Publication Year
2012
Authors
Fabio Soldo, Ahmed Metwally
Abstract
In this paper we present a data-driven framework for detecting machine-generated
traffic based on the IP size, i.e., the number of users sharing the same source IP.
Our main observation is that diverse machine-generated traffic attacks share a
common characteristic: they induce an anomalous deviation from the expected IP size
distribution. We develop a principled framework that automatically detects and
classifies these deviations using statistical tests and ensemble learning. We
evaluate our approach on a massive dataset collected at Google for 90 consecutive
days. We argue that our approach combines desirable characteristics: it can
accurately detect fraudulent machine-generated traffic; it is based on a
fundamental characteristic of these attacks and is thus robust (e.g., to DHCP
re-assignment) and hard to evade; it has low complexity and is easy to parallelize,
making it suitable for large-scale detection; and finally, it does not entail
profiling users, but leverages only aggregate statistics of network traffic.
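
As a rough illustration of the idea of flagging a deviation from an expected IP size distribution, the following is a minimal sketch and not the authors' method. It assumes each traffic record already carries an estimated IP size, groups sizes into logarithmic buckets, and compares the observed bucket counts against an expected distribution with a Pearson chi-square statistic. The function names, the bucketing scheme, the expected probabilities, and the threshold are all hypothetical choices made for this example; the paper's actual statistical tests and ensemble learning step are not reproduced here.

```python
from collections import Counter


def size_bucket(ip_size: int) -> int:
    """Map an IP size to a log2 bucket: 1 -> 0, 2-3 -> 1, 4-7 -> 2, 8-15 -> 3, ..."""
    if ip_size < 1:
        raise ValueError("IP size must be a positive integer")
    return ip_size.bit_length() - 1


def bucket_histogram(ip_sizes, num_buckets: int):
    """Count records per bucket; sizes beyond the last bucket are folded into it."""
    counts = Counter(min(size_bucket(s), num_buckets - 1) for s in ip_sizes)
    return [counts.get(b, 0) for b in range(num_buckets)]


def chi_square_statistic(observed, expected_probs) -> float:
    """Pearson chi-square distance between observed counts and expected shares."""
    total = sum(observed)
    stat = 0.0
    for obs, prob in zip(observed, expected_probs):
        expected = total * prob
        if expected > 0:  # skip buckets the expected distribution never uses
            stat += (obs - expected) ** 2 / expected
    return stat


def is_anomalous(observed, expected_probs, threshold: float) -> bool:
    """Flag a traffic source whose statistic exceeds a chosen threshold."""
    return chi_square_statistic(observed, expected_probs) > threshold


# Hypothetical expected share of traffic per IP size bucket (an assumption).
EXPECTED_PROBS = [0.50, 0.25, 0.15, 0.10]
# Illustrative threshold: chi-square critical value for 3 degrees of
# freedom at the 0.01 significance level.
THRESHOLD = 11.345

# Synthetic IP sizes for two traffic sources, 100 records each.
typical_sizes = [1] * 50 + [2] * 25 + [5] * 15 + [9] * 10
skewed_sizes = [1] * 15 + [2] * 10 + [5] * 5 + [9] * 70

typical_flag = is_anomalous(bucket_histogram(typical_sizes, 4), EXPECTED_PROBS, THRESHOLD)
skewed_flag = is_anomalous(bucket_histogram(skewed_sizes, 4), EXPECTED_PROBS, THRESHOLD)
```

Because the sketch only consumes per-bucket counts, each traffic source can be scored independently, which is in the spirit of the abstract's remarks about relying on aggregate statistics and being easy to parallelize.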
