Jump to Content

Traffic Anomaly Detection Based on the IP Size Distribution

Fabio Soldo
Ahmed Metwally
INFOCOM International Conference on Computer Communications, Joint Conference of the IEEE Computer and Communications Societies, IEEE (2012), pp. 2005-2013

Abstract

In this paper we present a data-driven framework for detecting machine-generated traffic based on the IP size, i.e., the number of users sharing the same source IP. Our main observation is that diverse machine-generated traffic attacks share a common characteristic: they induce an anomalous deviation from the expected IP size distribution. We develop a principled framework that automatically detects and classifies these deviations using statistical tests and ensemble learning. We evaluate our approach on a massive dataset collected at Google for 90 consecutive days. We argue that our approach combines desirable characteristics: it can accurately detect fraudulent machine-generated traffic; it is based on a fundamental characteristic of these attacks and is thus robust (e.g., to DHCP re-assignment) and hard to evade; it has low complexity and is easy to parallelize, making it suitable for large-scale detection; and finally, it does not entail profiling users, but leverages only aggregate statistics of network traffic.

Research Areas