Cloak of Visibility: Detecting When Machines Browse a Different Web
Venue
Proceedings of the 37th IEEE Symposium on Security and Privacy (2016)
Publication Year
2016
Authors
Luca Invernizzi, Kurt Thomas, Alexandros Kapravelos, Oxana Comanescu, Jean-Michel Picod, Elie Bursztein
Abstract
The contentious battle between web services and miscreants involved in blackhat
search engine optimization and malicious advertisements has driven the underground
to develop increasingly sophisticated techniques that hide the true nature of
malicious sites. These web cloaking techniques hinder the effectiveness of security
crawlers and potentially expose Internet users to harmful content. In this work, we
study the spectrum of blackhat cloaking techniques that target browser, network, or
contextual cues to detect organic visitors. As a starting point, we investigate the
capabilities of ten prominent cloaking services marketed within the underground.
This includes a first look at multiple IP blacklists that contain over 50 million
addresses tied to the top five search engines and tens of anti-virus and security
crawlers. We use our findings to develop an anti-cloaking system that detects
split-view content returned to two or more distinct browsing profiles with an
accuracy of 95.5% and a false positive rate of 0.9% when tested on a labeled
dataset of 94,946 URLs. We apply our system to an unlabeled set of 135,577 search
and advertisement URLs keyed on high-risk terms (e.g., luxury products, weight loss
supplements) to characterize the prevalence of threats in the wild and expose
variations in cloaking techniques across traffic sources. Our study provides the
first broad perspective of cloaking as it affects Google Search and Google Ads and
underscores the minimum capabilities that security crawlers need to bypass the
state of the art in mobile, rDNS, and IP cloaking.
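
As a rough illustration of what "split-view content returned to two or more distinct browsing profiles" can mean in practice, the sketch below compares the page bodies that different profiles received for one URL and flags the URL when any pair diverges sharply. It is not the system described in the abstract: the token-overlap (Jaccard) measure, the 0.5 threshold, the profile names, and the function names tokens, jaccard, and looks_cloaked are all assumptions chosen for illustration, and the responses are supplied as plain strings rather than fetched.

```python
import re
from itertools import combinations


def tokens(html: str) -> set[str]:
    """Lowercased word tokens with markup tags crudely stripped (illustrative only)."""
    text = re.sub(r"<[^>]+>", " ", html)
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two token sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def looks_cloaked(responses: dict[str, str], threshold: float = 0.5) -> bool:
    """Return True if any two profiles saw content whose token overlap is below threshold.

    `responses` maps a hypothetical profile name to the HTML that profile received.
    The threshold is an arbitrary assumption, not a value taken from the paper.
    """
    for first, second in combinations(responses, 2):
        similarity = jaccard(tokens(responses[first]), tokens(responses[second]))
        if similarity < threshold:
            return True
    return False


# Hypothetical responses for one URL, keyed by browsing profile.
split_view = {
    "crawler": "<html><body>Healthy recipes and cooking tips</body></html>",
    "mobile_user": "<html><body>Buy cheap luxury watches now</body></html>",
}
same_view = {
    "crawler": "<html><body>Healthy recipes and cooking tips</body></html>",
    "mobile_user": "<html><body>Healthy recipes and cooking tips</body></html>",
}
```

A single similarity score like this would likely misfire on pages with rotating ads or personalized content, which is one reason a real detector would need richer signals than this sketch uses.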