Failure Trends in a Large Disk Drive Population
Venue
5th USENIX Conference on File and Storage Technologies (FAST 2007), pp. 17-29
Publication Year
2007
Authors
Eduardo Pinheiro, Wolf-Dietrich Weber, Luiz André Barroso
Abstract
We present data collected from detailed observations of a large disk drive population in a production Internet services deployment. The population observed is many times larger than that of previous studies. In addition to presenting failure statistics, we analyze the correlation between failures and several parameters generally believed to impact longevity.
Our analysis identifies several parameters from the drive’s self-monitoring facility (SMART) that correlate highly with failures. Despite this high correlation, we conclude that models based on SMART parameters alone are unlikely to be useful for predicting individual drive failures. Surprisingly, we found that temperature and activity levels were much less correlated with drive failures than previously reported.
