Probabilistic Models for Anomaly Detection in Remote Sensor Data Streams
Ethan Dereszynski, Thomas Dietterich
Remote sensors are becoming the standard for observing and recording ecological data in the field. Such sensors can record data at fine temporal resolutions, and they can operate under extreme conditions prohibitive to human access. Unfortunately, sensor data streams exhibit many kinds of errors, ranging from corrupt communications to partial or total sensor failures. This means that the raw data stream must be cleaned before it can be used by domain scientists. In our application environment (the H.J. Andrews Experimental Forest), this data cleaning is performed manually. This paper introduces a Dynamic Bayesian Network model for analyzing sensor observations and distinguishing sensor failures from valid data for the case of air temperature measured at a 15-minute time resolution. The model combines an accurate distribution of long-term and short-term temperature variations with a single generalized fault model. Experiments with historical data show that the precision and recall of the method are comparable to those of the domain expert. The system is currently being deployed to perform real-time automated data cleaning.
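As a rough illustration of the idea of pairing a model of expected temperature with a single broad fault model, the sketch below computes the posterior probability that one reading came from a faulty sensor. It is not the Dynamic Bayesian Network described in the abstract: it treats each reading independently and ignores temporal dependence. The function names, the Gaussian likelihoods, and all parameter values (`prior_fault`, `std_ok`, `std_fault`, `threshold`) are assumptions chosen for illustration only.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fault_posterior(observed, expected, prior_fault=0.01,
                    std_ok=1.0, std_fault=20.0):
    """Posterior probability that a single reading is faulty.

    Assumed (hypothetical) model: a working sensor reports the expected
    temperature plus narrow Gaussian noise; a faulty sensor reports values
    from a much wider Gaussian, standing in for a generalized fault model.
    """
    like_ok = normal_pdf(observed, expected, std_ok)
    like_fault = normal_pdf(observed, expected, std_fault)
    numerator = prior_fault * like_fault
    denominator = numerator + (1.0 - prior_fault) * like_ok
    return numerator / denominator


def flag_stream(observations, expected, threshold=0.5):
    """Flag each reading whose fault posterior exceeds the threshold."""
    return [fault_posterior(o, e) > threshold
            for o, e in zip(observations, expected)]


if __name__ == "__main__":
    readings = [10.0, 10.5, 40.0]   # degrees C, made-up values
    baseline = [10.0, 10.2, 10.4]   # made-up expected temperatures
    print(flag_stream(readings, baseline))
```

In this sketch a reading close to its expected value is far more likely under the narrow "working" distribution, so its fault posterior stays small, while a reading far from the expected value is better explained by the wide fault distribution and is flagged.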
PDF Link: /papers/07/p75-dereszynski.pdf
AUTHOR = "Ethan Dereszynski and Thomas Dietterich",
TITLE = "Probabilistic Models for Anomaly Detection in Remote Sensor Data Streams",
BOOKTITLE = "Proceedings of the Twenty-Third Annual Conference on Uncertainty in Artificial Intelligence (UAI-07)",
PUBLISHER = "AUAI Press",
ADDRESS = "Corvallis, Oregon",
YEAR = "2007",
PAGES = "75--82"