Uncertainty in Artificial Intelligence
First Name   Last Name   Password   Forgot Password   Log in!
    Proceedings         Authors   Author's Info   Article details         Search    
The Anchors Hierachy: Using the triangle inequality to survive high dimensional data
Andrew Moore
This paper is about metric data structures in high-dimensional or non-Euclidean space that permit cached sufficient statistics accelerations of learning algorithms. It has recently been shown that for less than about 10 dimensions, decorating kd-trees with additional ``cached sufficient statistics'' such as first and second moments and contingency tables can provide satisfying acceleration for a very wide range of statistical learning tasks such as kernel regression, locally weighted regression, k-means clustering, mixture modeling and Bayes Net learning. In this paper, we begin by defining the anchors hierarchy---a fast data structure and algorithm for localizing data based only on a triangle-inequality-obeying distance metric. We show how this, in its own right, gives a fast and effective clustering of data. But more importantly we show how it can produce a well-balanced structure similar to a Ball-Tree (Omohundro, 1991) or a kind of metric tree (Uhlmann, 1991; Ciaccia, Patella, & Zezula, 1997) in a way that is neither ``top-down'' nor ``bottom-up'' but instead ``middle-out''. We then show how this structure, decorated with cached sufficient statistics, allows a wide variety of statistical learning algorithms to be accelerated even in thousands of dimensions.
Pages: 397-405
PS Link: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/reinforcement/papers/anchors.ps
PDF Link: /papers/00/p397-moore.pdf
AUTHOR = "Andrew Moore ",
TITLE = "The Anchors Hierachy: Using the triangle inequality to survive high dimensional data",
BOOKTITLE = "Proceedings of the Sixteenth Conference Annual Conference on Uncertainty in Artificial Intelligence (UAI-00)",
PUBLISHER = "Morgan Kaufmann",
ADDRESS = "San Francisco, CA",
YEAR = "2000",
PAGES = "397--405"

hosted by DSL   •   site info   •   help