Fast Nonparametric Conditional Density Estimation
Michael Holmes, Alexander Gray, Charles Isbell
Conditional density estimation generalizes regression by modeling a full density f(y|x) rather than only the expected value E(y|x). This is important for many tasks, including handling multi-modality and generating prediction intervals. Though fundamental and widely applicable, nonparametric conditional density estimators have received relatively little attention from statisticians and little or none from the machine learning community. None of that work has been applied to greater than bivariate data, presumably due to the computational difficulty of data-driven bandwidth selection. We describe the double kernel conditional density estimator and derive fast dual-tree-based algorithms for bandwidth selection using a maximum likelihood criterion. These techniques give speedups of up to 3.8 million in our experiments, and enable the first applications to previously intractable large multivariate datasets, including a redshift prediction problem from the Sloan Digital Sky Survey.
Pages: 175-182
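
To make the double kernel estimator and the likelihood-based bandwidth selection in the abstract concrete, here is a minimal sketch. It assumes Gaussian kernels, one scalar bandwidth for y (h_y) and one shared across all dimensions of x (h_x), and a leave-one-out log-likelihood score maximized over a small grid. The function names, the grid search, and these kernel choices are illustrative assumptions, not details taken from the paper. This is the naive O(n^2) computation that makes bandwidth selection expensive; it is not the dual-tree algorithm.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal density, evaluated elementwise."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def conditional_density(y_grid, x, X, Y, h_y, h_x):
    """Double kernel estimate of f(y | x) on a grid of y values.

    X has shape (n, d), Y has shape (n,), x has shape (d,).
    The estimate is sum_i K_hy(y - Y_i) w_i(x) / sum_i w_i(x),
    where w_i(x) is a Gaussian kernel weight on the distance from x to X_i.
    """
    # Unnormalized weights on x; the normalizing constant cancels in the ratio.
    wx = np.exp(-0.5 * np.sum((X - x) ** 2, axis=1) / h_x ** 2)
    # Kernel on y for every (grid point, sample) pair, shape (m, n).
    ky = gaussian_kernel((y_grid[:, None] - Y[None, :]) / h_y) / h_y
    return ky @ wx / np.sum(wx)


def loo_log_likelihood(X, Y, h_y, h_x):
    """Mean leave-one-out log-likelihood of the data (naive O(n^2))."""
    diff = X[:, None, :] - X[None, :, :]
    wx = np.exp(-0.5 * np.sum(diff ** 2, axis=2) / h_x ** 2)
    ky = gaussian_kernel((Y[:, None] - Y[None, :]) / h_y) / h_y
    # Zeroing the diagonal leaves point i out of its own estimate.
    np.fill_diagonal(wx, 0.0)
    num = np.sum(wx * ky, axis=1)
    den = np.sum(wx, axis=1)
    # Guard against log(0) when all weights underflow.
    tiny = np.finfo(float).tiny
    return np.mean(np.log(np.maximum(num, tiny)) - np.log(np.maximum(den, tiny)))


def select_bandwidths(X, Y, h_y_grid, h_x_grid):
    """Grid search for the (h_y, h_x) pair with the highest score."""
    best = (None, None, -np.inf)
    for h_y in h_y_grid:
        for h_x in h_x_grid:
            score = loo_log_likelihood(X, Y, h_y, h_x)
            if score > best[2]:
                best = (h_y, h_x, score)
    return best
```

Under these assumptions, calling select_bandwidths(X, Y, grid, grid) and passing the selected pair to conditional_density gives a density over y at a query point x. Each score evaluation touches all n^2 pairs of points, and it is repeated for every candidate bandwidth pair, which is the cost that tree-based methods aim to reduce.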
