Large-Sample Learning of Bayesian Networks is NP-Hard
David Chickering, Christopher Meek, David Heckerman
In this paper, we provide new complexity results for algorithms that learn discrete-variable Bayesian networks from data. Our results apply whenever the learning algorithm uses a scoring criterion that favors the simplest model able to represent the generative distribution exactly. Our results therefore hold whenever the learning algorithm uses a consistent scoring criterion and is applied to a sufficiently large dataset. We show that identifying high-scoring structures is hard, even when we are given an independence oracle, an inference oracle, and/or an information oracle. Our negative results also apply to the learning of discrete-variable Bayesian networks in which each node has at most k parents, for all k > 3.
PDF Link: /papers/03/p124-chickering.pdf
AUTHOR = "David Chickering
and Christopher Meek and David Heckerman",
TITLE = "Large-Sample Learning of Bayesian Networks is NP-Hard",
BOOKTITLE = "Proceedings of the Nineteenth Conference Annual Conference on Uncertainty in Artificial Intelligence (UAI-03)",
PUBLISHER = "Morgan Kaufmann",
ADDRESS = "San Francisco, CA",
YEAR = "2003",
PAGES = "124--133"