Uncertainty in Artificial Intelligence
Sequential Document Representations and Simplicial Curves
Guy Lebanon
Abstract:
The popular bag of words assumption represents a document as a histogram of word occurrences. While computationally efficient, such a representation is unable to maintain any sequential information. We present a continuous and differentiable sequential document representation that goes beyond the bag of words assumption, and yet is efficient and effective. This representation employs smooth curves in the multinomial simplex to account for sequential information. We discuss the representation and its geometric properties and demonstrate its applicability for the task of text classification.
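To make the contrast in the abstract concrete, here is a minimal sketch. It is not the paper's construction. It assumes a Gaussian kernel over normalized word positions, a bandwidth `sigma`, and a small additive `smoothing` constant, none of which are stated in the abstract. The function names are hypothetical. The sketch compares a single bag-of-words histogram with a sequence of locally weighted histograms, each of which is a point in the multinomial simplex. Sampling those points along the document traces out a curve.

```python
import math
from collections import Counter


def bag_of_words(tokens, vocab):
    """Global word histogram: one point in the simplex, word order is lost."""
    counts = Counter(tokens)
    total = sum(counts[w] for w in vocab)
    return [counts[w] / total for w in vocab]


def local_histogram(tokens, vocab, mu, sigma=0.2, smoothing=0.01):
    """Kernel-weighted histogram around normalized position mu in [0, 1].

    Assumption: a Gaussian kernel and additive smoothing are illustrative
    choices, not taken from the source.
    """
    n = len(tokens)
    # Map token i to the midpoint of its slot in [0, 1].
    positions = [(i + 0.5) / n for i in range(n)]
    weights = [math.exp(-0.5 * ((p - mu) / sigma) ** 2) for p in positions]
    total_weight = sum(weights)
    # Additive smoothing keeps every coordinate strictly positive.
    mass = {w: smoothing for w in vocab}
    for tok, wt in zip(tokens, weights):
        mass[tok] += wt / total_weight
    z = sum(mass.values())
    return [mass[w] / z for w in vocab]


def simplicial_curve(tokens, vocab, num_points=5, sigma=0.2):
    """Sample the local histogram at evenly spaced positions in [0, 1]."""
    mus = [k / (num_points - 1) for k in range(num_points)]
    return [local_histogram(tokens, vocab, mu, sigma) for mu in mus]


tokens = "a a a b b b".split()
vocab = sorted(set(tokens))

# Both orderings give the same bag-of-words histogram.
print(bag_of_words(tokens, vocab))
print(bag_of_words(list(reversed(tokens)), vocab))

# The sampled curve differs between the two orderings.
for point in simplicial_curve(tokens, vocab):
    print([round(x, 3) for x in point])
```

In this toy document the bag-of-words histogram is identical for the original and reversed token order, while the sampled curve moves from mostly `a` to mostly `b`, so the ordering is retained. Shrinking `sigma` makes each point more local. A very large `sigma` flattens the curve toward the single bag-of-words point, apart from the effect of the smoothing constant.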
Keywords:
Pages: 273-280
PS Link:
PDF Link: /papers/06/p273-lebanon.pdf
BibTex:
@INPROCEEDINGS{Lebanon06,
AUTHOR = "Guy Lebanon",
TITLE = "Sequential Document Representations and Simplicial Curves",
BOOKTITLE = "Proceedings of the Twenty-Second Annual Conference on Uncertainty in Artificial Intelligence (UAI-06)",
PUBLISHER = "AUAI Press",
ADDRESS = "Arlington, Virginia",
YEAR = "2006",
PAGES = "273--280"
}

