Factorized Multi-Modal Topic Model
Seppo Virtanen, Yangqing Jia, Arto Klami, Trevor Darrell
Multi-modal data collections, such as corpora of paired images and text snippets, require analysis methods beyond single-view component and topic models. For continuous observations, the currently dominant approach is based on extensions of canonical correlation analysis, factorizing the variation into components shared by the different modalities and those private to each of them. For count data, multiple variants of topic models attempting to tie the modalities together have been presented. All of these, however, lack the ability to learn components private to one modality, and consequently will try to force dependencies even between minimally correlated modalities. In this work we combine the two approaches by presenting a novel HDP-based topic model that automatically learns both shared and private topics. The model is shown to be especially useful for querying the contents of one domain given samples of the other.
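To make the shared/private distinction concrete, the following is a minimal toy sketch in Python, not the HDP-based model of the paper. It assumes a fixed, hand-written set of four topics and two modalities; the function name `sample_corpus`, the `active` mask, and all sizes and hyperparameters are hypothetical choices made for illustration only. Each document draws one set of topic proportions, and each modality renormalizes them over the topics active in that modality, so private topics generate words in one modality only.

```python
import numpy as np

def sample_corpus(n_docs=5, n_words=20, vocab_sizes=(30, 40), seed=0):
    """Toy generative sketch with shared and private topics (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Hypothetical layout: rows are topics, columns are modalities.
    # A 1 means the topic may generate words in that modality.
    active = np.array([
        [1, 1],  # shared by both modalities
        [1, 1],  # shared by both modalities
        [1, 0],  # private to modality 0
        [0, 1],  # private to modality 1
    ])
    n_topics = active.shape[0]
    # One topic-word distribution per modality and topic.
    topic_word = [rng.dirichlet(np.full(v, 0.1), size=n_topics)
                  for v in vocab_sizes]
    docs = []
    for _ in range(n_docs):
        # Document-level proportions over all topics.
        theta = rng.dirichlet(np.full(n_topics, 1.0))
        views = []
        for m, v in enumerate(vocab_sizes):
            # Keep only topics active in modality m, then renormalize.
            p = theta * active[:, m]
            p = p / p.sum()
            z = rng.choice(n_topics, size=n_words, p=p)
            w = np.array([rng.choice(v, p=topic_word[m][k]) for k in z])
            views.append((z, w))
        docs.append(views)
    return active, docs
```

In this sketch the dependence between modalities is carried only by the two shared topics, while variation explained by the private topics in one modality does not affect the other. The paper learns which topics are shared or private automatically; here the mask is fixed by hand.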
PDF Link: /papers/12/p843-virtanen.pdf
AUTHOR = "Seppo Virtanen and Yangqing Jia and Arto Klami and Trevor Darrell",
TITLE = "Factorized Multi-Modal Topic Model",
BOOKTITLE = "Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI-12)",
PUBLISHER = "AUAI Press",
ADDRESS = "Corvallis, Oregon",
YEAR = "2012",
PAGES = "843--851"