Topic Modeling Bibliography

@article{airoldi2008mixed,
  author={Edoardo M. Airoldi and David M. Blei and Stephen E. Fienberg and Eric P. Xing},
  title={Mixed Membership Stochastic Blockmodels},
  journal={JMLR},
  year={2008},
  volume={9},
  pages={1981-2014},
}
@inproceedings{alsumait2009topic,
  author={Loulwah AlSumait and Daniel Barbará and James Gentle and Carlotta Domeniconi},
  title={Topic Significance Ranking of LDA Generative Models},
  booktitle={ECML},
  year={2009},
  url={http://www.springerlink.com/content/v3jth868647716kg/},
}
@inproceedings{andrzejewski2007statistical,
  author={David Andrzejewski and Anne Mulhern and Ben Liblit and Xiaojin Zhu},
  title={Statistical Debugging using Latent Topic Models},
  booktitle={ECML},
  year={2007},
}
@inproceedings{andrzejewski2009incorporating,
  author={David Andrzejewski and Xiaojin Zhu and Mark Craven},
  title={Incorporating domain knowledge into topic modeling via Dirichlet Forest priors},
  booktitle={ICML},
  year={2009},
  pages={25-32},
}
@inproceedings{andrzejewski2011framework,
  author={David Andrzejewski and Xiaojin Zhu and Mark Craven and Ben Recht},
  title={A Framework for Incorporating General Domain Knowledge into Latent Dirichlet Allocation using First-Order Logic},
  booktitle={IJCAI},
  year={2011},
}
@inproceedings{asuncion2008distributed,
  author={Arthur Asuncion and Padhraic Smyth and Max Welling},
  title={Asynchronous Distributed Learning of Topic Models},
  booktitle={NIPS},
  year={2008},
  pages={81-88},
  url={http://www.ics.uci.edu/~asuncion/pubs/NIPS_08.pdf},
}
@inproceedings{asuncion2009smoothing,
  author={Arthur Asuncion and Max Welling and Padhraic Smyth and Yee Whye Teh},
  title={On Smoothing and Inference for Topic Models},
  booktitle={UAI},
  year={2009},
  url={http://www.ics.uci.edu/~asuncion/pubs/UAI_09.pdf},
}

A dense but excellent review of inference in topic models. Introduces CVB0, a collapsed variational inference method that is surprisingly similar to Gibbs sampling.
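
For orientation, CVB0 keeps a soft assignment gamma_{ijk} for each token i of document j and updates it in the same form as the collapsed Gibbs conditional, but with expected counts. A sketch in LaTeX, where the n-bar counts are expectations computed from the gammas with token ij excluded and V is the vocabulary size:

  \gamma_{ijk} \propto \left( \bar{n}^{\neg ij}_{jk} + \alpha \right) \frac{\bar{n}^{\neg ij}_{k w_{ij}} + \beta}{\bar{n}^{\neg ij}_{k \cdot} + V\beta}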

@inproceedings{blei2003modeling,
  author={David Blei and Michael Jordan},
  title={Modeling Annotated Data},
  booktitle={SIGIR},
  year={2003},
}

This paper introduces CorrLDA for data that consists of text and images, where image "topics" are chosen only from topics that are assigned to the text in the same document.

@misc{blei-lda-c,
  author={David M. Blei},
  title={lda-c},
  year={2003},
  url={http://www.cs.princeton.edu/~blei/lda-c/},
}

lda-c implements LDA with variational inference in C.

@article{blei2003latent,
  author={David M. Blei and Andrew Ng and Michael Jordan},
  title={Latent Dirichlet allocation},
  journal={JMLR},
  year={2003},
  volume={3},
  pages={993-1022},
}
@inproceedings{blei2003hierarchical,
  author={David M. Blei and Thomas Griffiths and Michael Jordan and Joshua Tenenbaum},
  title={Hierarchical topic models and the nested Chinese restaurant process},
  booktitle={NIPS},
  year={2003},
  url={http://books.nips.cc/papers/files/nips16/NIPS2003_AA03.pdf},
}

Introduces hLDA, which models topics in a tree. Each document is generated by topics along a single path through the tree.
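
As a toy illustration of the path-sampling step (the class and method names below are mine, not from any released code), each document walks down from the root, at each level following an existing branch with probability proportional to the number of earlier documents that took it, or opening a new branch with probability proportional to the concentration parameter gamma:

import itertools
import random

class NCRP:
    """Nested Chinese restaurant process over trees of a fixed depth."""
    def __init__(self, gamma=1.0):
        self.gamma = gamma
        self.children = {}                 # node -> {child: #docs through child}
        self.new_ids = itertools.count(1)  # node 0 is the root

    def sample_path(self, depth):
        node, path = 0, [0]
        for _ in range(depth - 1):
            kids = self.children.setdefault(node, {})
            r = random.uniform(0, sum(kids.values()) + self.gamma)
            for child, count in kids.items():
                r -= count
                if r <= 0:
                    node = child
                    break
            else:                          # landed in the gamma mass: new branch
                node = next(self.new_ids)
            kids[node] = kids.get(node, 0) + 1
            path.append(node)
        return path

Drawing many paths from one NCRP instance exhibits the preferential-attachment clustering described in the nCRP abstract below.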

@misc{blei2007nested,
  author={David M. Blei and Thomas L. Griffiths and Michael I. Jordan},
  title={The nested Chinese restaurant process and hierarchical topic models},
  year={2007},
  url={http://arxiv.org/abs/0710.0845},
}

We present the nested Chinese restaurant process (nCRP), a stochastic process which assigns probability distributions to infinitely-deep, infinitely-branching trees. We show how this stochastic process can be used as a prior distribution in a nonparametric Bayesian model of document collections. Specifically, we present an application to information retrieval in which documents are modeled as paths down a random tree, and the preferential attachment dynamics of the nCRP leads to clustering of documents according to sharing of topics at multiple levels of abstraction. Given a corpus of documents, a posterior inference algorithm finds an approximation to a posterior distribution over trees, topics and allocations of words to levels of the tree. We demonstrate this algorithm on several collections of scientific abstracts. This model exemplifies a recent trend in statistical machine learning: the use of nonparametric Bayesian methods to infer distributions on flexible data structures.

This is a longer version of the Blei et al. NIPS paper above, extending that paper's hLDA model to trees of unlimited depth.

@inproceedings{blei2006dynamic,
  author={David M. Blei and John D. Lafferty},
  title={Dynamic Topic Models},
  booktitle={ICML},
  year={2006},
  url={http://portal.acm.org/citation.cfm?id=1143859},
}
@article{blei2007correlated,
  author={David M. Blei and John D. Lafferty},
  title={A Correlated Topic Model of Science},
  journal={Annals of Applied Statistics},
  year={2007},
  volume={1},
  number={1},
  pages={17-35},
}
@inproceedings{blei2007supervised,
  author={David M. Blei and Jon D. McAuliffe},
  title={Supervised Topic Models},
  booktitle={NIPS},
  year={2007},
  url={http://books.nips.cc/papers/files/nips20/NIPS2007_0893.pdf},
}
@article{blei2011introduction,
  author={David M. Blei},
  title={Introduction to Probabilistic Topic Models},
  journal={Communications of the ACM},
  year={2011},
  url={http://www.cs.princeton.edu/~blei/papers/Blei2011.pdf},
}

A high-level overview of probabilistic topic models.

@misc{block2011cvhdp,
  author={Brad Block},
  title={Collapsed variational HDP},
  year={2011},
  url={http://www.bradblock.com/tm-0.1.tar.gz},
}

This library contains Java source and class files implementing the Latent Dirichlet Allocation (single-threaded collapsed Gibbs sampling) and Hierarchical Dirichlet Process (multi-threaded collapsed variational inference) topic models. The models can be accessed through the command line or through a simple Java API. Also included are a subset of the 20 Newsgroups dataset and results of experiments done on the dataset to confirm the correct operation and investigate some properties of the topic models. No third-party scientific libraries are required; all needed special functions are implemented and included.

@inproceedings{boydgraber2007topic,
  author={Jordan Boyd-Graber and David M. Blei and Xiaojin Zhu},
  title={A Topic Model for Word Sense Disambiguation},
  booktitle={EMNLP},
  year={2007},
}
@inproceedings{boydgraber2007turning,
  author={Jordan Boyd-Graber and David M. Blei},
  title={PUTOP: Turning Predominant Senses into a Topic Model for WSD},
  booktitle={SEMEVAL},
  year={2007},
}
@inproceedings{boydgraber2008syntactic,
  author={Jordan Boyd-Graber and David M. Blei},
  title={Syntactic Topic Models},
  booktitle={NIPS},
  year={2008},
  url={http://books.nips.cc/papers/files/nips21/NIPS2008_0319.pdf},
}
@inproceedings{boydgraber2009multilingual,
  author={Jordan Boyd-Graber and David M. Blei},
  title={Multilingual Topic Models for Unaligned Text},
  booktitle={UAI},
  year={2009},
}
@inproceedings{broniatowskimagee2010,
  author={David A. Broniatowski and Christopher L. Magee},
  title={Analysis of Social Dynamics on FDA Panels Using Social Networks Extracted From Meeting Transcripts},
  booktitle={SocCom},
  year={2010},
  url={http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5591237&tag=1},
}

Method for analyzing group decision making based on the Author-Topic Model.

@inproceedings{broniatowskimagee2011,
  author={David A. Broniatowski and Christopher L. Magee},
  title={Towards A Computational Analysis of Status and Leadership Styles on FDA Panels},
  booktitle={SBP},
  year={2011},
  url={http://www.springerlink.com/content/w655v786lp583660/},
}

Incorporates temporal information to generate directed graphs based upon topic models.

@misc{buntine-dca,
  author={Wray L. Buntine},
  title={Discrete Component Analysis},
  year={2009},
  url={http://www.nicta.com.au/people/buntinew/discrete_component_analysis},
}

C implementation of LDA and multinomial PCA.

@inproceedings{buntine2005discrete,
  author={Wray L. Buntine and Aleks Jakulin},
  title={Discrete Component Analysis},
  booktitle={SLSFS},
  year={2005},
  pages={1-33},
}
@inproceedings{buntine2009estimating,
  author={Wray L. Buntine},
  title={Estimating Likelihoods for Topic Models},
  booktitle={Asian Conference on Machine Learning},
  year={2009},
  url={http://www.nicta.com.au/__data/assets/pdf_file/0019/20746/sdca-0202.pdf},
}

Provides improved versions of some of the methods in Wallach et al. (2009) for calculating held-out probability.

@misc{buntine2014experiments,
  author={Wray L. Buntine and Swapnil Mishra},
  title={Experiments with Non-parametric Topic Models},
  year={2014},
  url={http://dl.acm.org/citation.cfm?id=2623691},
}

Non-parametric implementations of bursty models. The authors find that using fixed numbers of topics but optimizing hyperparameters provides a good approximation of a non-parametric HDP model.

@inproceedings{cai2007nus,
  author={Jun Fu Cai and Wee Sun Lee and Yee Whye Teh},
  title={NUS-ML: Improving Word Sense Disambiguation Using Topic Features},
  booktitle={SEMEVAL},
  year={2007},
}
@misc{r-lda,
  author={Jonathan Chang},
  title={R package 'lda'},
  year={2011},
  url={http://cran.r-project.org/web/packages/lda/},
}

This package implements latent Dirichlet allocation (LDA) and related models. This includes (but is not limited to) sLDA, corrLDA, and the mixed-membership stochastic blockmodel. Inference for all of these models is implemented via a fast collapsed Gibbs sampler written in C. Utility functions for reading/writing data typically used in topic models, as well as tools for examining posterior distributions, are also included.

@inproceedings{chang2009relational,
  author={Jonathan Chang and David Blei},
  title={Relational Topic Models for Document Networks},
  booktitle={AIStats},
  year={2009},
}
@inproceedings{chemudugunta2006modeling,
  author={Chaitanya Chemudugunta and Padhraic Smyth and Mark Steyvers},
  title={Modeling General and Specific Aspects of Documents with a Probabilistic Topic Model},
  booktitle={NIPS},
  year={2006},
  url={http://www.datalab.uci.edu/papers/special_words_NIPS06.pdf},
}

This paper has two interesting extensions to LDA that account for the power-law distribution of word frequencies in real documents. First, a general "background" distribution represents common words. Second, a "special words" model allows each document to have some unique words.
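
A minimal generative sketch of the per-token switch (function and variable names are mine): each token first draws a route from a document-specific distribution over {topic, background, special}, then draws the word from the corresponding distribution:

import random

def draw_token(route_probs, topic_words, background_words, special_words):
    """route_probs: probabilities of the (topic, background, special) routes;
    the remaining arguments map words to probabilities for one document."""
    dist = random.choices(
        [topic_words, background_words, special_words], weights=route_probs)[0]
    words, probs = zip(*dist.items())
    return random.choices(words, weights=probs)[0]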

@inproceedings{chen2011sampling,
  author={Changyou Chen and Lan Du and Wray Buntine},
  title={Sampling Table Configurations for the Hierarchical Poisson-Dirichlet Process},
  booktitle={ECML/PKDD},
  year={2011},
  url={http://www.nicta.com.au/pub?doc=4806},
}

A simple hierarchical Pitman-Yor LDA sampler that does not record "table" assignments. Its perplexity is sometimes far better than that of other methods.

@inproceedings{chang2009reading,
  author={Jonathan Chang and Jordan Boyd-Graber and Chong Wang and Sean Gerrish and David M. Blei},
  title={Reading Tea Leaves: How Humans Interpret Topic Models},
  booktitle={NIPS},
  year={2009},
  url={http://books.nips.cc/papers/files/nips22/NIPS2009_0125.pdf},
}
@inproceedings{das2011simultaneous,
  author={Pradipto Das and Rohini Srihari and Yun Fu},
  title={Simultaneous Joint and Conditional Modeling of Documents Tagged from Two Perspectives},
  booktitle={CIKM},
  year={2011},
  url={http://www.acsu.buffalo.edu/~pdas3/research/papers/CIKM/pdasCIKM11.pdf},
}
@inproceedings{das2015gaussian,
  author={Rajarshi Das and Manzil Zaheer and Chris Dyer},
  title={Gaussian LDA for Topic Models with Word Embeddings},
  booktitle={ACL},
  year={2015},
  url={http://rajarshd.github.io/papers/acl2015.pdf},
}
@inproceedings{daume2009markov,
  author={Hal Daumé III},
  title={Markov Random Topic Fields},
  booktitle={ACL},
  year={2009},
}
@inproceedings{dai2011grouped,
  author={Andrew M. Dai and Amos J. Storkey},
  title={The Grouped Author-Topic Model for Unsupervised Entity Resolution},
  booktitle={ICANN},
  year={2011},
}
@article{deerwester1990indexing,
  author={Scott Deerwester and Susan T. Dumais and George W. Furnas and Thomas K. Landauer and Richard Harshman},
  title={Indexing by Latent Semantic Analysis},
  journal={JASIS},
  year={1990},
  volume={41},
  number={6},
  pages={391-407},
}
@inproceedings{dietz2007unsupervised,
  author={Laura Dietz and Steffen Bickel and Tobias Scheffer},
  title={Unsupervised prediction of citation influences},
  booktitle={ICML},
  year={2007},
}
@article{ding2008equivalence,
  author={Chris Ding and Tao Li and Wei Peng},
  title={On the Equivalence between Non-negative Matrix Factorization and Probabilistic Latent Semantic Indexing},
  journal={Computational Statistics and Data Analysis},
  year={2008},
  volume={52},
  pages={3913-3927},
}
@inproceedings{doyle2009accounting,
  author={Gabriel Doyle and Charles Elkan},
  title={Accounting for Burstiness in Topic Models},
  booktitle={ICML},
  year={2009},
  url={http://www.cs.utah.edu/~hal/tmp/icml/papers/162.pdf},
}

Replaces each topic's standard multinomial distribution over words with a Dirichlet compound multinomial (DCM).
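
For reference, the DCM is a multinomial whose parameter is integrated out under a Dirichlet, which makes repeated occurrences of a word within a document more likely than first occurrences. Its standard form, for a document with n tokens and count x_w for word w:

  p(\mathbf{x} \mid \boldsymbol{\alpha}) = \frac{n!}{\prod_w x_w!} \, \frac{\Gamma\left(\sum_w \alpha_w\right)}{\Gamma\left(n + \sum_w \alpha_w\right)} \prod_w \frac{\Gamma(x_w + \alpha_w)}{\Gamma(\alpha_w)}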

@inproceedings{eisenstein2010latent,
  author={Jacob Eisenstein and Brendan O'Connor and Noah A. Smith and Eric P. Xing},
  title={A Latent Variable Model for Geographic Lexical Variation},
  booktitle={EMNLP},
  year={2010},
  url={http://www.cc.gatech.edu/~jeisenst/papers/emnlp2010.pdf},
}

The widely reported Twitter dialects paper. Topics combine a word distribution with a bivariate normal over latitude and longitude.

@inproceedings{eisenstein2011sparse,
  author={Jacob Eisenstein and Amr Ahmed and Eric P. Xing},
  title={Sparse Additive Generative Models of Text},
  booktitle={ICML},
  year={2011},
  url={http://www.cc.gatech.edu/~jeisenst/papers/icml2011.pdf},
}

Presents a new generative model of text, based on the principle of sparse deviation from a background word distribution. This approach proves effective in supervised, unsupervised, and latent variable settings.

@article{erosheva2004mixed,
  author={Elena Erosheva and Stephen Fienberg and John Lafferty},
  title={Mixed Membership Models of Scientific Publications},
  journal={PNAS},
  year={2004},
  volume={101},
  number={Suppl. 1},
  pages={5220-5227},
}
@inproceedings{foulds2013stochastic,
  author={James R. Foulds and L. Boyles and C. DuBois and Padhraic Smyth and Max Welling},
  title={Stochastic Collapsed Variational Bayesian Inference for Latent Dirichlet Allocation},
  booktitle={KDD},
  year={2013},
}
@misc{gensim,
  author={Radim Řehůřek},
  title={gensim},
  year={2009},
  url={http://nlp.fi.muni.cz/projekty/gensim/},
}

Python package for topic modeling; includes distributed and online implementations of variational LDA.
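
A minimal usage sketch on a toy corpus (the LdaModel arguments shown are standard, but details may vary across gensim versions):

from gensim import corpora, models

docs = [["human", "computer", "interaction"],
        ["graph", "trees", "network"],
        ["graph", "minors", "survey"]]
dictionary = corpora.Dictionary(docs)               # word <-> integer id mapping
corpus = [dictionary.doc2bow(doc) for doc in docs]  # sparse bag-of-words vectors
lda = models.LdaModel(corpus, id2word=dictionary, num_topics=2, passes=10)
print(lda.print_topics())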

@inproceedings{gerrish2010language,
  author={Sean Gerrish and David M. Blei},
  title={A language-based approach to measuring scholarly impact},
  booktitle={ICML},
  year={2010},
  url={http://www.cs.princeton.edu/~blei/papers/GerrishBlei2010.pdf},
}
@inproceedings{girolami2003on,
  author={Mark Girolami and Ata Kabán},
  title={On an equivalence between pLSI and LDA},
  booktitle={SIGIR},
  year={2003},
  pages={433-434},
}
@inproceedings{gohr2010visually,
  author={Andre Gohr and Myra Spiliopoulou and Alexander Hinneburg},
  title={Visually Summarizing the Evolution of Documents under a Social Tag},
  booktitle={KDIR},
  year={2010},
  url={http://users.informatik.uni-halle.de/~hinnebur/PS_Files/kdir2010_TT.pdf},
}
@inproceedings{gohr2009,
  author={Andre Gohr and Alexander Hinneburg and Rene Schult and Myra Spiliopoulou},
  title={Topic Evolution in a Stream of Documents},
  booktitle={SDM},
  year={2009},
  pages={859-870},
  url={http://users.informatik.uni-halle.de/~hinnebur/PS_Files/sdm09_APLSA.pdf},
}
@article{griffiths04finding,
  author={Thomas L. Griffiths and Mark Steyvers},
  title={Finding Scientific Topics},
  journal={PNAS},
  year={2004},
  volume={101},
  number={Suppl. 1},
  pages={5228-5235},
}
@incollection{griffiths2004integrating,
  author={Thomas L. Griffiths and Mark Steyvers and David M. Blei and Joshua B. Tenenbaum},
  title={Integrating Topics and Syntax},
  booktitle={NIPS},
  year={2004},
  pages={537-544},
  url={http://books.nips.cc/papers/files/nips17/NIPS2004_0642.pdf},
}
@inproceedings{hall2008studying,
  author={David Hall and Daniel Jurafsky and Christopher D. Manning},
  title={Studying the History of Ideas Using Topic Models},
  booktitle={EMNLP},
  year={2008},
  pages={363-371},
}
@techreport{heinrich2004parameter,
  author={Gregor Heinrich},
  title={Parameter Estimation for Text Analysis},
  year={2004},
  url={http://www.arbylon.net/publications/text-est.pdf},
}

Presents parameter estimation methods common with discrete probability distributions, which is of particular interest in text modeling. Starting with maximum likelihood, a posteriori and Bayesian estimation, central concepts like conjugate distributions and Bayesian networks are reviewed. As an application, the model of latent Dirichlet allocation (LDA) is explained in detail with a full derivation of an approximate inference algorithm based on Gibbs sampling, including a discussion of Dirichlet hyperparameter estimation.
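
The heart of that derivation is the collapsed Gibbs conditional for the topic assignment of token i, a word w in document d, where the counts n exclude token i and V is the vocabulary size:

  p(z_i = k \mid \mathbf{z}_{\neg i}, \mathbf{w}) \propto \left( n^{\neg i}_{dk} + \alpha \right) \frac{n^{\neg i}_{kw} + \beta}{n^{\neg i}_{k \cdot} + V\beta}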

@inproceedings{heinrich2009generic,
  author={Gregor Heinrich},
  title={A generic approach to topic models},
  booktitle={ECML/PKDD},
  year={2009},
  url={http://arbylon.net/publications/mixnet-gibbs.pdf},
}
@misc{heinrich2011infinite,
  author={Gregor Heinrich},
  title={Infinite LDA},
  year={2011},
  url={http://arbylon.net/projects/knowceans-ilda/knowceans-ilda.zip},
}

A simple implementation of a non-parametric model, where the number of topics is not fixed in advance. Uses Teh's direct assignment method for HDP.

@inproceedings{hinneburg2007bayesian,
  author={Alexander Hinneburg and Hans-Henning Gabriel and Andre Gohr},
  title={Bayesian Folding-In with Dirichlet Kernels for PLSI},
  booktitle={ICDM},
  year={2007},
  pages={499-504},
  url={http://users.informatik.uni-halle.de/~hinnebur/PS_Files/blsi_icdm07.pdf},
}
@inproceedings{hofmann1999plsa,
  author={Thomas Hofmann},
  title={Probabilistic latent semantic analysis},
  booktitle={UAI},
  year={1999},
}
@inproceedings{hoffman2010online,
  author={Matthew Hoffman and David M. Blei and Francis Bach},
  title={Online Learning for Latent Dirichlet Allocation},
  booktitle={NIPS},
  year={2010},
}
@inproceedings{jagarlamudi2010extracting,
  author={Jagadeesh Jagarlamudi and Hal Daumé III},
  title={Extracting Multilingual Topics from Unaligned Comparable Corpora},
  booktitle={ECIR},
  year={2010},
  url={http://dx.doi.org/10.1007/978-3-642-12275-0_39},
  pages={444-456},
}
@inproceedings{johnson2010pcfgs,
  author={Mark Johnson},
  title={PCFGs, Topic Models, Adaptor Grammars, and Learning Topical Collocations and the Structure of Proper Names},
  booktitle={ACL},
  year={2010},
}
@inproceedings{kivinen2007learning,
  author={Jyri J. Kivinen and Erik B. Sudderth and Michael I. Jordan},
  title={Learning Multiscale Representations of Natural Scenes Using Dirichlet Processes},
  booktitle={ICCV},
  year={2007},
  url={http://www.cs.berkeley.edu/~jordan/papers/kivinen-sudderth-jordan-iccv07.pdf},
}

We develop nonparametric Bayesian models for multiscale representations of images depicting natural scene categories. Individual features or wavelet coefficients are marginally described by Dirichlet process (DP) mixtures, yielding the heavy-tailed marginal distributions characteristic of natural images. Dependencies between features are then captured with a hidden Markov tree, and Markov chain Monte Carlo methods used to learn models whose latent state space grows in complexity as more images are observed. By truncating the potentially infinite set of hidden states, we are able to exploit efficient belief propagation methods when learning these hierarchical Dirichlet process hidden Markov trees (HDP-HMTs) from data. We show that our generative models capture interesting qualitative structure in natural scenes, and more accurately categorize novel images than models which ignore spatial relationships among features.

The paper introduces a blocked Gibbs sampler for learning a nonparametric Bayesian topic model whose topic assignments are coupled with a tree-structured graphical model.

@inproceedings{lacoste2008disclda,
  author={Simon Lacoste-Julien and Fei Sha and Michael I. Jordan},
  title={DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification},
  booktitle={NIPS},
  year={2008},
  url={http://books.nips.cc/papers/files/nips21/NIPS2008_0993.pdf},
}
@article{landauer1997solutions,
  author={Thomas K. Landauer and Susan T. Dumais},
  title={Solutions to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge},
  journal={Psychological Review},
  year={1997},
  number={104},
}
@misc{vowpalwabbit,
  author={John Langford},
  title={Vowpal Wabbit},
  year={2011},
  url={https://github.com/JohnLangford/vowpal_wabbit/wiki},
}

VW includes an implementation of Hoffman et al.'s online variational LDA.

@techreport{li2007nonparametric,
  author={Wei Li and David Blei and Andrew McCallum},
  title={Nonparametric Bayes Pachinko Allocation},
  year={2007},
}
@inproceedings{lin2008joint,
  author={Wei-Hao Lin and Eric P. Xing and Alexander Hauptmann},
  title={A Joint Topic and Perspective Model for Ideological Discourse},
  booktitle={ECML/PKDD},
  year={2008},
  pages={17-32},
  url={http://portal.acm.org/citation.cfm?id=1431999.1432002},
}
@inproceedings{madsen2005modeling,
  author={Rasmus Madsen and David Kauchak and Charles Elkan},
  title={Modeling Word Burstiness Using the Dirichlet Distribution},
  booktitle={ICML},
  year={2005},
}
@misc{mallet,
  author={Andrew Kachites McCallum},
  title={MALLET: A Machine Learning for Language Toolkit},
  year={2002},
  url={http://mallet.cs.umass.edu},
}

Implements Gibbs sampling for LDA in Java using fast sampling methods from Yao et al. MALLET also includes support for data preprocessing, classification, and sequence tagging.

@inproceedings{mccallum2005topic,
  author={Andrew McCallum and Andrés Corrada-Emmanuel and Xuerui Wang},
  title={Topic and Role Discovery in Social Networks},
  booktitle={IJCAI},
  year={2005},
}
@inproceedings{mehrotra2013improving,
  author={Rishabh Mehrotra and Scott Sanner and Wray Buntine and Lexing Xie},
  title={Improving LDA Topic Models for Microblogs via Tweet Pooling and Automatic Labeling},
  booktitle={SIGIR},
  year={2013},
}

Merging tweets based on hashtags and imputed hashtags improves topic modeling.

@inproceedings{mei2007topic,
  author={Qiaozhu Mei and Xu Ling and Matthew Wondra and Hang Su and ChengXiang Zhai},
  title={Topic sentiment mixture: modeling facets and opinions in weblogs},
  booktitle={WWW},
  year={2007},
}
@inproceedings{mei2007automatic,
  author={Qiaozhu Mei and Xuehua Shen and ChengXiang Zhai},
  title={Automatic labeling of multinomial topic models},
  booktitle={KDD},
  year={2007},
  pages={490-499},
}
@inproceedings{mei2008topic,
  author={Qiaozhu Mei and Deng Cai and Duo Zhang and ChengXiang Zhai},
  title={Topic modeling with network regularization},
  booktitle={WWW},
  year={2008},
  url={http://portal.acm.org/citation.cfm?id=1367512},
}

In this paper, we formally define the problem of topic modeling with network structure (TMN). We propose a novel solution to this problem, which regularizes a statistical topic model with a harmonic regularizer based on a graph structure in the data. The proposed method bridges topic modeling and social network analysis, which leverages the power of both statistical topic models and discrete regularization. The output of this model well summarizes topics in text, maps a topic on the network, and discovers topical communities. With concrete selection of a topic model and a graph-based regularizer, our model can be applied to text mining problems such as author-topic analysis, community discovery, and spatial text mining. Empirical experiments on two different genres of data show that our approach is effective, which improves text-oriented methods as well as network-oriented methods. The proposed model is general; it can be applied to any text collections with a mixture of topics and an associated network structure.

@inproceedings{mimno2007expertise,
  author={David Mimno and Andrew McCallum},
  title={Expertise Modeling for Matching Papers with Reviewers},
  booktitle={KDD},
  year={2007},
}
@inproceedings{mimno2007mining,
  author={David Mimno and Andrew McCallum},
  title={Mining a digital library for influential authors},
  booktitle={JCDL},
  year={2007},
}
@inproceedings{mimno2007hierarchical,
  author={David Mimno and Wei Li and Andrew McCallum},
  title={Mixtures of Hierarchical Topics with Pachinko Allocation},
  booktitle={ICML},
  year={2007},
}
@inproceedings{mimno2008dmr,
  author={David Mimno and Andrew McCallum},
  title={Topic models conditioned on arbitrary features with Dirichlet-multinomial regression},
  booktitle={UAI},
  year={2008},
  url={http://www.cs.umass.edu/~mimno/papers/dmr-uai.pdf},
}

Per-document Dirichlet priors over topic distributions are generated using a log-linear combination of observed document features and learned feature-topic parameters. Implemented in MALLET.
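
Concretely, the Dirichlet parameter for topic k in a document with observed feature vector x_d is log-linear in the features:

  \alpha_{dk} = \exp\left( \mathbf{x}_d^{\top} \boldsymbol{\lambda}_k \right)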

@inproceedings{mimno2008gibbs,
  author={David Mimno and Hanna Wallach and Andrew McCallum},
  title={Gibbs Sampling for Logistic Normal Topic Models with Graph-Based Priors},
  booktitle={NIPS Workshop on Analyzing Graphs},
  year={2008},
  url={http://www.cs.umass.edu/~mimno/papers/sampledlgstnorm.pdf},
}

Introduces an auxiliary-variable method for Gibbs sampling in non-conjugate topic models.

@inproceedings{mimno2009polylingual,
  author={David Mimno and Hanna Wallach and Jason Naradowsky and David A. Smith and Andrew McCallum},
  title={Polylingual Topic Models},
  booktitle={EMNLP},
  year={2009},
  url={http://www.cs.umass.edu/~mimno/papers/mimno2009polylingual.pdf},
}
@inproceedings{mimno2011reconstructing,
  author={David Mimno},
  title={Reconstructing Pompeian Households},
  booktitle={UAI},
  year={2011},
  url={http://www.cs.princeton.edu/~mimno/papers/pompeii.pdf},
}
@inproceedings{mimno2011optimizing,
  author={David Mimno and Hanna Wallach and Edmund Talley and Miriam Leenders and Andrew McCallum},
  title={Optimizing Semantic Coherence in Topic Models},
  booktitle={EMNLP},
  year={2011},
  url={http://www.cs.princeton.edu/~mimno/papers/mimno-semantic-emnlp.pdf},
}

A simple, automated metric that uses only information contained in the training documents is a strong predictor of human judgments of topic coherence.
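
A short Python sketch of the metric (variable names are mine): for a topic's top M words, sum the log of each pair's smoothed co-document frequency, normalized by the document frequency of the higher-ranked word:

import math

def coherence(top_words, doc_sets):
    """top_words: a topic's top M words, most probable first.
    doc_sets: word -> set of ids of the training documents containing it;
    every top word is assumed to appear in at least one document."""
    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            co_docs = len(doc_sets[top_words[m]] & doc_sets[top_words[l]])
            score += math.log((co_docs + 1) / len(doc_sets[top_words[l]]))
    return score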

@inproceedings{mimno2011bayesian,
  author={David Mimno and David Blei},
  title={Bayesian Checking for Topic Models},
  booktitle={EMNLP},
  year={2011},
  url={http://www.cs.princeton.edu/~mimno/papers/mimno-ppcs-emnlp.pdf},
}

Posterior predictive checks are useful in detecting lack of fit in topic models and identifying which metadata-enriched models might be useful.

@inproceedings{mukherjee2008relative,
  author={Indraneel Mukherjee and David Blei},
  title={Relative Performance Guarantees for Approximate Inference in Latent Dirichlet Allocation},
  booktitle={NIPS},
  year={2008},
  url={http://books.nips.cc/papers/files/nips21/NIPS2008_0434.pdf},
}
@inproceedings{musat2011improving,
  author={Claudiu Musat and Julien Velcin and Stefan Trausan-Matu and Marian-Andrei Rizoiu},
  title={Improving Topic Evaluation Using Conceptual Knowledge},
  booktitle={IJCAI},
  year={2011},
}
@inproceedings{nallapati2008joint,
  author={Ramesh Nallapati and Amr Ahmed and Eric P. Xing and William Cohen},
  title={Joint Latent Topic Models for Text and Citations},
  booktitle={KDD},
  year={2008},
  url={http://portal.acm.org/citation.cfm?id=1401957},
  pages={542-550},
}

This is one of the first papers to address joint topic models of text and hyperlinks. Used as a baseline in the more recent Relational Topic Models. (R.N.)

@inproceedings{nallapati2007multiscale,
  author={Ramesh Nallapati and William Cohen and Susan Ditmore and John Lafferty and Kin Ung},
  title={Multi-scale Topic Tomography},
  booktitle={KDD},
  year={2007},
  url={http://portal.acm.org/citation.cfm?id=1281249},
  pages={520-529},
}

Models variation of topic content with time at various scales of resolution. A novel variant of dynamic topic models that uses the Poisson distribution for word generation and a wavelet decomposition to capture multiple time scales. (R.N.)

@inproceedings{nallapati2007parallelized,
  author={Ramesh Nallapati and William Cohen and John Lafferty},
  title={Parallelized Variational EM for Latent Dirichlet Allocation: An experimental evaluation of speed and scalability},
  booktitle={ICDM workshop on high performance data mining},
  year={2007},
  url={http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.68.4178&rep=rep1&type=pdf},
}

Early paper on parallel implementations of variational EM for LDA. (R.N.)

@misc{nallapati2010multi-lda-c,
  author={Ramesh Nallapati},
  title={multithreaded lda-c},
  year={2010},
  url={https://sites.google.com/site/rameshnallapati/software},
}

Multithreaded extension of David Blei's LDA implementation in C. Speeds up the computation by orders of magnitude, depending on the number of processors.

@inproceedings{newman2006statistical,
  author={David Newman and Chaitanya Chemudugunta and Padhraic Smyth},
  title={Statistical entity-topic models},
  booktitle={KDD},
  year={2006},
}
@article{newman2005probabilistic,
  author={David Newman and Sharon Block},
  title={Probabilistic Topic Decomposition of an Eighteenth-Century American Newspaper},
  journal={JASIST},
  year={2006},
}
@inproceedings{newman2010automatic,
  author={David Newman and Jey Han Lau and Karl Grieser and Timothy Baldwin},
  title={Automatic Evaluation of Topic Coherence},
  booktitle={NAACL},
  year={2010},
}
@inproceedings{ni2009multilingual,
  author={Xiaochuan Ni and Jian-Tao Sun and Jian Hu and Zheng Chen},
  title={Mining Multilingual Topics from Wikipedia},
  booktitle={WWW},
  year={2009},
  url={http://www2009.eprints.org/158/},
}

In this paper, we try to leverage a large-scale and multilingual knowledge base, Wikipedia, to help effectively analyze and organize Web information written in different languages. Based on the observation that one Wikipedia concept may be described by articles in different languages, we adapt an existing topic modeling algorithm for mining multilingual topics from this knowledge base. The extracted "universal" topics have multiple types of representations, with each type corresponding to one language. Accordingly, new documents of different languages can be represented in a space using a group of universal topics, which makes various multilingual Web applications feasible.

@misc{gibbslda++,
  author={Xuan-Hieu Phan and Cam-Tu Nguyen},
  title={GibbsLDA++},
  year={2007},
  url={http://gibbslda.sourceforge.net},
}

C/C++ implementation of LDA with Gibbs sampling.

@inproceedings{perkio2004exploring,
  author={Jukka Perkiö and Wray L. Buntine and Sami Perttu},
  title={Exploring Independent Trends in a Topic-Based Search Engine},
  booktitle={Web Intelligence},
  year={2004},
  pages={664-668},
}
@inproceedings{purver2006unsupervised,
  author={Matthew Purver and Konrad Körding and Thomas L. Griffiths and Joshua Tenenbaum},
  title={Unsupervised Topic Modelling for Multi-Party Spoken Discourse},
  booktitle={ACL},
  year={2006},
  url={http://web.mit.edu/cocosci/Papers/purver-et-al06acl.pdf},
}
@misc{ramage-tmt,
  author={Daniel Ramage and Evan Rosen},
  title={Stanford Topic Modeling Toolbox},
  year={2009},
  url={http://nlp.stanford.edu/software/tmt/tmt-0.3/},
}

Scala implementation of LDA and LabeledLDA. Input and output integration with Excel.

@inproceedings{ramage2009labeled,
  author={Daniel Ramage and David Hall and Ramesh Nallapati and Christopher D. Manning},
  title={Labeled LDA: A Supervised Topic Model for Credit Attribution in Multi-Labeled Corpora},
  booktitle={EMNLP},
  year={2009},
}
@inproceedings{ramage2010characterizing,
  author={Daniel Ramage and Susan Dumais and Dan Liebling},
  title={Characterizing Microblogs with Topic Models},
  booktitle={ICWSM},
  year={2010},
  url={http://www.stanford.edu/~dramage/papers/twitter-icwsm10.pdf},
}
@inproceedings{reisinger2010spherical,
  author={Joseph Reisinger and Austin Waters and Brian Silverthorn and Raymond J. Mooney},
  title={Spherical Topic Models},
  booktitle={ICML},
  year={2010},
  url={http://www.cs.utexas.edu/users/ml/papers/reisinger.icml10.pdf},
}

We introduce the Spherical Admixture Model (SAM), a Bayesian topic model for arbitrary L2 normalized data. SAM maintains the same hierarchical structure as Latent Dirichlet Allocation (LDA), but models documents as points on a high-dimensional spherical manifold, allowing a natural likelihood parameterization in terms of cosine distance. Furthermore, SAM can model word absence/presence at the document level, and unlike previous models can assign explicit negative weight to topic terms. Performance is evaluated empirically, both through human ratings of topic quality and through diverse classification tasks from natural language processing and computer vision. In these experiments, SAM consistently outperforms existing models.

@inproceedings{rosenzvi2004author,
  author={Michal Rosen-Zvi and Tom Griffiths and Mark Steyvers and Padhraic Smyth},
  title={The Author-Topic Model for Authors and Documents},
  booktitle={UAI},
  year={2004},
}
@inproceedings{salakhutdinov2009replicated,
  author={Ruslan Salakhutdinov and Geoffrey Hinton},
  title={Replicated Softmax: an Undirected Topic Model},
  booktitle={NIPS},
  year={2009},
  url={http://books.nips.cc/papers/files/nips22/NIPS2009_0817.pdf},
}
@inproceedings{sievert2014ldavis,
  author={Carson Sievert and Kenneth E. Shirley},
  title={LDAvis: A method for visualizing and interpreting topics},
  booktitle={Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces},
  year={2014},
  url={https://github.com/cpsievert/LDAvis},
}
@misc{YahooLDA,
  author={Shravan Narayanamurthy},
  title={Yahoo! LDA},
  year={2011},
  url={https://github.com/shravanmn/Yahoo_LDA/wiki},
}

Y!LDA implements a fast, sampling-based, distributed algorithm. See Smola and Narayanamurthy (2010) for details.

@inproceedings{smola2010architecture,
  author={Alexander Smola and Shravan Narayanamurthy},
  title={An Architecture for Parallel Topic Models},
  booktitle={VLDB},
  year={2010},
}
@misc{steyvers-tmtb,
  author={Mark Steyvers and Tom Griffiths},
  title={Matlab Topic Modeling Toolbox},
  year={2005},
  url={http://psiexp.ss.uci.edu/research/programs_data/toolbox.htm},
}

Implements LDA, Author-Topic, HMM-LDA, LDA-COL. Tools for 2D visualization.

@incollection{steyvers2006probabilistic,
  author={Mark Steyvers and Tom Griffiths},
  editor={Landauer, T. and McNamara, D. and Dennis, S. and Kintsch, W.},
  title={Probabilistic Topic Models},
  booktitle={Latent Semantic Analysis: A Road to Meaning.},
  year={2006},
  publisher={Lawrence Erlbaum},
  url={http://cocosci.berkeley.edu/tom/papers/SteyversGriffiths.pdf},
}

A good introduction to topic modeling.

@inproceedings{taranto2011rslda,
  author={Claudio Taranto and Nicola Di Mauro and Floriana Esposito},
  title={rsLDA: a Bayesian Hierarchical Model for Relational Learning},
  booktitle={ICDKE},
  year={2011},
  url={http://www.di.uniba.it/~ndm/publications/files/taranto11icdke.pdf},
}
@inproceedings{teh2006collapsed,
  author={Yee Whye Teh and David Newman and Max Welling},
  title={A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation},
  booktitle={NIPS},
  year={2006},
  url={http://books.nips.cc/papers/files/nips19/NIPS2006_0511.pdf},
}
@article{teh2006hierarchical,
  author={Yee Whye Teh and Michael I. Jordan and Matthew J. Beal and David M. Blei},
  title={Hierarchical Dirichlet Processes},
  journal={JASA},
  year={2006},
  url={http://dx.doi.org/10.1198/016214506000000302},
  volume={101},
}
@inproceedings{toutanova2007bayesian,
  author={Kristina Toutanova and Mark Johnson},
  title={A Bayesian LDA-based model for semi-supervised part-of-speech tagging},
  booktitle={NIPS},
  year={2007},
  pages={1521-1528},
  url={http://books.nips.cc/papers/files/nips20/NIPS2007_0964.pdf},
}
@inproceedings{wallach2006beyond,
  author={Hanna M. Wallach},
  title={Topic modeling: beyond bag-of-words},
  booktitle={ICML},
  year={2006},
}
@inproceedings{wallach2009evaluation,
  author={Hanna Wallach and Iain Murray and Ruslan Salakhutdinov and David Mimno},
  title={Evaluation Methods for Topic Models},
  booktitle={ICML},
  year={2009},
  url={http://www.cs.umass.edu/~mimno/papers/wallach09evaluation.pdf},
}

Commonly used methods for estimating the probability of held-out words may be unstable. This paper presents more accurate methods.

@inproceedings{wallach2009rethinking,
  author={Hanna Wallach and David Mimno and Andrew McCallum},
  title={Rethinking LDA: Why priors matter},
  booktitle={NIPS},
  year={2009},
  url={http://www.cs.umass.edu/~mimno/papers/NIPS2009_0929.pdf},
}

The use of an asymmetric Dirichlet prior on per-document topic distributions reduces sensitivity to very common words (e.g., stopwords and near-stopwords) and makes topic assignments more stable as the number of topics grows.

@inproceedings{wang2009multiscale,
  author={Chang Wang and Sridhar Mahadevan},
  title={Multiscale Analysis of Document Corpora Based on Diffusion Models},
  booktitle={IJCAI},
  year={2009},
  url={http://www.cs.umass.edu/~chwang/papers/IJCAI-2009-TD.pdf},
}
@inproceedings{wang2011relation,
  author={Chang Wang and James Fan and Aditya Kalyanpur and David Gondek},
  title={Relation Extraction with Relation Topics},
  booktitle={EMNLP},
  year={2011},
  url={http://www-all.cs.umass.edu/~chwang/EMNLP-2011.pdf},
}
@inproceedings{wang2005group,
  author={Xuerui Wang and Natasha Mohanty and Andrew McCallum},
  title={Group and Topic Discovery from Relations and Their Attributes},
  booktitle={NIPS},
  year={2005},
  url={http://books.nips.cc/papers/files/nips18/NIPS2005_0819.pdf},
}
@inproceedings{wang2006topics,
  author={Xuerui Wang and Andrew McCallum},
  title={Topics Over Time: a non-Markov continuous-time model of topical trends},
  booktitle={KDD},
  year={2006},
}
@inproceedings{wang2008continuous,
  author={Chong Wang and David M. Blei and David Heckerman},
  title={Continuous Time Dynamic Topic Models},
  booktitle={UAI},
  year={2008},
  url={http://uai2008.cs.helsinki.fi/UAI_camera_ready/wang.pdf},
}

In this paper, we develop the continuous time dynamic topic model (cDTM). The cDTM is a dynamic topic model that uses Brownian motion to model the latent topics through a sequential collection of documents, where a "topic" is a pattern of word use that we expect to evolve over the course of the collection. We derive an efficient variational approximate inference algorithm that takes advantage of the sparsity of observations in text, a property that lets us easily handle many time points. In contrast to the cDTM, the original discrete-time dynamic topic model (dDTM) requires that time be discretized. Moreover, the complexity of variational inference for the dDTM grows quickly as time granularity increases, a drawback which limits fine-grained discretization. We demonstrate the cDTM on two news corpora, reporting both predictive perplexity and the novel task of time stamp prediction.

@inproceedings{wang2009simulataneous,
  author={Chong Wang and David Blei and Fei-Fei Li},
  title={Simultaneous Image Classification and Annotation},
  booktitle={CVPR},
  year={2009},
}
@misc{wang2011gritty,
  author={Yi Wang},
  title={Distributed Gibbs Sampling of Latent Dirichlet Allocation: The Gritty Details},
  year={2011},
  url={http://dbgroup.cs.tsinghua.edu.cn/wangyi/lda/lda.pdf},
}

A thorough introduction for those wanting to understand the mathematical basics of topic models.

@inproceedings{wei2006pachinko,
  author={Wei Li and Andrew McCallum},
  title={Pachinko allocation: DAG-structured mixture models of topic correlations},
  booktitle={ICML},
  year={2006},
}
@inproceedings{wei2006lda,
  author={Xing Wei and Bruce Croft},
  title={LDA-based document models for ad-hoc retrieval},
  booktitle={SIGIR},
  year={2006},
}
@inproceedings{yan2009parallel,
  author={Feng Yan and Ningyi Xu and Yuan Qi},
  title={Parallel Inference for Latent Dirichlet Allocation on Graphics Processing Units},
  booktitle={NIPS},
  year={2009},
  url={http://books.nips.cc/papers/files/nips22/NIPS2009_0546.pdf},
}

In addition to dividing the corpus between processors, this work divides the vocabulary into the same number of partitions, such that each processor works on both its own documents and its own words at each epoch. This increases the number of epochs, but drastically reduces the possibility of incorrect samples.
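
A sketch of the resulting schedule (illustrative, not the authors' code): with P processors, documents and vocabulary are each split into P blocks, and during sub-epoch e processor p samples only tokens in document block p whose word types lie in block (p + e) mod P, so no two processors ever update the same document or word counts at once:

def orthogonal_schedule(P):
    """Yield, for each sub-epoch, every processor's (doc_block, word_block) pair."""
    for e in range(P):
        yield [(p, (p + e) % P) for p in range(P)]

for assignments in orthogonal_schedule(3):
    print(assignments)  # [(0, 0), (1, 1), (2, 2)], then [(0, 1), (1, 2), (2, 0)], ...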

@inproceedings{yang2011bridging,
  author={Shuang-Hong Yang and Steven P. Crain and Hongyuan Zha},
  title={Bridging the language gap: topic adaptation for documents with different technicality},
  booktitle={AIStats},
  year={2011},
  url={http://jmlr.csail.mit.edu/proceedings/papers/v15/yang11b/yang11b.pdf},
}
@inproceedings{yao2009efficient,
  author={Limin Yao and David Mimno and Andrew McCallum},
  title={Efficient Methods for Topic Model Inference on Streaming Document Collections},
  booktitle={KDD},
  year={2009},
  url={http://www.cs.umass.edu/~mimno/papers/fast-topic-model.pdf},
}

Explores methods for inferring topic distributions for new documents given a trained model. This paper includes the SparseLDA algorithm and data structure, which can dramatically improve time and memory performance in Gibbs sampling.
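
The SparseLDA trick is to split the unnormalized sampling mass for word w in document d into three buckets: a smoothing term that can be cached globally, a document term over the few topics with n_{dk} > 0, and a word term over the few topics with n_{kw} > 0:

  p(z = k) \propto \frac{\alpha_k \beta}{n_k + V\beta} + \frac{n_{dk} \beta}{n_k + V\beta} + \frac{(\alpha_k + n_{dk})\, n_{kw}}{n_k + V\beta}

Only the last two buckets need per-token work, which is cheap when the count vectors are sparse.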

@inproceedings{zhang2010evolutionary,
  author={Jianwen Zhang and Yangqiu Song and Changshui Zhang and Shixia Liu},
  title={Evolutionary Hierarchical Dirichlet Processes for Multiple Correlated Time-varying Corpora},
  booktitle={KDD},
  year={2010},
  url={http://research.microsoft.com/en-us/um/people/shliu/p1079-zhang.pdf},
}
@inproceedings{zhao2006bitam,
  author={Bing Zhao and Eric P. Xing},
  title={BiTAM: Bilingual Topic AdMixture Models for Word Alignment},
  booktitle={ACL},
  year={2006},
  url={http://www.aclweb.org/anthology/P/P06/P06-2124},
}
@inproceedings{zhao2006hmbitam,
  author={Bing Zhao and Eric P. Xing},
  title={HM-BiTAM: Bilingual Topic Exploration, Word Alignment, and Translation},
  booktitle={NIPS},
  year={2007},
  url={http://books.nips.cc/papers/files/nips20/NIPS2007_0188.pdf},
}
@inproceedings{zhu2009medlda,
  author={Jun Zhu and Amr Ahmed and Eric P. Xing},
  title={MedLDA: Maximum Margin Supervised Topic Models for Regression and Classification},
  booktitle={ICML},
  year={2009},
}
@inproceedings{zhu2010conditional,
  author={Jun Zhu and Eric P. Xing},
  title={Conditional Topic Random Fields},
  booktitle={ICML},
  year={2010},
}
@techreport{zhu2006taglda,
  author={Xiaojin Zhu and David M. Blei and John Lafferty},
  title={TagLDA: Bringing document structure knowledge into topic models},
  year={2006},
  institution={University of Wisconsin, Madison},
  number={TR-1553},
}

Latent Dirichlet Allocation models a document by a mixture of topics, where each topic itself is typically modeled by a unigram word distribution. Documents however often have known structures, and the same topic can exhibit different word distributions under different parts of the structure. We extend latent Dirichlet allocation model by replacing the unigram word distributions with a factored representation conditioned on both the topic and the structure. In the resultant model each topic is equivalent to a set of unigrams, reflecting the structure a word is in. The proposed model is more flexible in modeling the corpus. The factored representation prevents combinatorial explosion and leads to efficient parameterization. We derive the variational optimization algorithm for the new model. The model shows improved perplexity on text and image data, but no significant accuracy improvement when used for classification.