Learning Bayesian Models with R

By: Hari Manassery Koduvely

Topic modeling using Bayesian inference


In Chapter 6, Bayesian Classification Models, we saw the supervised learning (classification) of text documents using the Naïve Bayes model. Often, a large text document, such as a news article or a short story, covers several topics in different subsections. Modeling such intra-document statistical correlations is useful for classification, summarization, compression, and so on. The Gaussian mixture model learned in the previous section is better suited to numerical data, such as images, than to documents, because words in documents seldom follow a normal distribution. A more appropriate choice is the multinomial distribution.
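
To make the multinomial choice concrete, a document can be reduced to a vector of word counts and assigned a probability under a multinomial distribution. The following is a minimal sketch rather than anything taken from the book; the symbols V (vocabulary size), n_w (count of word w in the document), N (document length), and \theta_w (probability of word w) are notation assumed here purely for illustration.

```latex
% Assumed notation (not from the source):
%   V        = vocabulary size
%   n_w      = number of times word w occurs in the document
%   N        = \sum_w n_w, the document length
%   \theta_w = probability of word w
P(n_1, \dots, n_V \mid \theta)
  = \frac{N!}{\prod_{w=1}^{V} n_w!} \prod_{w=1}^{V} \theta_w^{\,n_w},
\qquad \sum_{w=1}^{V} \theta_w = 1 .

% Worked example: V = 2, counts (n_1, n_2) = (2, 1), \theta = (0.5, 0.5)
P(2, 1 \mid \theta)
  = \frac{3!}{2!\,1!} \, (0.5)^{2} (0.5)^{1}
  = 3 \times 0.125
  = 0.375 .
```

Unlike a Gaussian, this distribution is defined over non-negative integer counts that sum to the document length, which is why it is a natural fit for word-count data.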

Two powerful extensions of mixture models to documents are T. Hofmann's work on Probabilistic Latent Semantic Indexing (reference 6 in the References section of this chapter) and the work of David Blei et al. on Latent Dirichlet Allocation (reference 7 in the References section of this chapter). In...