Mastering Predictive Analytics with R

By: Rui Miguel Forte

Latent Dirichlet Allocation


Latent Dirichlet Allocation (LDA) is the prototypical method for topic modeling. Unfortunately, the acronym LDA is also used for another machine learning method, Linear Discriminant Analysis. That method is entirely different from Latent Dirichlet Allocation and is commonly used for dimensionality reduction and classification. Throughout this book, we will use LDA to refer to Latent Dirichlet Allocation.

Although LDA involves a substantial amount of mathematics, it is worth exploring some of its technical details in order to understand how the model works and the assumptions it makes. First and foremost, we should learn about the Dirichlet distribution, which lends its name to LDA.

Note

An excellent reference for a fuller treatment of Topic Models with LDA is the chapter Topic Models in the book Text Mining: Classification, Clustering, and Applications, edited by A. Srivastava and M. Sahami and published...