Mastering Python for Data Science

By: Samir Madhavan

Creating a wordcloud


A wordcloud is a collage of words in which the size of each word reflects its frequency: the bigger the word, the more often it occurs in the text.
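To make the link between frequency and size concrete, here is a minimal sketch that uses only the standard library. The sample text and the helper name relative_sizes are hypothetical, and the linear scaling is an assumption made for illustration; the wordcloud package applies its own sizing and layout logic.

```python
from collections import Counter

def relative_sizes(text, max_size=100):
    """Map each word to a size proportional to its frequency (illustrative only)."""
    counts = Counter(text.lower().split())
    if not counts:
        return {}
    top = max(counts.values())
    # Linear scaling (an assumption): the most frequent word gets max_size
    return {word: max_size * n / top for word, n in counts.items()}

# Hypothetical sample text, not taken from the article
sample = "max mad max road fury max road"
sizes = relative_sizes(sample)
print(sizes)
```

In this sketch, a word that occurs three times is drawn three times as large as a word that occurs once, which is the intuition to keep in mind when reading a wordcloud.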

You can install wordcloud with the following command if you use Ubuntu:

$ pip install git+https://github.com/amueller/word_cloud.git

For further installation instructions, refer to https://github.com/amueller/word_cloud.

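Let's plot the wordcloud for the BBC article by using the following code:

>>> from wordcloud import WordCloud

>>> import matplotlib.pyplot as plt

>>> # data['bbc'] is assumed to be a list of words prepared earlier

>>> wordcloud = WordCloud(width = 1000, height = 500).generate(' '.join(data['bbc']))

>>> plt.figure(figsize=(15,8))

>>> plt.imshow(wordcloud)

>>> plt.axis("off")

>>> plt.show()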

From the preceding wordcloud, we can make out that the article mentions the long gap between the Mad Max films of the 1980s and the current Mad Max. It also talks about Mel Gibson, the cars, and the villain Immortan Joe, as these are among the most frequently occurring keywords. The prominence of the keyword "one" also suggests an emphasis on different aspects of the movie.
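A visual reading like this can be cross-checked by listing the most frequent tokens directly. The following is a minimal sketch, not part of the book's code: the token list, the tiny stop-word set, and the helper name top_keywords are all hypothetical, and a real analysis would use a fuller stop-word list.

```python
from collections import Counter

# Hypothetical, deliberately small stop-word set for illustration
STOP_WORDS = {"the", "a", "and", "of", "in", "is", "to"}

def top_keywords(tokens, n=5):
    """Return the n most frequent tokens, ignoring case and stop words."""
    cleaned = [t.lower() for t in tokens if t.lower() not in STOP_WORDS]
    return Counter(cleaned).most_common(n)

# Hypothetical tokens standing in for a list such as data['bbc']
tokens = "The villain and the cars in the film and the cars".split()
top = top_keywords(tokens, n=2)
print(top)
```

The words at the head of this list should be the ones drawn largest in the wordcloud, so the two views can be compared side by side.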

Now,...