Python Social Media Analytics

By: Baihaqi Siregar, Siddhartha Chatterjee, Michal Krystyanczuk

Overview of this book

Social media platforms such as Facebook, Twitter, forums, Pinterest, and YouTube have become a major part of everyday life. However, these complex and noisy data streams pose a serious challenge to anyone trying to harness them properly and benefit from them. This book introduces you to the concept of social media analytics and shows how you can leverage its capabilities to empower your business. Starting with acquiring data from various social networking sources such as Twitter, Facebook, YouTube, Pinterest, and social forums, you will see how to clean the data and make it ready for analytical operations using various Python APIs. This book explains how to structure the clean data obtained and store it in MongoDB using PyMongo. You will also perform web scraping using Scrapy and BeautifulSoup, and visualize the data. Finally, you will be introduced to different techniques for performing analytics at scale on your social data in the cloud, using Python and Spark. By the end of this book, you will be able to use the power of Python to gain valuable insights from social media data and use them to enhance your business processes.

Chapter 1. Introduction to the Latest Social Media Landscape and Importance

Have you seen the movie The Social Network? If you have not, it could be a good idea to see it before you read this book. If you have, you have seen the success story of Mark Zuckerberg and his company, Facebook. This success was possible due to the power of the platform in connecting, enabling, sharing, and impacting the lives of almost two billion people on this planet.

The earliest social networks, such as Yahoo (Geocities), theglobe.com, and tripod.com, existed as far back as 1995. These platforms mainly facilitated interaction among people through chat rooms. It was only at the end of the 90s that user profiles became the in thing on social networking platforms, making information about people discoverable and therefore giving users the choice of whether or not to make friends. Among those embracing this new approach were Makeoutclub, Friendster, and SixDegrees.com.

MySpace, LinkedIn, and Orkut were created thereafter, and social networks were on the verge of becoming mainstream. However, the biggest impact came with the creation of Facebook in 2004, a total game changer for people's lives, business, and the world. The sophistication and ease of use of the platform made it a mainstream medium for individuals and companies to advertise and sell their ideas and products. Hence, we are in the age of social media, which has changed the way the world functions.

Over the last few years, new entrants have appeared in social media with interaction models that differ from those of Facebook, LinkedIn, or Twitter. These include Pinterest, Instagram, Tinder, and others. An interesting example is Pinterest, which, unlike Facebook, is centered not around people but around interests and topics. It essentially structures people based on their interest in these topics. The CEO of Pinterest describes it as a catalog of ideas.

Forums, which are not considered regular social networks in the way that Facebook, Twitter, and others are, are also very important social platforms. Unlike on Twitter or Facebook, forum users are often anonymous, which enables them to have in-depth conversations with communities. Other non-typical social networks are video sharing platforms, such as YouTube and Dailymotion. They are non-typical because they are centered around user-generated content, and their social nature comes from the sharing of this content on various social networks and from the discussion it generates in user comments.

Social media is gradually changing from being platform centric to focusing more on experiences and features. In the future, we'll see more and more traditional content providers and services becoming social in nature through sharing and conversations. The term social media today includes not just social networks but every service that's social in nature and has a wide audience.

To understand the importance of social media, it's interesting to look at the statistics of these platforms. It's estimated that out of around 3.4 billion internet users, 2.3 billion are active social media users. This is a staggering number, reinforcing the enormous importance of social media. In terms of users of individual social media platforms, Facebook leads the way with almost 1.6 billion active users. You must have heard the adage that if Facebook were a country, it would be the second largest after China and ahead of India. Other social platforms linked to Facebook are also benefiting from this user base, such as WhatsApp, which hosts 1 billion users on its chat application, and Instagram, with 400 million on its image sharing social network.

Among other platforms, Tumblr and Twitter lead the way with 550 million and 320 million active users respectively. LinkedIn, the world's most popular professional social media platform, has 100 million active users. Pinterest, which is the subject of a later chapter, also has 100 million active users. Sina Weibo, often described as China's counterpart to Facebook and Twitter, alone hosts 222 million active users. In terms of growth and engagement, Facebook is still the fastest growing social media platform, way ahead of the rest. If we look at engagement, millennials (age group 18-34) spend close to 100 minutes on average per person per month on Facebook. The number is far lower for other platforms. Among user-generated content and sharing platforms, YouTube is the leader, with 300 hours of video uploaded every minute and 3.25 billion hours of video watched every month.

In this chapter, we will cover the following topics:

  • Social graph
  • Introduction to the latest social media landscape and importance
  • What does social data mean in the modern world?
  • Tools and their specificities to mine the social web (Python, APIs, and machine learning)