Book Image

Python Social Media Analytics

By : Baihaqi Siregar, Siddhartha Chatterjee, Michal Krystyanczuk
Book Image

Python Social Media Analytics

By: Baihaqi Siregar, Siddhartha Chatterjee, Michal Krystyanczuk

Overview of this book

Social Media platforms such as Facebook, Twitter, Forums, Pinterest, and YouTube have become part of everyday life in a big way. However, these complex and noisy data streams pose a potent challenge to everyone when it comes to harnessing them properly and benefiting from them. This book will introduce you to the concept of social media analytics, and how you can leverage its capabilities to empower your business. Right from acquiring data from various social networking sources such as Twitter, Facebook, YouTube, Pinterest, and social forums, you will see how to clean data and make it ready for analytical operations using various Python APIs. This book explains how to structure the clean data obtained and store in MongoDB using PyMongo. You will also perform web scraping and visualize data using Scrappy and Beautifulsoup. Finally, you will be introduced to different techniques to perform analytics at scale for your social data on the cloud, using Python and Spark. By the end of this book, you will be able to utilize the power of Python to gain valuable insights from social media data and use them to enhance your business processes.
Table of Contents (17 chapters)
Title Page
About the Authors
About the Reviewer
Customer Feedback

Introducing social graph

A social graph is created through this widespread interaction and exchange of information on social media. A social graph is a massive network that illustrates the relations between individuals and information on the internet. Facebook owns the largest social graph of relations of 1.5 billion users. Every social media has its own social graph. The nature of social graph and its utility can be of various types, based on the types of relations described as follows. We will show a concrete example of a piece of the social graph and how to analyze it.

  • User graph: This is a network that shows the relationships between users or individuals connected to each other.
  • Content graph: As there are billions of content being uploaded on social media, there is a relationship existing between different types of content (text, images, videos, or multimedia). These relations could be based on semantic sense around those content, or in-bond or out-bond links between them, like that of the Google's page rank.
  • Interest graph: The interest graph takes the original graphs a step further, where individuals on the social media or the internet are not related based on their mere links, like being added as a friend or followed on Twitter, but on their mutual interests. This has a huge advantage over standard social graph, in the sense that it leads to finding communities of people with similar interests. Even if these people have no interaction or know each other personally, there is an inherent link based on their interests and passions.

Notion of influence

This massive growth and interaction on the social web is leading the way to understand these individuals. Like in a society there are influencers, the same phenomenon is getting replicated on the social web. There are people who have more influence over other users. The process of finding influencers and calculating influence is becoming an important science. If you have used a service called Klout, you'll know what we are talking about. Klout gives a 'social influence score' based on your social media activities. There are questions about the relevance of such scores, but that's only because the influence of a person is a very relative topic. In fact, in our view, no one is an influencer while everyone's an influencer. This can sound very confusing but what we are trying to say is that influence is relative. Someone who is an influencer to you may not be an influencer to another. If you need admission of your child to a school, the principal of the school is an influencer, but if you are seeking admission to a university, the same principal is not an influencer to you. This confusion makes the topic super exciting; trying to understand human dynamics and then figuring out who influences whom and how. Merely having thousands of followers on Twitter doesn't make one an influencer but the influencer of his or her followers and the way they are influenced to take action, sure does. Our book will not get into detailed aspects of influence but it's important to keep in mind this notion while trying to understand social media analytics.

Social impacts

Social media is already having a profound influence on both society and business. The societal impact has been both psychological and behavioral. Various events, crises, and issues in the world have received a boost because of the use of social media by millions of people. Stating a few examples would be that of the Arab Spring and the refugee crisis. In environmental crisis, such as earthquakes, social media like Twitter has proved in accelerating information and action because of its immediacy of dissemination and spread.

Platforms on platform

Social media companies like Facebook started presenting their technology as a platform, where programmers could build further tools to give rise to more social experiences, such as games, contests, and quizzes, which in turn gave rise to social interactions and experiences beyond mere conversational interaction. Today, there is a range of tools that allows one to build over the platforms. Another application of this is to gather intelligence through the data collected from these platforms. Twitter shares a lot of its data around the usage of its platform with programmers and companies. Similarly, most of the popular social networks have started sharing their data with developers and data warehousing companies. Sharing their data serves revenue growth, and is also a very interesting source for researchers and marketers to learn about people and the world.