Mastering RabbitMQ

By: Yusuf Aytas, Emrah Ayanoglu, Dotan Nahum

Overview of this book

RabbitMQ is one of the most powerful open source message brokers, and it is widely used at tech companies such as Mozilla, VMware, Google, and AT&T. It offers a rich, easy-to-manage set of features for controlling and managing messaging, backed by strong community support. Scalability is one of the major problems of modern systems, and messaging with RabbitMQ is a key part of the solution to it. This book explains and demonstrates the RabbitMQ server in detail. It provides many real-world examples and advanced solutions for tackling scalability issues. You'll begin your journey with the installation and configuration of the RabbitMQ server. Next, you'll study the major problems that a server faces, including scalability and high availability, and work toward solutions for both using RabbitMQ's mechanisms. Following on from this, you'll design and develop your own plugins using the Erlang language and RabbitMQ's internal API. This knowledge will help you get started with managing and monitoring messages, tools, and applications. You'll also gain an understanding of the security and integrity of the messaging facilities that RabbitMQ provides. In the last few chapters, you will build and keep track of your clients (senders and receivers) using Java, Python, and C#.

Implementing the scraper


A scraper is a system that copies content from other websites using web scraping. First, let's state a few things that we want to accomplish:

  • Downloading a web page

  • Parsing HTML

  • Cherry-picking attributes from the HTML

  • Saving the results

To fetch content from the web in a modern way, we will skip the standard urllib library and go straight to the friendlier requests library from the Python community.

For parsing and drilling into web pages, we'll use BeautifulSoup, which is close to the de facto standard for this in the Python world.
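The four goals listed earlier map onto these two libraries roughly as follows. This is only a minimal sketch under a few assumptions, not the scraper built in this chapter: it targets Python 3 and the newer `bs4` package (published on PyPI as `beautifulsoup4`) rather than the BeautifulSoup 3 release, and the function names `download`, `parse_links`, `save`, and `scrape`, the choice of link attributes, and the JSON output format are all hypothetical choices made for illustration.

```python
import json

import requests
from bs4 import BeautifulSoup  # assumption: bs4 (beautifulsoup4), not BeautifulSoup 3


def download(url):
    """Step 1: download a web page and return its HTML as text."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.text


def parse_links(html):
    """Steps 2 and 3: parse the HTML and cherry-pick attributes.

    As an illustration we pick the href and the visible text of every
    <a> tag that actually carries an href attribute.
    """
    soup = BeautifulSoup(html, "html.parser")
    return [
        {"href": a["href"], "text": a.get_text(strip=True)}
        for a in soup.find_all("a", href=True)
    ]


def save(records, path):
    """Step 4: save the results (here, as a JSON file)."""
    with open(path, "w") as f:
        json.dump(records, f, indent=2)


def scrape(url, path):
    """Chain the steps: download, parse, pick attributes, save."""
    save(parse_links(download(url)), path)
```

Calling `scrape("http://example.com/", "links.json")` (a placeholder URL and file name) would run all four steps in order. Keeping each step in its own function means the parsing step can be exercised on a plain HTML string without touching the network.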

Let's fetch these via pip:

$ pip install requests beautifulsoup
Requirement already satisfied (use --upgrade to upgrade): requests in /Library/Python/2.7/site-packages/requests-2.2.1-py2.7.egg
Downloading/unpacking beautifulsoup
  Downloading BeautifulSoup-3.2.1.tar.gz
  Running setup.py (path:/private/var/folders/gw/xp4xsqt97957cc7hcgxd0w0c0000gn/T/pip_build_dotan/beautifulsoup/setup.py) egg_info for package beautifulsoup

Installing collected...