Getting Started with Python Web Scraping [Video]

By: Charles Clayton

Overview of this book

Python is a high-level, general-purpose programming language. Its design philosophy emphasizes code readability, and its syntax lets programmers express concepts in fewer lines of code than languages such as C++ or Java typically require. This video course is a collection of recipes for scraping websites with Python. It addresses both common and unusual scraping problems by diving deep into Python's web scraping tools: Selenium, BeautifulSoup, and urllib2. The video starts by showing how to use the Selenium module for scraping: setting up a web driver, debugging with the console, downloading files, and streamlining with a headless browser (PhantomJS). It then demonstrates parsing with BeautifulSoup, including an introduction to BeautifulSoup objects, nested selectors, regular expression basics, and UTF-8 encoding. Finally, it shows how to fetch content with urllib2: using the developer tools Network tab, bypassing the browser, and retrieving files. By the end of this video, you will understand the in-depth capabilities of Python's web scraping tools.
Table of Contents (3 chapters)
Chapter 1
Scraping with Selenium
Section 5
Using the Selenium Module
Now we know how to create CSS selectors and how to use the Chrome developer tools to inspect HTML and construct a query. But how do we turn this into a Python script? We use the Selenium module and a web driver.
- Download the ChromeDriver WebDriver.
- Install the Selenium module for Python.
- Use these to write Python code to automate the browser.
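The steps above can be sketched roughly as follows. This is a minimal illustration, not the course's own code: it assumes a recent Selenium 4 release (installed with `pip install selenium`) and a ChromeDriver binary that Selenium can locate, for example on your PATH. The function names `scrape_text` and `run_in_chrome`, the URL, and the `h1` selector are hypothetical placeholders.

```python
# The locator strategy string for CSS selectors; this is the same value as
# selenium.webdriver.common.by.By.CSS_SELECTOR.
CSS_SELECTOR = "css selector"


def scrape_text(driver, url, css_selector):
    """Load a page and return the visible text of every matching element."""
    driver.get(url)  # navigate the automated browser to the page
    elements = driver.find_elements(CSS_SELECTOR, css_selector)
    return [element.text for element in elements]


def run_in_chrome(url, css_selector):
    """Start Chrome through ChromeDriver, scrape, and always close the browser."""
    # Imported here so the helper above can be reused without Selenium loaded.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumes Selenium can find ChromeDriver
    try:
        return scrape_text(driver, url, css_selector)
    finally:
        driver.quit()  # close the browser even if scraping raised an error
```

To try it against a real page, call something like `run_in_chrome("https://example.com", "h1")` and print the returned list. Keeping the query logic in `scrape_text` separate from the browser setup means the same function should work with any driver object that offers `get` and `find_elements`.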