Data Acquisition and Manipulation with Python [Video]

By: Curtis Miller

Overview of this book

Python, a multi-paradigm programming language, has become the language of choice for data scientists working on data analysis, visualization, and machine learning. In this course, you’ll start by learning how to acquire data from the web that is already in a “clean” format, such as a .csv file or a database. You’ll then learn to transform this data into the format most useful for analysis. After that, you’ll dive into data aggregation and grouping, where you’ll learn to group similar data for easier analysis. From there, you’ll be shown different methods of web scraping using Python. Finally, you’ll learn to extract large amounts of data using BeautifulSoup, as well as work with Selenium and Scrapy.
Table of Contents (6 chapters)
Chapter 6
Web Scraping with Scrapy
Section 3
Programming a Spider
In this video, we will show how to make Scrapy spiders collect data. We program them, focusing on their parse() method.
- Create a Scrapy spider
- Develop scraping commands in the Scrapy shell
- Place these commands appropriately in the spider’s parse() method