CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. As the acronym suggests, it is a test to determine whether the user is human. A typical CAPTCHA consists of distorted text, which a computer program will find difficult to interpret but a human can (hopefully) still read. Many websites use CAPTCHAs to try to prevent bots from interacting with them. For example, my bank's website forces me to pass a CAPTCHA every time I log in, which is a pain. This chapter covers how to solve a CAPTCHA automatically, first through Optical Character Recognition (OCR) and then with a CAPTCHA-solving API.
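As a rough illustration of the OCR route, a common first step is to strip the background noise from the image by thresholding it to pure black and white, so that only the characters remain. The sketch below uses only the standard library and works on a small grid of grayscale values. The names `binarize` and `render`, the threshold of 128, and the sample data are all assumptions made for this example and do not come from the chapter.

```python
def binarize(pixels, threshold=128):
    """Map each grayscale value (0-255) to 0 (black) or 255 (white)."""
    return [[0 if value < threshold else 255 for value in row] for row in pixels]


def render(pixels):
    """Draw black pixels as '#' and white pixels as '.' for quick inspection."""
    return "\n".join(
        "".join("#" if value == 0 else "." for value in row) for row in pixels
    )


# Hypothetical grayscale patch (5 rows, 3 columns): a dark vertical stroke
# on a light background whose brightness varies from pixel to pixel.
sample = [
    [200, 30, 190],
    [180, 20, 210],
    [140, 10, 170],
    [220, 40, 160],
    [190, 25, 230],
]

cleaned = binarize(sample)
print(render(cleaned))
```

With a real CAPTCHA, the cleaned image would then be handed to an OCR engine such as Tesseract. The best threshold depends on the image and usually has to be found by experiment.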

Web Scraping with Python