UI Testing with Puppeteer

By: Dario Kondratiuk

Overview of this book

Puppeteer is an open-source web automation library created by Google for tasks such as end-to-end testing, performance monitoring, and task automation. Using real-world use cases, this book takes you on a pragmatic journey, helping you learn Puppeteer and implement best practices to take your automation code to the next level. Starting with an introduction to headless browsers, the book covers the foundations of browser automation and shows how far you can get using Puppeteer to automate Google Chrome and Mozilla Firefox. You'll then learn the basics of end-to-end testing and how to create reliable tests, and you'll get to grips with finding elements using CSS selectors and XPath expressions. As you progress through the chapters, the focus shifts to more advanced browser automation topics, such as executing JavaScript code inside the browser. You'll explore further use cases of Puppeteer, such as testing on mobile devices and at different network speeds, gauging your site's performance, and using Puppeteer as a web scraping tool. By the end of this UI testing book, you'll have learned how to make the most of Puppeteer's API and be able to apply it in your real-world projects.

Running scrapers in parallel

I'm not saying this just because I coded it, but our scraper has a pretty good structure. Every piece is separated into different functions, making it easy to identify which parts can run in parallel.

I don't want to sound repetitive, but remember that the site being scraped, in this case Packt, is our friend and even my publisher. We don't want to affect the site; we want to look like normal users. We don't need to run 1,000 calls in parallel, and we don't want to. So, we will run our scraper in parallel, but with caution.

The good news is that we don't have to code a parallel architecture to solve this. We will use a package called puppeteer-cluster (https://www.npmjs.com/package/puppeteer-cluster). According to its description on npm, this is what the library does:

  • Handles crawling errors
  • Auto restarts the browser in case of a crash
  • Can automatically retry if a job fails
  • Offers different concurrency...
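
To make the features listed above a little more concrete, here is a minimal sketch of how puppeteer-cluster could be wired up with a deliberately low concurrency cap. It is not the scraper built in this chapter: the function name scrapeTitles, the results array, the cap of 2 workers, and the single retry are all assumptions chosen for illustration. The Cluster class is passed in as an argument, which is also an assumption of this sketch, so that the function can be tried without launching a browser.

```javascript
// Illustrative sketch only: scrapeTitles and its defaults are hypothetical.
// `Cluster` is expected to be the class exported by puppeteer-cluster.
async function scrapeTitles(Cluster, urls, maxConcurrency = 2) {
  const results = [];

  const cluster = await Cluster.launch({
    concurrency: Cluster.CONCURRENCY_CONTEXT, // one incognito context per worker
    maxConcurrency, // keep this low so we look like normal users
    retryLimit: 1, // assumption: retry a failed job once
  });

  // Log failures instead of letting one bad page stop the whole run.
  cluster.on('taskerror', (err, data) => {
    console.error(`Failed to scrape ${data}: ${err.message}`);
  });

  // The task runs once per queued URL, each time with its own page.
  await cluster.task(async ({ page, data: url }) => {
    await page.goto(url);
    results.push({ url, title: await page.title() });
  });

  urls.forEach((url) => cluster.queue(url));

  await cluster.idle(); // wait until every queued job has finished
  await cluster.close(); // shut the browser down
  return results;
}
```

In real use, the first argument would be the Cluster export of the package, for example require('puppeteer-cluster').Cluster, and urls would be the list of pages to visit. Raising maxConcurrency makes the run faster but also puts more load on the site, so a small value is the cautious choice.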