UI Testing with Puppeteer

By: Dario Kondratiuk
How to avoid being detected as a bot

I hesitated to add this section after everything I mentioned about scraping ethics. I think I made my point clear: when the owner says no, it means no. But if I'm writing a chapter about scraping, I need to show you these tools. It's then up to you what you do with the information you've learned so far.

Websites that don't want to be scraped, and are actively being scraped, will invest a good amount of time and money in trying to prevent it. That effort becomes even more important when the scrapers damage not only the site's performance but also the business.

Developers in charge of dealing with bots won't rely only on the user agent because, as we saw, it can be easily manipulated. Nor can they rely solely on counting the requests coming from a single IP address because, as we also saw, scrapers can slow down their scripts, simulating an interested user.
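Both of those basic evasions are easy to apply from Puppeteer itself. The following sketch is only an illustration: the user agent string, the URLs, and the delay values are assumptions for the example, not taken from any real target. It overrides the default user agent (which advertises HeadlessChrome) with page.setUserAgent and paces navigation so the request rate looks closer to that of an interested user:

const puppeteer = require('puppeteer');

// Resolve after the given number of milliseconds, used to pace requests.
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Present a common desktop Chrome user agent instead of the default
  // headless one that many sites check for.
  await page.setUserAgent(
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
    '(KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36');

  // Illustrative URLs only. Waiting a random 2-5 seconds between pages
  // keeps the request rate from looking like an automated burst.
  for (const url of ['https://example.com/page1', 'https://example.com/page2']) {
    await page.goto(url);
    await delay(2000 + Math.random() * 3000);
  }

  await browser.close();
})();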

If the site can't stop...