Book Image

Getting Started with PhantomJS

By : Aries beltran
Book Image

Getting Started with PhantomJS

By: Aries beltran

Overview of this book

PhantomJS is a headless WebKit browser with JavaScript API that allows you to create new ways to automate web testing. PhantomJS is currently being used by a large number of users to help them integrate headless web testing into their development processes. It also gives you developers a new framework to create web-based applications, from simple web manipulation to performance measurement and monitoring.A step step-by by-step guide that will help you develop new tools for solving web and testing problems in an effective and quick way. The book will teach you how to use and maximize PhantomJS to develop new tools for web scrapping, web performance measurement and monitoring, and headless web testing. This book will help you understand PhantomJS’ scripting API capabilities and strengths.This book starts by looking at PhantomJS’ JavaScript API, features, and basic execution of scripts. Throughout the book, you will learn details to help you write scripts to manipulate web documents and fully create a web scrapping tool.Through its practical approach, this book strives to teach you by example, where each chapter focuses on the common and practical usage of PhantomJS, and how to extract meaningful information from the web and other services.By the end of the book, you will have acquired the skills to enable you to use PhantomJS for web testing, as well as learning the basics of Jasmine, and how it can be used with PhantomJS.
Table of Contents (13 chapters)
12
Index

PhantomJS JavaScript API

PhantomJS runs JavaScript and comes with a JavaScript API to make your life easy. It extends the standard JavaScript API and adds richer layers of capabilities, such as allowing us access to the underlying file system, ease of access and manipulation of DOM objects, system and environment variable gathering. It even gives us the ability to inject custom scripts into the web page.

But, be warned. PhantomJS is a very active community, and every now and then, changes are being introduced. New APIs and objects are being added, but of course, there are few items that are being changed, and ultimately some of them become deprecated or are totally removed. The PhantomJS website has a full list of all the functions and syntax, and has proper tagging for deprecated functions. We should visit it regularly to check for upcoming updates so that we can adjust appropriately in our codes.

The Module API

While writing custom objects and API sets, you may want to create custom modules that will make your life easier. You can do that in PhantomJS using the Module API. It allows you to create your own modules and import it anywhere in your implementation. The built-in modules are webpage, system, fs (File System), and webserver.

The WebPage API

PhantomJS is a headless browser. Accessing and manipulating web documents are its core functionalities, and that's what the WebPage API is used for. The WebPage API allows us to access, control, and manipulate web documents. It provides a rich interface to easily reference and extract page details including document content. It enables capturing of events, such as page loading, when an error occurs within the page, or when navigating to another page is requested, and so on. It is also capable of capturing pages and saving them as images. And more importantly, it allows you to manipulate documents on the fly and traverse DOM as you do with any web page. This is enormously valuable for writing automated user interface tests—for example, you can force click events or post forms, and capture the results—as well as standard web scraping of public URLs.

The System API

The System module provides system-level functionalities ranging from OS information, environment variables, command-line arguments, and process-related properties. The System module is very useful as you engage more in developing applications with PhantomJS.

The FileSystem API

Accessing files, writing to text files, or just reading a custom configuration file—these are tasks that can be done with PhantomJS. FileSystem provides a standard API to perform file I/O. You can read, write, and delete files; you can even list folder files.

FileSystem has 31 functions to manage and manipulate files within PhantomJS. We have been using this set extensively, and it helps solve several problems without fail. Writing JSON data to a file is very basic, but you will find it much easier using the FileSystem API.

The WebServer API

Perhaps you have a grand idea, but it requires you to process a web request and execute a PhantomJS script on it. PhantomJS can do that; you can embed your own web server implementation using the WebServer API within your PhantomJS application. This feature is marked as experimental, but it does work, and with roughly five lines of code you can have your own web server running.

This module is based on the open source Mongoose web server library that supports multiple platforms, authorization, web sockets, URL rewrite, and even "resumeable" downloads. For more information about Mongoose, visit http://code.google.com/p/mongoose/.