Getting Started with PhantomJS

By: Aries Beltran

Overview of this book

PhantomJS is a headless WebKit browser with a JavaScript API that allows you to create new ways to automate web testing. A large number of users already rely on it to integrate headless web testing into their development processes. It also gives developers a new framework for creating web-based applications, from simple web manipulation to performance measurement and monitoring.

This book is a step-by-step guide that will help you develop new tools for solving web and testing problems quickly and effectively. It will teach you how to use PhantomJS to build tools for web scraping, web performance measurement and monitoring, and headless web testing, and it will help you understand the capabilities and strengths of the PhantomJS scripting API.

The book starts by looking at the PhantomJS JavaScript API, its features, and the basic execution of scripts. Throughout the book, you will learn the details you need to write scripts that manipulate web documents and to build a complete web scraping tool.

Through its practical approach, the book teaches by example: each chapter focuses on common, practical uses of PhantomJS and on how to extract meaningful information from the web and other services.

By the end of the book, you will have the skills to use PhantomJS for web testing, and you will have learned the basics of Jasmine and how it can be used with PhantomJS.

Working with PhantomJS

Now, let's see how PhantomJS's magic works. It is a command-line-based application, so we need to execute it in an OS terminal or console. The PhantomJS package contains a series of files and comes with one main executable file, which is named phantomjs.

Open your terminal and then navigate to your PhantomJS bin folder. In the prompt, execute phantomjs without any arguments.

Tip

PhantomJS Windows build

In the Windows build, the PhantomJS executable can be found in the root folder with the filename phantomjs.exe.

Running PhantomJS without any arguments gives you an interactive prompt similar to the JavaScript debug console found in any modern browser. In this interactive prompt, we can execute JavaScript code line by line. This functionality is very useful for debugging or testing code before you actually build your script.

Say "Hello Ghost!" to PhantomJS using the interactive prompt. The console.log function writes a value of any type to the output console of the JavaScript interpreter.

phantomjs> console.log("Hello Ghost!")
Hello Ghost!
undefined
phantomjs>

See? It is simple, just like writing any other JavaScript. But what is that undefined message right after the Hello Ghost! message? It is not an error; it is simply how the interactive mode behaves. The prompt evaluates each line you enter and automatically prints the resulting value to the output stream, just as if every line were a function call that returns data.
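The same behaviour can be reproduced in plain JavaScript. The following is a minimal sketch rather than anything specific to PhantomJS: it stores what console.log returns and what an ordinary expression evaluates to, which is the kind of value the interactive prompt echoes after each line. The variable names are illustrative only.

```javascript
// console.log prints its argument but returns nothing,
// so the value of the call itself is undefined.
var logResult = console.log("Hello Ghost!");

// An ordinary expression evaluates to a value; this is what a prompt would echo.
var sumResult = 1 + 2;

console.log(typeof logResult); // undefined
console.log(sumResult);        // 3
```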

Since console.log does not return a value, the result shown is undefined. If we assign a value to a variable instead, the following output will be displayed:

phantomjs> name = "Tara"
"Tara"
phantomjs>

The assignment takes place and the result of the operation is displayed. Because an assignment evaluates to the assigned value, here a string, the undefined message is not displayed. The interactive mode is similar to a long-running script: any variable or function you define is kept in memory and can be accessed at any time during the session. So, based on our preceding example, the name variable can also be displayed by referencing it.

phantomjs> name = "Tara"
"Tara"
phantomjs> name
"Tara"
phantomjs> name + " and Cecil"
"Tara and Cecil"
phantomjs>

We can even use the variable in another operation, as seen in the preceding lines of code. However, the result of any operation that is not assigned to a variable is available only while that line executes. The operation that concatenates the name variable with another string literal is performed, and the resulting string is displayed in the console, but it is not kept in memory.
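As a short sketch of that difference (the friends variable is introduced here only for illustration), the concatenation below is evaluated twice: once without storing the result, and once assigned to a variable so that it stays available for later lines.

```javascript
var name = "Tara";

// Evaluated, but the resulting string is not stored anywhere.
name + " and Cecil";

// Assigned to a variable, so the result remains available afterwards.
var friends = name + " and Cecil";

console.log(name);    // Tara
console.log(friends); // Tara and Cecil
```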

Objects can also be accessed within the interactive mode, and one of the most commonly used objects is phantom. Try typing phantom in the prompt and you will get the following output:

phantomjs> phantom
{
   "clearCookies": "[Function]",
   "deleteCookie": "[Function]",
   "addCookie": "[Function]",
   "injectJs": "[Function]",
   "debugExit": "[Function]",
   "exit": "[Function]",
   "cookies": [],
   "cookiesEnabled": true,
   "version": {
      "major": 1,
      "minor": 7,
      "patch": 0
   },
   "scriptName": "",
   "outputEncoding": "UTF-8",
   "libraryPath": "/Users/Aries/phantomjs/bin",
   "defaultPageSettings": {
      "XSSAuditingEnabled": false,
      "javascriptCanCloseWindows": true,
      "javascriptCanOpenWindows": true,
      "javascriptEnabled": true,
      "loadImages": true,
      "localToRemoteUrlAccessEnabled": false,
      "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X) AppleWebKit/534.34 (KHTML, like Gecko) PhantomJS/1.7.0 Safari/534.34",
      "webSecurityEnabled": true
   },
   "args": []
}
phantomjs>

PhantomJS displays the content of an object when it is referenced in the interactive prompt, and this includes its own phantom object. Notice that the object is displayed as JSON, listing every attribute; functions are shown only as the placeholder "[Function]" rather than with their definitions. Using this approach, we can examine any object to learn what attributes and functions it exposes.
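A similar view can be produced for any plain JavaScript object. The sketch below is an assumption about how such a listing could be built, not PhantomJS's actual implementation: a hypothetical helper, describe, serializes an object to JSON and substitutes the placeholder "[Function]" for every function. The sample object is a stand-in that borrows a few attribute names from the output above.

```javascript
// Hypothetical helper: serialize an object, showing functions as "[Function]".
function describe(obj) {
  return JSON.stringify(obj, function (key, value) {
    return typeof value === "function" ? "[Function]" : value;
  }, 3);
}

// Stand-in object for illustration; this is not the real phantom object.
var sample = {
  exit: function () {},
  cookiesEnabled: true,
  version: { major: 1, minor: 7, patch: 0 }
};

console.log(describe(sample));
```

Without the replacer function, JSON.stringify would silently drop the exit property, so the placeholder is what keeps the available functions visible in the listing.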

Let's try using one of the most important functions available in the phantom object: the exit() function. This function will enable us to quit PhantomJS and return to the caller or to the underlying operating system.

phantomjs> phantom.exit()
$

This function signals the application to exit with a return code of zero, which indicates a normal exit without errors. Passing a numeric value as an argument to exit() sets the return code that is passed back to the caller. This is helpful when writing scripts that need to verify whether the execution was successful or, if an error occurred, what type of error it was.
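As a rough sketch of how a script might use this, the hypothetical helpers below map an outcome to a return code and pass it to phantom.exit(). The names exitCodeFor and finish are illustrative, as is the choice of 1 for failure; the typeof check is there only so that the snippet can also be loaded outside PhantomJS, where the phantom object does not exist.

```javascript
// Hypothetical helper: 0 signals success, any other number signals an error.
function exitCodeFor(error) {
  return error ? 1 : 0;
}

// Hypothetical helper: report the outcome and quit with the matching code.
function finish(error) {
  var code = exitCodeFor(error);
  console.log(error ? "Failed: " + error.message : "Done.");
  if (typeof phantom !== "undefined") {
    phantom.exit(code); // hands the code back to the caller
  }
  return code;
}

finish(null);
```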

If we trap the error code in a shell script, it will look as follows:

#!/bin/bash
# Start PhantomJS and wait for it to exit.
bin/phantomjs
# $? holds the exit status of the last command.
OUT=$?
if [ "$OUT" -eq 0 ]; then
   echo "Done."
else
   echo "Ooops! Failed.!"
fi

In the preceding lines of code, right after calling phantomjs, we capture the exit code coming from the application using $?, the special shell parameter that holds the exit status of the last command. We assign it to the OUT variable and then test it in the succeeding lines. If the code is equal to zero, we display Done.; otherwise, we report that the call failed.

$ ./trapme.sh 
phantomjs> phantom.exit(0)
undefined
Done.
$ ./trapme.sh 
phantomjs> phantom.exit(1)
undefined
Ooops! Failed.!
$

Use the interactive mode to experiment with the PhantomJS API.

Before we begin creating PhantomJS scripts, we first need to make a quick roundup of the PhantomJS JavaScript API.
