Book Image

JavaScript Regular Expressions

By : Gabriel Manricks, Loiane Groner, Loiane Groner [Duplicate entry]
Book Image

JavaScript Regular Expressions

By: Gabriel Manricks, Loiane Groner, Loiane Groner [Duplicate entry]

Overview of this book

Table of Contents (13 chapters)

Regex in JavaScript

In JavaScript, regular expressions are implemented as their own type of object (such as the RegExp object). These objects store patterns and options and can then be used to test and manipulate strings.

To start playing with regular expressions, the easiest thing to do is to enable a JavaScript console and play around with the values. The easiest way to get a console is to open up a browser, such as Chrome, and then open the JavaScript console on any page (press the command + option + J on a Mac or Ctrl + Shift + J).

Let's start by creating a simple regular expression; we haven't yet gotten into the specifics of the different special characters involved, so for now, we will just create a regular expression that matches a word. For example, we will create a regular expression that matches hello.

The RegExp constructor

Regular expressions can be created in two different ways in JavaScript, similar to the ones used in strings. There is a more explicit definition, where you call the constructor function and pass it the pattern of your choice (and optionally any settings as well), and then, there is the literal definition, which is a shorthand for the same process. Here is an example of both (you can type this straight into the JavaScript console):

var rgx1 = new RegExp("hello");
var rgx2 = /hello/;

Both these variables are essentially the same, it's pretty much a personal preference as to which you would use. The only real difference is that with the constructor method you use a string to create an expression: therefore, you have to make sure to escape any special characters beforehand, so it gets through to the regular expression.

Besides a pattern, both forms of Regex constructors accept a second parameter, which is a string of flags. Flags are like settings or properties, which are applied on the entire expression and can therefore change the behavior of both the pattern and its methods.

Using pattern flags

The first flag I would like to cover is the ignore case or i flag. Standard patterns are case sensitive, but if you have a pattern that can be in either case, this is a good option to set, allowing you to specify only one case and have the modifier adjust this for you, keeping the pattern short and flexible.

The next flag is the multiline or m flag, and this makes JavaScript treat each line in the string as essentially the start of a new string. So, for example, you could say that a string must start with the letter a. Usually, JavaScript would test to see if the entire string starts with the letter a, but with the m flag, it will test this constraint against each line individually, so any of the lines can pass this test by starting with a.

The last flag is the global or g flag. Without this flag, the RegExp object only checks whether there is a match in the string, returning on the first one that's found; however, in some situations, you don't just want to know if the string matches, you may want to know about all the matches specifically. This is where the global flag comes in, and when it's used, it will modify the behavior of the different RegExp methods to allow you to get to all the matches, as opposed to only the first.

So, continuing from the preceding example, if we wanted to create the same pattern, but this time, with the case set as insensitive and using global flags, we would write something similar to this:

var rgx1 = new RegExp("hello", "gi");
var rgx2 = /hello/gi;

Using the rgx.test method

Now that we have created our regular expression objects, let's use its simplest function, the test function. The test method only returns true or false, based on whether a string matches a pattern or not. Here is an example of it in action:

> var rgx = /hello/;
> rgx.test("hello");
> rgx.test("world");
> rgx.test("hello world");

As you can see, the first string matches and returns true, and the second string does not contain hello, so it returns false, and finally the last string matches the pattern. In the pattern, we did not specify that the string had to only contain hello, so it matches the last string and returns true.

Using the rgx.exec method

The next method on the RegExp object, is the exec function, which, instead of just checking whether the pattern matches the text or not, exec also returns some information about the match. For this example, let's create another regular expression, and get index for the start of the pattern;

> var rgx = /world/;
> rgx.exec("world !!");
[ 'world' ]
> rgx.exec("hello world");
[ 'world' ]
> rgx.exec("hello");

As you can see here, the result from the function contains the actual match as the first element (rgx.exec("world !!")[0];) and if you console.dir the results, you will see it also contains two properties: index and input, which store the starting index property and complete the input text, respectively. If there are no matches, the function will return null:

The string object and regular expressions

Besides these two methods on the RegExp object itself, there are a few methods on the string object that accept the RegExp object as a parameter.

Using the String.replace method

The most commonly used method is the replace method. As an example, let's say we have the foo foo string and we want to change it to qux qux. Using replace with a string would only switch the first occurrence, as shown here:

In order to replace all the occurrences, we need to supply a RegExp object that has the g flag, as shown here:

Using the method

Next, if you just want to find the (zero-based) index of the first match in a string, you can use the search method:

> str = "hello world";
"hello world"

Using the String.match method

The last method I want to talk about right now is the match function. This function returns the same output as the exec function we saw earlier when there was no g flag (it includes the index and input properties), but returned a regular Array of all the matches when the g flag was set. Here is an example of this:

We have taken a quick pass through the most common uses of regular expressions in JavaScript (code-wise), so we are now ready to build our RegExp testing page, which will help us explore the actual syntax of Regex without combining it with JavaScript code.