Book Image

Javascript Regular Expressions

Book Image

Javascript Regular Expressions

Overview of this book

Table of Contents (13 chapters)

Building our environment


In order to test our Regex patterns, we will build an HTML form, which will process the supplied pattern and match it against a string.

I am going to keep all the code in a single file, so let's start with the head of the HTML document:

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>Regex Tester</title>
    <link rel="stylesheet" href="http://netdna.bootstrapcdn.com/bootstrap/3.0.3/css/bootstrap.min.css">
    <script src="http://cdnjs.cloudflare.com/ajax/libs/jquery/2.0.3/jquery.min.js"></script>
    <style>
      body{
        margin-top: 30px;
      }
      .label {
         margin: 0px 3px;
      }
    </style>
  </head>

Tip

Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

It is a fairly standard document head, and contains a title and some styles. Besides this, I am including the bootstrap CSS framework for design, and the jQuery library to help with the DOM manipulation.

Next, let's create the form and result area in the body:

<body>
  <div class="container">
    <div class="row">
      <div class="col-sm-12">
        <div class="alert alert-danger hide" id="alert-box"></div>
          <div class="form-group">
            <label for="input-text">Text</label>
            <input 
                    type="text" 
                    class="form-control" 
                    id="input-text" 
                    placeholder="Text"
            >
          </div>
          <label for="inputRegex">Regex</label>
          <div class="input-group">
            <input 
                   type="text" 
                   class="form-control" 
                   id="input-regex" 
                   placeholder="Regex"
            >
            <span class="input-group-btn">
              <button 
                      class="btn btn-default" 
                      id="test-button" 
                      type="button">
                             Test!
              </button>
            </span>
          </div>
        </div>
      </div>
      <div class="row">
        <h3>Results</h3>
        <div class="col-sm-12">
          <div class="well well-lg" id="results-box"></div>
        </div>
      </div>
    </div>
    <script>
      //JS code goes here
    </script>
  </body>
</html>

Most of this code is boilerplate HTML required by the Bootstrap library for styling; however, the gist of it is that we have two inputs: one for some text and the other for the pattern to match against it. We have a button to submit the form (the Test! button) and an extra div to display the results.

Opening this page in your browser should show you something similar to this:

Handling a submitted form

The last thing we need to do is handle the form being submitted and run a regular expression. I broke the code into helper functions to help with the code flow when we go through it now. To begin with, let's write the full-click handler for the submit (Test!) button (this should go where I've inserted the comment in the script tags):

var textbox = $("#input-text");
var regexbox = $("#input-regex");
var alertbox = $("#alert-box");
var resultsbox = $("#results-box");

$("#test-button").click(function(){
  //clear page from previous run
  clearResultsAndErrors()

  //get current values
  var text = textbox.val();
  var regex = regexbox.val();

  //handle empty values
  if (text == "") {
    err("Please enter some text to test.");
  } else if (regex == "") {
    err("Please enter a regular expression.");
  } else {
    regex = createRegex(regex);

    if (!regex) {
      return;
    }

    //get matches
    var results = getMatches(regex, text);

    if (results.length > 0 && results[0] !== null) {
      var html = getMatchesCountString(results);
      html += getResultsString(results, text);
      resultsbox.html(html);
    } else {
      resultsbox.text("There were no matches.");
    }
  }
});

The first four lines select the corresponding DOM element from the page using jQuery, and store them for use throughout the application. This is a best practice when the DOM is static, instead of selecting the element each time you use it.

The rest of the code is the click handler for the submit (Test!) button. In the function that handles the Test! button, we start by clearing the results and errors from the previous run. Next, we pull in the values from the two text boxes and handle the cases where they are empty using a function called err, which we will take a look at in a moment. If the two values are fine, we attempt to create a new RegExp object and we get their results using two other functions I wrote called createRegex and getMatches, respectively. Finally, the last conditional block checks whether there were results and displays either a No Matches Found message or an element on the page that will show individual matches using getMatchesCountString to display how many matches were found and getResultsString to display the actual matches in string.

Resetting matches and errors

Now, let's take a look at some of these helper functions, starting with err and clearResultsAndErrors:

function clearResultsAndErrors() {
  resultsbox.text("");
  alertbox.addClass("hide").text("");
}

function err(str) {
  alertbox.removeClass("hide").text(str);
}

The first function clears the text from the results element and then hides the previous errors, and the second function un-hides the alert element and adds the error passed in as a parameter.

Creating a regular expression

The next function I want to take a look at is in charge of creating the actual RegExp object from the value given in the textbox:

function createRegex(regex) {
  try {
    if (regex.charAt(0) == "/") {
      regex = regex.split("/");
      regex.shift();

      var flags = regex.pop();
      regex = regex.join("/");

      regex = new RegExp(regex, flags);
    } else {
      regex = new RegExp(regex, "g");
    }
    return regex;
  } catch (e) {
    err("The Regular Expression is invalid.");
    return false;
  }
}

If you try and create a RegExp object with flags that don't exist or invalid parameters, it will throw an exception. Therefore, we need to wrap the RegExp creation in a try/catch block, so that we can catch the error and display an error for it.

Inside the try section, we will handle two different kinds of RegExp input, the first is when you use forward slashes in your expressions. In this situation, we split this expression by forward slashes, remove the first element, which will be an empty string (the text before it is the first forward slash), and then pop off the last element which is supposed to be in the form of flags.

We then recombine the remaining parts back into a string and pass it in along with the flags into the RegExp constructor. The other case we are dealing with is where you wrote a string, and then we are simply going to pass this pattern to the constructor with only the g flag, so as to get multiple results.

Executing RegExp and extracting its matches

The next function we have is for actually cycling through the regex object and getting results from different matches:

function getMatches(regex, text) {
  var results = [];
  var result;

  if (regex.global) {
    while((result = regex.exec(text)) !== null) {
      results.push(result);
    }
  } else {
    results.push(regex.exec(text));
  }

  return results;
}

We have already seen the exec command earlier and how it returns a results object for each match, but the exec method actually works differently, depending on whether the global flag (g) is set or not. If it is not set, it will constantly just return the first match, no matter how many times you call it, but if it is set, the function will cycle through the results until the last match returns null. In the function, the global flag is set, I use a while loop to cycle through results and push each one into the results array, whereas if it is not set, I simply call function once and push only if the first match on.

Next, we have a function that will create a string that displays how many matches we have (either one or more):

function getMatchesCountString(results) {
  if (results.length === 1) {
    return "<p>There was one match.</p>";
  } else {
    return "<p>There are " + results.length + " matches.</p>";
  }
}

Finally, we have function, which will cycle through the results array and create an HTML string to display on the page:

function getResultsString(results, text) {
  for (var i = results.length - 1; i >= 0; i--) {
    var result = results[i];
    var match  = result.toString();
    var prefix = text.substr(0, result.index);
    var suffix = text.substr(result.index + match.length);
    text = prefix 
      + '<span class="label label-info">' 
      + match 
      + '</span>' 
      + suffix;
  }
  return "<h4>" + text + "</h4>";
}

Inside function, we cycle through a list of matches and for each one, we cut the string and wrap the actual match inside a label for styling purposes. We need to cycle through the list in reverse order as we are changing the actual text by adding labels and also so as to change the indexes. In order to keep in sync with the indexes from the results array, we modify text from the end, keeping text that occurs before it, the same.

Testing our application

If everything goes as planned, we should now be able to test the application. For example, let's say we enter the Hello World string as the text and add the l pattern (which if you remember will be similar to entering /l/g into our application), you should get something similar to this:

Whereas, if we specify the same pattern, though without the global flag, we would only get the first match:

Of course, if you leave out a field or specify an invalid pattern, our error handling will kick in and provide an appropriate message:

With this all working as expected, we are now ready to start learning Regex by itself, without having to worry about the JavaScript code alongside it.