ModSecurity 2.5

Overview of this book

With more than 67% of web servers running Apache and web-based attacks becoming more and more prevalent, web security has become a critical area for web site managers. Most existing tools work on the TCP/IP level, failing to use the specifics of the HTTP protocol in their operation. ModSecurity is a module running on Apache, which will help you overcome the security threats prevalent in the online world. A complete guide to using ModSecurity, this book will show you how to secure your web application and server, and does so by using real-world examples of attacks currently in use. It will help you learn about SQL injection, cross-site scripting attacks, cross-site request forgeries, null byte attacks, and many more so that you know how attackers operate.
Using clear, step-by-step instructions, this book starts by teaching you how to install and set up ModSecurity, before diving into the rule language with examples. It assumes no prior knowledge of ModSecurity, so as long as you are familiar with basic Linux administration, you can start to learn right away. Real-life case studies are used to illustrate the dangers on the Web today; for example, you will learn how the recent worm that hit Twitter works, and how you could have used ModSecurity to stop it in its tracks. The mechanisms behind these and other attacks are described in detail, and you will learn everything you need to know to make sure your server and web application remain unscathed on the increasingly dangerous web.
Have you ever wondered how attackers figure out the exact web server version running on a system? They use a technique called HTTP fingerprinting, and you will learn about this in depth and how to defend against it by flying your web server under a "false flag".
The last part of the book shows you how to really lock down a web application by implementing a positive security model that only allows through requests that conform to a specific, pre-approved model, and denies anything that is even the slightest bit out of line.

Example of a regular expression


To get a feeling for how regular expressions are used, let's start with a real-life example so that you can see how a regex works when put to use on a common task.

Identifying an email address

Suppose you wanted to extract all email addresses from an HTML document. You'd need a regular expression that would match email addresses but not all the other text in the document. An email address consists of a username, an @ character, and a domain name. The domain name in turn consists of a company or organization name, a dot, and a top-level domain name such as com, edu, or de.

Knowing this, here are the parts that we need to put together to create a regular expression:

  • Username

    This consists of alphanumeric characters (0-9, a-z, A-Z) as well as dots, plus signs, dashes, and underscores. Other characters are allowed by the RFC specification for email addresses, but these are very rarely used so I have not included them here.

  • @ character

    One of the mandatory characters in an email address, the @ character must be present, so it is a good way to help distinguish an email address from other text.

  • Domain Name

    The name of the company or organization, for example cnn.com. It can also contain sub-domains, so mail.cnn.com would be valid.

  • Top-Level Domain

    This is the final part of the email address, and is part of the domain name. The top-level domain usually indicates what country the domain is located in (though domains such as .com or .org are used in countries all around the world). For this example, we assume the top-level domain is between two and four letters long (excluding the dot character that precedes it); longer top-level domains such as .museum exist, but they are rare enough to leave out here.

Putting all these parts together, we end up with the following regular expression:

\b[-\w.+]+@[\w.]+\.[a-zA-Z]{2,4}\b

The [-\w.+]+ part corresponds to the username, the @ character is matched literally against the @ in the email address, and the domain name part corresponds to [\w.]+\.[a-zA-Z]{2,4}. The \b at each end is a word boundary, which keeps the expression from matching in the middle of a longer word. Unless you are already familiar with regular expressions, none of this will make sense to you right now, but by the time you've finished reading this appendix, you will know exactly what this regular expression does.
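To see the expression put to use on the task described above, here is a minimal sketch in Python. It is only an illustration: the book does not use Python, and the function name extract_emails and the sample HTML are made up for this example. The pattern is the one shown above, written in verbose mode so that each part can carry a comment.

```python
import re

# The regular expression from the text, split into its parts.
# re.VERBOSE lets us add whitespace and comments inside the pattern.
EMAIL_RE = re.compile(r"""
    \b              # word boundary before the address
    [-\w.+]+        # username: letters, digits, _, dots, plus signs, dashes
    @               # the mandatory @ character
    [\w.]+          # domain name, possibly with sub-domains
    \.              # the dot before the top-level domain
    [a-zA-Z]{2,4}   # top-level domain, two to four letters
    \b              # word boundary after the address
""", re.VERBOSE)


def extract_emails(html):
    """Return every substring of html that the pattern matches (hypothetical helper)."""
    return EMAIL_RE.findall(html)


# Made-up sample document for illustration.
sample_html = (
    '<p>Contact <a href="mailto:john.doe+news@mail.example.com">John</a> '
    'or jane_roe@example.org.</p>'
)

print(extract_emails(sample_html))
# prints ['john.doe+news@mail.example.com', 'jane_roe@example.org']
```

Note that the full stop that ends the sentence in the sample is not included in the second address, because the expression requires the match to end with two to four letters. A string such as user@localhost is not matched at all, since it has no dot followed by a top-level domain.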

Let's get started learning about regular expressions, and at the end of the appendix we'll come back to this example to see exactly how it works.