Mastering Python Regular Expressions

Overview of this book

Regular expressions are used by many text editors, utilities, and programming languages to search and manipulate text based on patterns. They are often considered the Swiss army knife of text processing: searching, replacing, extracting, and validating strings, along with many other repetitive and complex tasks, can be reduced to a simple pattern. Mastering Python Regular Expressions teaches regular expressions from the basics, independently of the language being used, and then shows how to use them in Python. You will learn the finer details of what Python supports and how to use it, as well as the differences between Python 2.x and Python 3.x.

The book starts with a general review of the theory behind regular expressions, follows with an overview of Python's regular expression module, and then moves on to advanced topics such as grouping, looking around, and performance. You will gain a better understanding of how alternation and quantifiers work and of why grouping matters, before finally moving on to performance: how to measure it with benchmarks and the RegexBuddy tool, how backtracking in the regex engine affects it, and how to improve it.

Credits

About the Authors

About the Reviewers

www.PacktPub.com

Preface

Introducing Regular Expressions

History, relevance, and purpose

The regular expression syntax

Summary

Regular Expressions with Python

A brief introduction

Backslash in string literals

Building blocks for Python regex

Compilation flags

Python and regex special considerations

Summary

Grouping

Special cases with groups

Overlapping groups

Summary

Look Around

Look ahead

Look around and substitutions

Look behind

Look around and groups

Summary

Performance of Regular Expressions

Benchmarking regular expressions with Python

The RegexBuddy tool

Understanding the Python regex engine

Optimization recommendations

Summary

Index

Look behind

We could safely define look behind as the opposite operation to look ahead: instead of checking what follows the current position, it checks whether the text immediately before that position matches the subexpression passed as an argument. It is zero-width as well, and therefore the text it checks won't be part of the result.

It is represented as an expression preceded by a question mark, a less-than sign, and an equals sign, ?<=, inside a pair of parentheses: (?<=regex).
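The zero-width behaviour described above can be checked with a minimal sketch. The sample text and the variable names here are assumptions chosen only for illustration; they do not come from the book.

```python
import re

# Hypothetical sample text, used only for illustration.
text = "price: 42 USD"

# (?<=price:\s) asserts that "price: " sits immediately before the match.
# The asserted text is checked but not consumed.
match = re.search(r'(?<=price:\s)\d+', text)

matched_text = match.group()  # the digits only, without "price: "
span = match.span()           # offsets of the digits alone
print(matched_text, span)
```

The matched text and its span cover only the digits, which is what "zero-width" means in practice: the look behind constrains where a match may start without adding anything to it.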

We could, for instance, use it in an example similar to the one we used in negative look ahead to find just the surname of someone named John McLane. To accomplish this, we could write a look behind like the following:

>>> import re
>>> pattern = re.compile(r'(?<=John\s)McLane')
>>> result = pattern.finditer("I would rather go out with John McLane than with John Smith or John Bon Jovi")
>>> for i in result:
...     print(i.start(), i.end())  # start and end offsets of each match
...
32 38

With the preceding look behind, we asked the regex engine to match only at positions that are preceded by John and a whitespace character, and then to consume McLane...
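As a rough illustration of that condition, the following sketch compares the look behind pattern with a plain McLane pattern. The second sample string and the helper name spans are hypothetical and were made up for this comparison.

```python
import re

def spans(pattern, text):
    """Return the (start, end) offsets of every match of pattern in text."""
    return [m.span() for m in re.finditer(pattern, text)]

# Hypothetical text: one McLane is preceded by "John ", the other is not.
text = "Lucy McLane and John McLane"

plain_spans = spans(r'McLane', text)                   # both occurrences
look_behind_spans = spans(r'(?<=John\s)McLane', text)  # only the one after "John "

# Sentence from the example above: including John in the pattern itself
# widens the match, whereas the look behind leaves John out of it.
sentence = ("I would rather go out with John McLane than with John Smith "
            "or John Bon Jovi")
consumed_span = re.search(r'John\sMcLane', sentence).span()
asserted_span = re.search(r'(?<=John\s)McLane', sentence).span()

print(plain_spans, look_behind_spans)
print(consumed_span, asserted_span)
```

The plain pattern reports both surnames, while the look behind version reports only the one that follows John. In the original sentence, the look behind match starts at 32, where McLane begins, rather than at 27, where John begins.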
