Java 9 Regular Expressions

By: Anubhava Srivastava

Overview of this book

Regular expressions are a powerful tool in the programmer's toolbox for pattern matching, and they are also used for manipulating text and data. This book will provide you with the know-how (and practical examples) to solve real-world problems using regex in Java. You will begin by discovering what regular expressions are and how they work with Java. This easy-to-follow guide is a great place to familiarize yourself with the core concepts of regular expressions and to master their implementation with the features of Java 9. You will learn how to match, extract, and transform text by matching specific words, characters, and patterns. You will learn when and where to apply the methods for finding patterns in digits, letters, Unicode characters, and string literals. Going forward, you will learn to use zero-length assertions and lookarounds, parse source code, and process log files. Finally, you will master tips, tricks, and best practices in regex with Java.

Common pitfalls and ways to avoid them while writing regular expressions


Let's discuss some common mistakes people make while building regular expressions to solve various problems.

Do not forget to escape regex metacharacters outside a character class

You learned that all the special metacharacters, such as *, +, ?, ., |, (, ), [, {, ^, $, and so on, need to be escaped if the intent is to match them literally. I often see cases where programmers leave them unescaped, which gives the regular expression a totally different meaning, often without any error being reported. When an unescaped metacharacter makes the pattern malformed so that it cannot be compiled, the Java regex API that we discussed in Chapter 5, Introduction to Java Regular Expressions APIs - Pattern and Matcher Classes, throws an unchecked exception (PatternSyntaxException).
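The following is a minimal sketch of both failure modes: a pattern that compiles but silently means something else, and a pattern that fails to compile. The class name EscapeMetacharacters, the helper methods, and the sample strings are made up for illustration; Pattern.compile, Pattern.quote, and PatternSyntaxException are part of the standard java.util.regex package.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; names and sample inputs are illustrative only.
class EscapeMetacharacters {

    // Returns true if the entire input matches the given regex.
    static boolean matches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // Returns true if the regex compiles, false if it is malformed.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // Unchecked exception thrown for a pattern that cannot be compiled.
            return false;
        }
    }

    public static void main(String[] args) {
        // Unescaped dot matches any character, so "1x5" is accepted.
        System.out.println(matches("1.5", "1x5"));                 // true
        // Escaped dot matches only a literal dot.
        System.out.println(matches("1\\.5", "1x5"));               // false
        System.out.println(matches("1\\.5", "1.5"));               // true

        // Unescaped + is a quantifier: one or more 'a' followed by 'b'.
        System.out.println(matches("a+b", "a+b"));                 // false
        // Pattern.quote turns the whole string into a literal pattern.
        System.out.println(matches(Pattern.quote("a+b"), "a+b"));  // true

        // An unescaped, unclosed parenthesis makes the pattern malformed.
        System.out.println(compiles("(abc"));                      // false
        System.out.println(compiles("\\(abc"));                    // true
    }
}
```

Note that only the last case raises an exception. The first two mistakes compile cleanly and simply match the wrong text, which is why they tend to go unnoticed.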

Avoid escaping every non-word character

Some programmers overdo escaping, believing that they need to escape every non-word character, such as a colon, hyphen, semicolon, forward slash, or whitespace. This is not the case. They end up writing a regular...
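As a rough illustration of over-escaping, the sketch below compares two patterns for a made-up date-time string. The class name OverEscaping, the constants, and the sample input are hypothetical. It assumes the default compile flags (in particular, no Pattern.COMMENTS, under which unescaped whitespace would be ignored) and that the hyphen appears outside a character class.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; patterns and input are illustrative only.
class OverEscaping {

    // Every hyphen, space, and colon is escaped, which is legal but unnecessary.
    static final String OVER_ESCAPED =
            "\\d{4}\\-\\d{2}\\-\\d{2}\\ \\d{2}\\:\\d{2}";

    // Same pattern with only the required escapes (the \d shorthand).
    static final String PLAIN =
            "\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}";

    // Returns true if the entire input matches the given regex.
    static boolean matches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "2017-09-25 10:30";
        // Both patterns accept the same input; the second is easier to read.
        System.out.println(matches(OVER_ESCAPED, input)); // true
        System.out.println(matches(PLAIN, input));        // true
    }
}
```

Java accepts a backslash before any non-alphabetic character, so the over-escaped version is not an error. Its cost is readability, especially once each backslash is doubled inside a Java string literal.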