In this section, we are going to learn about regular expressions in Python. Regular expressions are essentially a small, highly specialized programming language that is embedded in Python and made available through the re module. With them, we define rules for the set of strings that we want to match. Using regular expressions, we can extract specific information from files, code, documents, spreadsheets, and so on.
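As a minimal sketch of what "defining a rule and extracting information" can look like, the following example pulls date-like strings out of a line of text. The sample text and the variable names are made up for illustration; only the re module calls come from the standard library.

```python
import re

# Hypothetical sample text, used only for illustration.
text = "Order 66 shipped on 2021-03-04; order 67 ships on 2021-03-09."

# The rule: four digits, a hyphen, two digits, a hyphen, two digits.
date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")

# findall() returns every non-overlapping match as a list of strings.
dates = date_pattern.findall(text)
print(dates)  # ['2021-03-04', '2021-03-09']
```

Note that this pattern only checks the shape of the string, so it would also accept an impossible date such as 2021-99-99.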
In Python, regular expression support is provided by the re module, which we load with import re. Regular expressions support four things:
- Identifiers
- Modifiers
- Whitespace characters
- Flags
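The four categories can be sketched with one small example each. This is an illustration under assumptions, not a definitive mapping: it assumes that "modifiers" refers to quantifiers such as +, and that "whitespace characters" refers to escapes such as \t. The sample text and variable names are hypothetical.

```python
import re

# Hypothetical sample text: two records separated by a newline,
# with a tab between the fields of each record.
text = "Name: Alice\tAge: 30\nname: Bob\tAge: 25"

# Identifier: \d matches a single digit.
digits = re.findall(r"\d", text)
print(digits)  # ['3', '0', '2', '5']

# Modifier (assumed here to mean a quantifier): + repeats the
# preceding item one or more times, so whole numbers are matched.
numbers = re.findall(r"\d+", text)
print(numbers)  # ['30', '25']

# Whitespace character: \t matches a tab.
tab_count = len(re.findall(r"\t", text))
print(tab_count)  # 2

# Flag: re.IGNORECASE makes "name" match both "Name" and "name".
names = re.findall(r"name: (\w+)", text, flags=re.IGNORECASE)
print(names)  # ['Alice', 'Bob']
```

Without the re.IGNORECASE flag, the last call would return only ['Bob'], because the first record starts with a capital N.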
The following table lists the identifiers with a description of each one:
Identifier | Description |
\w | Matches alphanumeric characters, including underscore (_) |
\W | Matches any character that is not alphanumeric and not an underscore (_) |
\d | Matches a digit |
\D... |