The subject of regular expressions is deep, and it takes considerable practice to become comfortable with the special character syntax. The re module also provides many functions and methods for searching with regular expressions. After working through the examples in this section, you should have a better sense of how powerful regular expressions can be.
Looking Ahead or Behind
In many cases, it is useful to match a part of a pattern only if some other part will also match. For example, in the email parsing expression, the angle brackets were marked as optional. Realistically, the brackets should be paired, and the expression should match only if both are present, or neither is. This modified version of the expression uses a positive look ahead assertion to match the pair. The look ahead assertion syntax is (?=pattern).
There are several important changes in this version of the expression. First, the name portion is no longer optional. That means stand-alone addresses do not match, but it also prevents improperly formatted name/address combinations from matching. The positive look ahead rule after the "name" group asserts that either the remainder of the string is wrapped with a pair of angle brackets, or there is not a mismatched bracket; either both of or neither of the brackets is present. The look ahead is expressed as a group, but the match for a look ahead group does not consume any of the input text, so the rest of the pattern picks up from the same spot after the look ahead matches.
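The behavior described above can be sketched as follows. This is a minimal illustration, not the original listing: the pattern, the parse_address helper, the sample addresses, and the restriction to com, org, and edu domains are all assumptions made for the example.

```python
import re

# Hypothetical pattern, written for illustration only.
address = re.compile(
    r"""
    (?P<name>[\w.,]+(?:\s+[\w.,]+)*)   # name: one or more words, now required
    \s+                                # whitespace between name and address

    # Positive look ahead: consumes nothing, only checks the remainder.
    (?= <.*>$            # remainder is wrapped in angle brackets
      | [^<].*[^>]$      # remainder has no bracket at either end
    )

    <?                                 # optional opening bracket
    (?P<email>[\w.+-]+@(?:[\w-]+\.)+(?:com|org|edu))
    >?                                 # optional closing bracket
    $
    """,
    re.VERBOSE,
)


def parse_address(candidate):
    """Return (name, email) if the candidate matches, otherwise None."""
    match = address.match(candidate)
    if match is None:
        return None
    return match.group("name"), match.group("email")


candidates = [
    "First Last <first.last@example.com>",   # both brackets
    "No Brackets first.last@example.com",    # neither bracket
    "Open Bracket <first.last@example.com",  # mismatched
    "Close Bracket first.last@example.com>", # mismatched
    "first.last@example.com",                # no name
]
for candidate in candidates:
    print(repr(candidate), "->", parse_address(candidate))
```

Because the look ahead does not consume input, the optional brackets and the address are still matched by the parts of the pattern that follow it; the assertion only decides whether matching is allowed to continue.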
A negative look ahead assertion, (?!pattern), says that the pattern does not match the text following the current point. For example, the email recognition pattern could be modified to ignore the noreply mailing addresses commonly used by automated systems.
The address starting with noreply does not match the pattern, since the look ahead assertion fails.
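A short sketch of a negative look ahead that rejects noreply addresses might look like this. The pattern, the accepts_address helper, and the sample addresses are assumptions for illustration, not the original listing.

```python
import re

# Hypothetical pattern, written for illustration only.
address = re.compile(
    r"""
    ^
    (?!noreply@)        # negative look ahead: fail if the text here is noreply@
    [\w.+-]+            # username
    @
    (?:[\w-]+\.)+       # domain name prefix
    (?:com|org|edu)     # allowed top-level domains (illustrative)
    $
    """,
    re.VERBOSE,
)


def accepts_address(candidate):
    """Return True if the candidate is an address the pattern accepts."""
    return address.search(candidate) is not None


for candidate in ["first.last@example.com", "noreply@example.com"]:
    print(candidate, "->", accepts_address(candidate))
```

The assertion is checked at the start of the string before the username is consumed, so the rest of the pattern is never tried for an address that begins with noreply@.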
Instead of looking ahead for noreply in the username portion of the email address, the pattern can alternatively be written using a negative look behind assertion after the username is matched using the syntax (?<!pattern).
Looking backward works a little differently than looking ahead, in that the expression must use a fixed-length pattern. Repetitions are allowed, as long as there is a fixed number of them (no wildcards or ranges).
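The sketch below illustrates both points: a negative look behind placed after the username, and the fixed-length restriction. The pattern and the accepts_address and compiles helpers are hypothetical names introduced for this example.

```python
import re

# Hypothetical pattern, written for illustration only.
address = re.compile(
    r"""
    ^
    [\w.+-]+            # username
    (?<!noreply)        # negative look behind: username must not end in noreply
    @
    (?:[\w-]+\.)+       # domain name prefix
    (?:com|org|edu)     # allowed top-level domains (illustrative)
    $
    """,
    re.VERBOSE,
)


def accepts_address(candidate):
    """Return True if the candidate is an address the pattern accepts."""
    return address.search(candidate) is not None


def compiles(pattern):
    """Return True if re can compile the pattern, False if it raises re.error."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


print(accepts_address("first.last@example.com"))
print(accepts_address("noreply@example.com"))

# A fixed repetition count is allowed inside a look behind ...
print(compiles(r"(?<!a{3})b"))
# ... but open-ended or ranged repetitions are rejected at compile time.
print(compiles(r"(?<!a+)b"))
print(compiles(r"(?<!a{1,3})b"))
```

Note that this version checks the text immediately before the @, so it also rejects any username that merely ends in noreply, which is slightly stricter than the look ahead form.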
A positive look behind assertion can be used to find text following a pattern using the syntax (?<=pattern).
In the following example, the expression finds Twitter handles.
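A minimal sketch of such an expression is shown here. The sample text, the handles in it, and the find_handles helper are made up for illustration, and \w+ is only an approximation of the characters a handle may contain.

```python
import re

# Positive look behind: match word characters only when an @ comes right before.
twitter = re.compile(r"(?<=@)\w+")


def find_handles(text):
    """Return the handles in text, without the leading @."""
    return twitter.findall(text)


sample = "Thanks to @alice and @bob_42 for the review."
print(find_handles(sample))  # ['alice', 'bob_42']
```

The @ is required for a match but is not part of the result, because the look behind does not consume any input.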
The pattern matches sequences of characters that can make up a Twitter handle, as long as they are preceded by an @.