Delving Deeper

The subject of regular expressions is quite deep, and it takes an immense amount of practice to get used to the special character syntax. Furthermore, the re module contains a vast set of methods available for performing searches using regular expressions. Upon completing the examples in this section, you should have a much deeper appreciation for how powerful regular expressions can be.

Regular Expressions

Modifying Strings with Patterns

In addition to searching through text, re supports modifying text using regular expressions as the search mechanism, and the replacements can reference groups matched in the pattern as part of the substitution text. Use sub() to replace all occurrences of a pattern with another string.

# re_sub.py

import re

bold = re.compile(r'\*{2}(.*?)\*{2}')

text = 'Make this **bold**.  This **too**.'

print('Text:', text)
print('Bold:', bold.sub(r'\1', text))

References to the text matched by the pattern can be inserted using the \num syntax used for back-references.

$ python3 re_sub.py

Text: Make this **bold**.  This **too**.
Bold: Make this bold.  This too.

To use named groups in the substitution, use the syntax \g<name>.

# re_sub_named_groups.py

import re

bold = re.compile(r'\*{2}(?P.*?)\*{2}')

text = 'Make this **bold**.  This **too**.'

print('Text:', text)
print('Bold:', bold.sub(r'\g', text))

The \g<name> syntax also works with numbered references, and using it eliminates any ambiguity between group numbers and surrounding literal digits.

$ python3 re_sub_named_groups.py

Text: Make this **bold**.  This **too**.
Bold: Make this bold.  This too.

Pass a value to count to limit the number of substitutions performed.

# re_sub_count.py

import re

bold = re.compile(r'\*{2}(.*?)\*{2}')

text = 'Make this **bold**.  This **too**.'

print('Text:', text)
print('Bold:', bold.sub(r'\1', text, count=1))

Only the first substitution is made because count is 1.

$ python3 re_sub_count.py

Text: Make this **bold**.  This **too**.
Bold: Make this bold.  This **too**.

# re_subn.py

import re

bold = re.compile(r'\*{2}(.*?)\*{2}')

text = 'Make this **bold**.  This **too**.'

print('Text:', text)
print('Bold:', bold.subn(r'\1', text)).', 2)

The search pattern matches twice in the example.

$ python3 re_subn.py

Text: Make this **bold**.  This **too**.
Bold: ('Make this bold.  This too.', 2)