Delving Deeper
The subject of regular expressions is quite deep, and it takes an immense amount of practice to get used to the special character syntax. Furthermore, the re module contains a vast set of methods available for performing searches using regular expressions. Upon completing the examples in this section, you should have a much deeper appreciation for how powerful regular expressions can be.
Regular Expressions
Modifying Strings with Patterns
In addition to searching through text, re supports modifying text using regular expressions as the search mechanism, and the replacements can reference groups matched in the pattern as part of the substitution text. Use sub() to replace all occurrences of a pattern with another string.
    # re_sub.py
    import re

    bold = re.compile(r'\*{2}(.*?)\*{2}')

    text = 'Make this **bold**. This **too**.'

    print('Text:', text)
    print('Bold:', bold.sub(r'\1', text))
References to the text matched by the pattern can be inserted using the \num syntax used for back-references.
    $ python3 re_sub.py
    Text: Make this **bold**. This **too**.
    Bold: Make this bold. This too.
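Because the substitution text can mix literal characters with back-references, the matched text can be wrapped in new markup rather than only unwrapped. The following is a minimal sketch, assuming the goal is to turn the ** markers into HTML <b> tags; the tag choice and the variable name html are illustrative.

```python
import re

# Same pattern as re_sub.py: group 1 captures the text between the ** markers.
bold = re.compile(r'\*{2}(.*?)\*{2}')

text = 'Make this **bold**. This **too**.'

# \1 inserts the captured text; the surrounding <b>...</b> is literal.
html = bold.sub(r'<b>\1</b>', text)
print(html)
```

Only the text captured by the group survives; the ** markers are part of the match but outside the group, so they are dropped.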
To use named groups in the substitution, use the syntax \g<name>.
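    # re_sub_named_groups.py
    import re

    # The group name was lost from this listing; bold_text is used
    # here as an illustrative name so that the example runs.
    bold = re.compile(r'\*{2}(?P<bold_text>.*?)\*{2}')

    text = 'Make this **bold**. This **too**.'

    print('Text:', text)
    print('Bold:', bold.sub(r'\g<bold_text>', text))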
The \g<name> syntax also works with numbered references, and using it eliminates any ambiguity between group numbers and surrounding literal digits.
    $ python3 re_sub_named_groups.py
    Text: Make this **bold**. This **too**.
    Bold: Make this bold. This too.
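The ambiguity between group numbers and literal digits is easiest to see when a digit has to follow a back-reference directly. The sketch below uses a hypothetical pattern and input (digits, 'item 7') that are not part of the examples above; it appends a literal 0 after group 1.

```python
import re

# Group 1 captures a run of digits.
digits = re.compile(r'(\d+)')

text = 'item 7'

# \g<1>0 means "group 1 followed by a literal 0".
padded = digits.sub(r'\g<1>0', text)
print(padded)  # item 70

# r'\10' is read as a reference to group 10, which this
# pattern does not define, so re raises an error instead.
failed = False
try:
    digits.sub(r'\10', text)
except re.error as err:
    failed = True
    print('error:', err)
```

Writing the reference as \g<1> makes the end of the group number explicit, so the digit that follows is treated as ordinary text.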
Pass a value to count to limit the number of substitutions performed.
    # re_sub_count.py
    import re

    bold = re.compile(r'\*{2}(.*?)\*{2}')

    text = 'Make this **bold**. This **too**.'

    print('Text:', text)
    print('Bold:', bold.sub(r'\1', text, count=1))
Only the first substitution is made because count is 1.
    $ python3 re_sub_count.py
    Text: Make this **bold**. This **too**.
    Bold: Make this bold. This **too**.
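Substitutions are made from left to right, so count=N replaces the first N matches. A count of 0, which is the default, means no limit. A short sketch with a hypothetical three-match input:

```python
import re

bold = re.compile(r'\*{2}(.*?)\*{2}')

# Illustrative input with three matches.
text = '**one** **two** **three**'

# Replace only the two leftmost matches.
first_two = bold.sub(r'\1', text, count=2)
print(first_two)

# count=0 is the same as omitting count: every match is replaced.
everything = bold.sub(r'\1', text, count=0)
print(everything)
```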
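subn() works just like sub(), except that it returns both the modified string and the count of substitutions made.

    # re_subn.py
    import re

    bold = re.compile(r'\*{2}(.*?)\*{2}')

    text = 'Make this **bold**. This **too**.'

    print('Text:', text)
    print('Bold:', bold.subn(r'\1', text))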
The search pattern matches twice in the example.
    $ python3 re_subn.py
    Text: Make this **bold**. This **too**.
    Bold: ('Make this bold. This too.', 2)
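Since subn() returns a two-item tuple, it is usually unpacked so that the count can drive later logic, such as reporting whether anything changed. The helper strip_bold() below is a hypothetical name introduced for illustration.

```python
import re

bold = re.compile(r'\*{2}(.*?)\*{2}')


def strip_bold(text):
    """Return (text without ** markers, number of replacements made)."""
    new_text, n = bold.subn(r'\1', text)
    return new_text, n


for sample in ['Make this **bold**. This **too**.', 'Nothing to change.']:
    cleaned, n = strip_bold(sample)
    if n:
        print(f'{n} replacement(s): {cleaned}')
    else:
        print(f'unchanged: {cleaned}')
```

When the pattern does not match, the original string is returned along with a count of 0, so the count is a cheap way to detect a no-op.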