Regular Expressions

import re

substituting

txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt)    # x is now "The9rain9in9Spain"; the raw string keeps \s from being read as a string escape
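
By default re.sub replaces every non-overlapping match. A minimal sketch of two common variations, using the count argument of re.sub and the + quantifier; the names first_only and collapsed and the second input string are only for illustration:

```python
import re

txt = "The rain in Spain"

# count=1 stops after the first match; the default (count=0) replaces all matches
first_only = re.sub(r"\s", "9", txt, count=1)   # "The9rain in Spain"

# \s+ treats a whole run of whitespace as one match, so each run is replaced once
collapsed = re.sub(r"\s+", "9", "The  rain   in Spain")   # "The9rain9in9Spain"
```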

substituting with groups

txt = "01a02b03c"
x = re.sub("([a-z])","#\\1",txt)  # x is now "01#a02#b03#c"

Things to note with this

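  1. the parentheses around [a-z] define a capturing group; [a-z] itself is a character set that matches one lowercase letter
  2. \\1 inserts the text matched by the first group into the replacement string. Note the double backslash, which is needed in a normal string; this could also have been written as the raw string r"#\1"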
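
The different ways of writing the backreference can be compared side by side. This is a small sketch rather than anything definitive; the variable names are made up for illustration, and \g<1> is the explicit group-reference syntax that re.sub also accepts:

```python
import re

txt = "01a02b03c"

escaped = re.sub("([a-z])", "#\\1", txt)       # doubled backslash in a normal string
raw = re.sub("([a-z])", r"#\1", txt)           # same replacement as a raw string
explicit = re.sub("([a-z])", r"#\g<1>", txt)   # explicit \g<1> form

# All three give "01#a02#b03#c".

# \g<1> is needed when a digit follows the reference,
# because r"\10" would be read as group 10 rather than group 1 then "0"
followed = re.sub("([a-z])", r"\g<1>0", txt)   # "01a002b003c0"
```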

matching

txt = "NameD210126T10201130a01b02c03"
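x = re.match(r"(.*)D([0-9]{6})T([0-9]*)(.*)", txt)   # the string to search is the second argument
# groups() is a method and returns a tuple; x.group(1) is the same as x.groups()[0]
# x.groups()[0] -> "Name"
# x.groups()[1] -> "210126"
# x.groups()[2] -> "10201130"
# x.groups()[3] -> "a01b02c03"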
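
re.match returns None when the pattern does not match at the start of the string, so calling groups() on the result without checking raises an AttributeError. A minimal sketch of guarding against that; the helper name split_name and the second test string are hypothetical:

```python
import re

pattern = r"(.*)D([0-9]{6})T([0-9]*)(.*)"

def split_name(txt):
    """Return the four captured parts as a tuple, or None if txt does not match."""
    m = re.match(pattern, txt)
    if m is None:
        return None
    return m.groups()

parts = split_name("NameD210126T10201130a01b02c03")
# parts is ("Name", "210126", "10201130", "a01b02c03")

missing = split_name("no date stamp here")
# missing is None, because there is no D followed by six digits and a T
```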