Extract All Regex Matches With Python

import re

def extractImages(filename):
    imgReg = re.compile("../assets/(.*jpg|.*png)")
    with open(filename, mode="rt", encoding="utf-8") as docFile:
        doc = docFile.read()
    images = re.findall(imgReg, doc)
    return ["./assets/" + img for img in images]

# later used in e.g. [os.remove(img) for img in extractImages(filename)]
# above deletes all images located in ./assets/<filename>.jpg|png
  1. Define the regex and compile it into a pattern object with reg = re.compile(<regex>). The pattern needs a capture group around the part you want extracted. E.g. with http://pavol.kutaj.com/assets/(.*jpg|.*png), only the image filename is returned, not the http://pavol.kutaj.com/assets/ location.
  2. Open the file using the statement with open(...) as alias:
  3. Read the content of the file into a string with inputObj = alias.read()
  4. Collect the list of matches with matches = re.findall(reg, inputObj). The pattern comes first, the string to search second.
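
One thing worth checking with the steps above is how the greedy .* behaves when a single line references more than one image. The sketch below is only an illustration: the sample text and file names are made up, and the stricter pattern (escaped dots, lazy quantifier, explicit extension) is an assumed alternative, not the pattern used in the function above.

```python
import re

# Hypothetical sample text; the file names are invented for illustration.
doc = (
    "![first](../assets/img-001.png) and ![second](../assets/img-002.png)\n"
    "![third](../assets/img-003.jpg)\n"
)

# Pattern from the article: the greedy .* runs to the last "png" on the line,
# so two images on one line are returned as a single match.
greedy = re.compile(r"../assets/(.*jpg|.*png)")

# Assumed stricter variant: literal dots are escaped and .*? stops at the
# first ".jpg" or ".png" it meets.
lazy = re.compile(r"\.\./assets/(.*?\.(?:jpg|png))")

# With exactly one capture group, findall returns the group, not the whole match.
greedy_matches = greedy.findall(doc)
lazy_matches = lazy.findall(doc)

print(greedy_matches)
print(lazy_matches)
```

If each line of your file holds at most one image, the original pattern is enough. The lazy variant is probably the safer choice when images can share a line.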

Pavol Kutaj

Infrastructure Support Engineer/Technical Writer (Snowplow Analytics) with a passion for Python/writing documentation. More about me: https://pavol.kutaj.com