Python - regex example
About
NOTE: This page is a daughter page of: Python
Here are some simple examples of how to use regular expressions (regex) in Python to find pattern matches.
Matching Strings in Python Using Regex
A simple regex example that looks for matches in a string.
simple_regex.py:
#!/usr/bin/env python
# Basic script to do some regular expression matching
# for text inside a string.
import re


def main():
    haystack = 'c://1.png c://2.jpg c://3.png'
    # *? = non-greedy repeat, \b = word boundary, () = capture group,
    # \. = literal dot. The raw string (r'...') keeps the backslashes intact.
    regex = r'\b([^ ]*?)\.png'
    matches = re.findall(regex, haystack)
    if matches:
        print(matches)  # Prints: ['c://1', 'c://3']


if __name__ == '__main__':
    main()
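To see why the non-greedy repeat and the escaped dot matter, the short sketch below runs a few variations of the pattern against the same haystack. The variable names (lazy, greedy, loose, strict) and the test string 'c://1xpng' are made up for illustration and are not part of the script above.

```python
import re

haystack = 'c://1.png c://2.jpg c://3.png'

# Non-greedy group limited to non-space characters:
# each match stops at the first ".png" it can reach.
lazy = re.findall(r'\b([^ ]*?)\.png', haystack)

# Greedy group using "." (any character): it runs on to the LAST ".png".
greedy = re.findall(r'\b(.*)\.png', haystack)

# An unescaped "." matches any character, so "xpng" is accepted too.
loose = re.findall(r'\b([^ ]*?).png', 'c://1xpng')
# An escaped "\." only matches a literal dot, so nothing is found here.
strict = re.findall(r'\b([^ ]*?)\.png', 'c://1xpng')

print(lazy)    # ['c://1', 'c://3']
print(greedy)  # ['c://1.png c://2.jpg c://3']
print(loose)   # ['c://1']
print(strict)  # []
```

The greedy version returns one long match because ".*" swallows the whole string and then backtracks only as far as the last ".png".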
Matching Strings from a File in Python Using Regex
Larger regex example, looking for matches in a file.
regex_facebook_friend_finder.py:
#!/usr/bin/env python
# Does some regular expression matching for text inside a
# file and writes unique answers out to another file.
# In this example we want to isolate the unique Facebook ids
# of our friends from a webpage.
# From: "https://www.facebook.com/felix.borgmann?fref=p"
# We want: "felix.borgmann"
import re


def main():
    # Open file and get contents:
    with open('facebook_friends.html', 'r') as file_in:
        haystack = file_in.read()

    # Perform regex to get all matches. Raw string, escaped dots,
    # and a non-greedy group so each match stops at the first "?fref":
    regex = r'href="https://www\.facebook\.com/(.{1,50}?)\?fref'
    matches = re.findall(regex, haystack)
    if not matches:
        print('no matches')
        return

    # Eliminate duplicates and sort alphabetically:
    facebook_friend_ids = sorted(set(matches))

    # Write answers out to file:
    with open('facebook_friends_ids.txt', 'w') as file_out:
        for friend_id in facebook_friend_ids:
            file_out.write(friend_id + '\n')
    # Will write out 'felix.borgmann' and 'jenny.noski'


if __name__ == '__main__':
    main()
facebook_friends.html:
<a href="https://www.facebook.com/jenny.noski?fref=pb">
<a href="https://www.someotherurl.com/">
<a href="https://www.facebook.com/felix.borgmann?fref=pb">
<a href="https://www.facebook.com/jenny.noski?fref=pb">
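As a quick check that needs no files on disk, the sketch below applies the same kind of pattern to the sample lines from facebook_friends.html held in a string. The names html, pattern and friend_ids are illustrative only, and the pattern is written as a raw string with escaped dots, which is an assumption about how the regex is best spelled rather than a requirement.

```python
import re

# Sample lines copied from facebook_friends.html.
html = '''<a href="https://www.facebook.com/jenny.noski?fref=pb">
<a href="https://www.someotherurl.com/">
<a href="https://www.facebook.com/felix.borgmann?fref=pb">
<a href="https://www.facebook.com/jenny.noski?fref=pb">
'''

# Compile once; the group captures the id between the host and "?fref".
pattern = re.compile(r'href="https://www\.facebook\.com/(.{1,50}?)\?fref')

# set() drops the duplicate jenny.noski entry, sorted() orders the rest.
friend_ids = sorted(set(pattern.findall(html)))

print(friend_ids)  # ['felix.borgmann', 'jenny.noski']
```

The someotherurl.com link is skipped because it does not match the facebook.com prefix in the pattern.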