Python - regex example

From NoskeWiki

About

NOTE: This page is a daughter page of: Python


Here are some simple examples of how to use regular expressions (regex) in Python to find pattern matches.


Matching Strings in Python Using Regex

Simple regex example looking for matches in a string.

simple_regex.py:

#!/usr/bin/env python
# Basic script to do some regular expression matching
# for text inside a string.

import re

def main():
  haystack = 'c://1.png c://2.jpg c://3.png'
  # ? = non-greedy repeat, \b = word boundary, () = group, \. = literal dot
  regex = r'\b([^ ]*?)\.png'
  matches = re.findall(regex, haystack)
  if matches:
    print(matches)  # Prints: ['c://1', 'c://3']

if __name__ == '__main__':
  main()
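
To see why the non-greedy `?` and the `[^ ]` character class matter in the script above, here is a small side-by-side sketch. The second pattern is not from the original script; it is a deliberately greedy variant added only for comparison.

```python
import re

haystack = 'c://1.png c://2.jpg c://3.png'

# Pattern from the script above: a non-greedy repeat of non-space characters,
# so each match stays inside one space-separated path.
lazy = re.findall(r'\b([^ ]*?)\.png', haystack)

# Comparison pattern (an assumption for illustration only): a greedy '.*'
# can cross spaces, so the group runs up to the LAST '.png' in the string.
greedy = re.findall(r'\b(.*)\.png', haystack)

print(lazy)    # Prints: ['c://1', 'c://3']
print(greedy)  # Prints: ['c://1.png c://2.jpg c://3']
```

The greedy version returns a single, much longer match, which is usually not what you want when pulling several items out of one string.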


Matching Strings from a File in Python Using Regex

Larger regex example, looking for matches in a file.

regex_facebook_friend_finder.py:

#!/usr/bin/env python
# Does some regular expression matching for text inside a
# file and writes unique answers out to another file.
# In this example we want to isolate the unique Facebook ids
# of our friends from a webpage....
#   From:    "https://www.facebook.com/felix.borgmann?fref=pb"
#   We want: "felix.borgmann"

import re

def main():
  # Open file and get contents:
  with open('facebook_friends.html', 'r') as file_in:
    haystack = file_in.read()

  # Perform regex to get all matches:
  regex = r'href="https://www\.facebook\.com/(.{1,50})\?fref'
  matches = re.findall(regex, haystack)
  if not matches:
    print('no matches')
    return

  # Eliminate duplicates and sort alphabetically:
  facebook_friend_ids = sorted(set(matches))  # Only want unique ones.

  # Write answers out to file:
  with open('facebook_friends_ids.txt', 'w') as file_out:
    for friend_id in facebook_friend_ids:
      file_out.write(friend_id + '\n')
  # Will write out 'felix.borgmann' and 'jenny.noski'

if __name__ == '__main__':
  main()


facebook_friends.html:

<a href="https://www.facebook.com/jenny.noski?fref=pb">
<a href="https://www.someotherurl.com/">
<a href="https://www.facebook.com/felix.borgmann?fref=pb">
<a href="https://www.facebook.com/jenny.noski?fref=pb">
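
If you want to check the pattern without creating any files, the sketch below runs the same idea against the sample HTML held in a string. The helper name `extract_friend_ids` is hypothetical, and the character class `[^?"]` is a swapped-in alternative to `.{1,50}` (it stops the id at the first `?` or `"`), not the pattern used in the script above.

```python
import re

# Sample content copied from facebook_friends.html above.
sample_html = '''<a href="https://www.facebook.com/jenny.noski?fref=pb">
<a href="https://www.someotherurl.com/">
<a href="https://www.facebook.com/felix.borgmann?fref=pb">
<a href="https://www.facebook.com/jenny.noski?fref=pb">
'''

def extract_friend_ids(html):
  """Hypothetical helper: return unique Facebook ids in html, sorted."""
  # [^?"] keeps the captured id from running past the '?' or the closing quote.
  pattern = re.compile(r'href="https://www\.facebook\.com/([^?"]{1,50})\?fref')
  return sorted(set(pattern.findall(html)))

friend_ids = extract_friend_ids(sample_html)
print(friend_ids)  # Prints: ['felix.borgmann', 'jenny.noski']
```

The duplicate `jenny.noski` link collapses to one entry because of the `set`, and the non-Facebook URL is skipped because it never matches the pattern.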


Links