Python regex


Regular expressions are essentially a small, highly specialized programming language embedded inside Python. They let you specify the rules for the set of possible strings that you want to match.

In Python, regular expressions are provided by the re module. An overview of the grammar is at the bottom of this page.

The Match function


The match function checks for a match only at the beginning of the string. It is defined as:

re.match(pattern, string)

The parameters are:

Parameter  Description
pattern    a regular expression
string     the input string

If you want to check that a string is a sequence of exactly five digits, you can use this code:

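#!/usr/bin/env python3
import re

# Read a line of text from the user (Python 3; on Python 2 use raw_input()).
text = input("Enter an input string: ")

# Exactly five digits; \Z anchors the match at the end of the string.
m = re.match(r'\d{5}\Z', text)

if m:
    print("True")
else:
    print("False")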
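
To see what the \Z anchor contributes, the check can be wrapped in a small function and compared with the same pattern without the anchor. This is only a sketch: the helper names is_five_digits and starts_with_five_digits are made up for illustration and are not part of the re module.

```python
import re

# Hypothetical helper names, chosen for this illustration only.
FIVE_DIGITS = r'\d{5}\Z'        # the pattern from the example above
FIVE_DIGITS_PREFIX = r'\d{5}'   # the same pattern without the \Z anchor


def is_five_digits(text):
    """Return True if text consists of exactly five digits."""
    return re.match(FIVE_DIGITS, text) is not None


def starts_with_five_digits(text):
    """Return True if text merely begins with five digits."""
    return re.match(FIVE_DIGITS_PREFIX, text) is not None


for sample in ["12345", "123", "123K5", "5555555"]:
    print(sample, is_five_digits(sample), starts_with_five_digits(sample))
```

Without \Z, the string 5555555 is accepted because re.match only requires the pattern to match at the start of the string; it does not require the pattern to consume the whole string.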

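Example outputs:

String   Match
12345    True
12358    True
55555    True
123      False
123K5    False
5555555  False

Email validation regex


We can use the same function to validate an e-mail address. The grammar rules are described in the documentation for re.compile and in the grammar table at the bottom of this page. The pattern used here is a loose check: it only requires some text, an @ sign, more text, a dot, and more text.
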
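#!/usr/bin/env python3
import re

# Read a line of text from the user (Python 3; on Python 2 use raw_input()).
text = input("Enter an input string: ")

# Loose shape check: something@something.something
m = re.match(r'[^@]+@[^@]+\.[^@]+', text)

if m:
    print("True")
else:
    print("False")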
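
Because re.match only anchors the pattern at the start of the string, input with trailing junk can still pass this check. One possible way to tighten it, sketched below, is to compile the pattern once with re.compile and require the whole string to match with fullmatch (available since Python 3.4). The function name looks_like_email and the sample addresses are assumptions made for illustration.

```python
import re

# The loose e-mail pattern from the example above, compiled once for reuse.
EMAIL = re.compile(r'[^@]+@[^@]+\.[^@]+')


def looks_like_email(text):
    """Loose check: the whole string must look like something@something.something."""
    return EMAIL.fullmatch(text) is not None


# Placeholder addresses on example.com, used only for illustration.
samples = ["user@example.com", "user@example", "user@example.com@extra"]
for s in samples:
    # First column: prefix match (re.match); second column: whole-string match.
    print(s, EMAIL.match(s) is not None, looks_like_email(s))
```

The last sample passes the prefix match but fails the whole-string match. Even with fullmatch this remains a shape check only; it does not establish that an address is valid or deliverable.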

The Search Function


The search function scans through the string for the first location where the pattern matches. It is defined as:

re.search(pattern, string)

The parameters are:

Parameter  Description
pattern    a regular expression, defines the string to be searched for
string     the search space

To check whether a string contains an e-mail address:

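#!/usr/bin/env python3
import re

# The address below is a placeholder; the original one was obscured on the page.
text = "Contact me by someone@example.com or at the office."

# search() scans the whole string instead of only its start.
m = re.search(r'[^@]+@[^@]+\.[^@]+', text)

if m:
    print("String found.")
else:
    print("Nothing found.")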
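
The match object returned by re.search also tells you what was found and where. The sketch below uses a placeholder sentence and address, plus an assumed tighter variant of the pattern that excludes whitespace; neither is taken from the original example.

```python
import re

# Placeholder sentence and address (example.com), chosen for illustration.
text = "Contact me by someone@example.com or at the office."

loose = r'[^@]+@[^@]+\.[^@]+'          # the pattern from the example above
tight = r'[^@\s]+@[^@\s]+\.[^@\s]+'    # assumed variant: also excludes whitespace

# The loose pattern allows spaces, so its match covers the whole sentence.
loose_hit = re.search(loose, text)
print(loose_hit.group() == text)

# The tighter pattern cannot cross spaces, so only the address is matched.
found = re.search(tight, text)
if found:
    print(found.group())   # the matched substring
    print(found.span())    # (start, end) positions within text

# re.match only tries position 0, where no address starts.
print(re.match(tight, text))
```

This shows the practical difference between the two functions: search locates the address anywhere in the sentence, while match returns None because the sentence does not begin with one.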

Regular Expression Examples


A few examples of regular expressions:

Example             Regex
IP address          (([2][5][0-5]\.)|([2][0-4][0-9]\.)|([0-1]?[0-9]?[0-9]\.)){3}(([2][5][0-5])|([2][0-4][0-9])|([0-1]?[0-9]?[0-9]))
Email               [^@]+@[^@]+\.[^@]+
Date MM/DD/YY       (\d+/\d+/\d+)
Integer (positive)  (?<![-.])\b[0-9]+\b(?!\.[0-9])
Integer             [+-]?(?<!\.)\b[0-9]+\b(?!\.[0-9])
Float               (?<=>)\d+.\d+|\d+
Hexadecimal         \s–([0-9a-fA-F]+)(?:–)?\s

Regular Expression Grammar


Overview of the regex grammar (the table is at the bottom of this page):

Leave a Reply:




Falcon Tue, 17 Nov 2015

Hi Frank, I want to ask something. I tried the example for the match function, but every string I entered came out "False".
I even tried your example strings and they all gave the same answer, "False".
I am using Python 3.5.0.
Did I miss something?
Thank you for your time.

Frank Fri, 20 Nov 2015

Hi Falcon, this is strange. Which strings did you try?

Falcon Mon, 07 Dec 2015

Hi Frank, sorry for the late reply. I tried all the example inputs: 12345, 12358, 55555, 123, 123K5, and 555555. They all came out "False".

Frank Tue, 08 Dec 2015

This is very strange. Is your grammar string '\d{5}\Z'?
Try to copy the code exactly, change the input to '12345', and use a Python install on another computer. Something is wrong with the configuration, I think. The code works here on 2.5, 2.7 and 3.4.

Falcon Tue, 22 Dec 2015

Thank you, Frank. It worked. I used version 2.7 and it worked. Maybe there is something in version 3.5 that made it go wrong. Anyway, thanks for your replies.

Copyright © 2015 - 2023 - Pythonspot.

Regex     Description
\d        Matches any decimal digit; this is equivalent to the class [0-9].
\D        Matches any non-digit character; this is equivalent to the class [^0-9].
\s        Matches any whitespace character; this is equivalent to the class [ \t\n\r\f\v].
\S        Matches any non-whitespace character; this is equivalent to the class [^ \t\n\r\f\v].
\w        Matches any alphanumeric character or underscore; this is equivalent to the class [a-zA-Z0-9_].
\W        Matches any character that is not alphanumeric or underscore; this is equivalent to the class [^a-zA-Z0-9_].
\Z        Matches only at the end of the string.
[..]      Matches a single character in the brackets.
[^..]     Matches any single character not in the brackets.
.         Matches any character except a newline.
$         Matches at the end of the string, or just before a newline at the end of the string.
*         Matches 0 or more repetitions of the preceding RE.
+         Matches 1 or more repetitions of the preceding RE.
{m}       Exactly m copies of the preceding RE should be matched.
|         A|B matches either A or B.
?         Matches 0 or 1 repetitions of the preceding RE.
[a-z]     Any lowercase ASCII letter.
[A-Z]     Any uppercase ASCII letter.
[a-zA-Z]  Any ASCII letter.
[0-9]     Any ASCII digit.

Note: the equivalences given for \d, \s and \w (and their negations) hold for bytes patterns or when the re.ASCII flag is used. For Python 3 string patterns these classes also match the corresponding Unicode characters.
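
A few of the grammar elements from the table can be tried out with re.findall and re.match. The sample strings and variable names below are made up for illustration.

```python
import re

digit_runs = re.findall(r'\d+', "a1b22c333")             # + : one or more digits
triples = re.findall(r'\d{3}', "1234567")                # {m} : exactly three digits
capitalized = re.findall(r'[A-Z][a-z]*', "Hello World")  # classes with * repetition
pets = re.findall(r'cat|dog', "cat, dog, bird")          # | : alternation
spellings = re.findall(r'colou?r', "color colour")       # ? : optional character
words = re.findall(r'\w+', "snake_case and CamelCase!")  # \w includes the underscore

# $ also matches just before a trailing newline; \Z matches only at the very end.
dollar_ok = re.match(r'\d{5}$', "12345\n") is not None
z_ok = re.match(r'\d{5}\Z', "12345\n") is not None

for result in (digit_runs, triples, capitalized, pets, spellings, words, dollar_ok, z_ok):
    print(result)
```

The last two lines show why the five-digit example uses \Z rather than $: with $, an input ending in a newline would still be accepted.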