Ever felt like searching for a word in a paragraph was tougher than finding your slippers during a power cut? That’s where Python Regular Expressions come in. Regular expressions, also called RegEx, are tools to match patterns in text. You can think of them as CTRL+F on steroids!
They don’t just search; they recognize complex patterns and help extract, replace, or validate content. In India, whether you’re filtering Aadhaar numbers, checking mobile numbers, or validating email formats for job applications, regex saves time and effort. Let’s walk through the mystical yet practical world of regex in Python.
A Python regular expression is a sequence of characters that defines a search pattern. These patterns are used with the re module to match or compare strings, find substrings, or even replace them. They come in handy for validating inputs, scraping data, and many day-to-day automation tasks.
Python includes a built-in module named re for working with regular expressions. Here’s the syntax to import it:
import re
Explanation: Before using any regular expressions, we need to import the re module. It contains all the handy functions like match(), search(), findall(), and more that help us work with regex. Without this import, calling re.match() or any other regex function raises a NameError.
Here are some of the common Python RegEx functions:
Function | Description |
re.match() | Matches a pattern at the beginning of the string |
re.search() | Searches the entire string for a pattern |
re.findall() | Returns a list of all matches |
re.sub() | Replaces the pattern with a new string |
Let’s break these down with examples:
import re
result = re.match("Ram", "Ram went to school")
print(result)
Output:
<re.Match object; span=(0, 3), match='Ram'>
Explanation: The match() function checks whether the string starts with "Ram". Since it does, we get a match object. If it started with something else, it would return None. It’s like checking if your train ticket has today’s date—if not, you’re not boarding. This is efficient when you want to check headers, titles, or specific starting points in text.
import re
result = re.search("Delhi", "Mumbai to Delhi via train")
print(result)
Output:
<re.Match object; span=(10, 15), match='Delhi'>
Explanation: search() looks through the entire string to find a match, regardless of its position. Think of it like your mom searching for her specs all over the house—they’re found eventually! Ideal for checking if a keyword or phrase exists in large chunks of data.
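To see the difference between the two functions side by side, here is a minimal sketch that runs both on the same sentence. The variable names are only illustrative.

```python
import re

text = "Mumbai to Delhi via train"

# match() only looks at the start of the string, so "Delhi" is not found.
at_start = re.match("Delhi", text)

# search() scans forward and finds "Delhi" later in the string.
anywhere = re.search("Delhi", text)

print(at_start)          # None
print(anywhere.group())  # Delhi
```

Since match() returns None when nothing is found, it is safer to check the result before calling group() on it.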
import re
text = "My PIN codes are 110001 and 560034."
result = re.findall(r"\d{6}", text)
print(result)
Output:
['110001', '560034']
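Explanation: Here, \d{6} matches a run of six consecutive digits, the format of Indian PIN codes. findall() returns all non-overlapping matches in a list. Note that the pattern also matches six digits inside a longer number; wrap it in word boundaries (\b\d{6}\b) if only standalone six-digit values should count. Super useful if you’re scanning a document for all phone numbers or OTPs. It acts like a smart assistant collecting all relevant data points in one go.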
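One thing to watch for: \d{6} will also pick six digits out of a longer number. The sketch below uses a made-up sentence to show one way of handling that with the \b word-boundary anchor.

```python
import re

text = "PIN 110001, mobile 9876543210, PIN 560034."

# Without boundaries, the first six digits of the mobile number also match.
loose = re.findall(r"\d{6}", text)

# \b requires a word boundary on both sides, so only standalone
# six-digit runs are returned.
strict = re.findall(r"\b\d{6}\b", text)

print(loose)   # ['110001', '987654', '560034']
print(strict)  # ['110001', '560034']
```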
import re
text = "This is bad. That is bad too."
updated_text = re.sub("bad", "awesome", text)
print(updated_text)
Output:
This is awesome. That is awesome too.
Explanation: The sub() method replaces all instances of "bad" with "awesome". Great for editing out those ‘unwanted’ terms from user comments or tweets. You can also use it to mask sensitive info—like replacing email IDs with asterisks for privacy.
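As a rough illustration of the masking idea, the sketch below hides email IDs in a comment and also shows the optional count argument of re.sub(). The email pattern and the sample addresses are assumptions made for this example, not a complete email validator.

```python
import re

comment = "Write to ravi@example.com or priya@example.org for details."

# Assumed pattern for illustration: a simple email shape, not a full validator.
email_pattern = r"[\w.]+@[\w.]+\.\w+"

# Every match is replaced with a fixed mask.
masked = re.sub(email_pattern, "****", comment)
print(masked)  # Write to **** or **** for details.

# count=1 limits the replacement to the first match only.
first_only = re.sub("bad", "awesome", "This is bad. That is bad too.", count=1)
print(first_only)  # This is awesome. That is bad too.
```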
Here are some of the commonly used regex patterns:
Pattern | Meaning | Example Match |
\d | Any digit (0-9) | 5, 9 |
\D | Non-digit character | a, # |
\w | Word character (a-z, A-Z, 0-9, _) | A, 7, _ |
\W | Non-word character | !, @ |
\s | Whitespace character | space, tab |
\S | Non-whitespace character | a, 9, % |
. | Any character except newline | a, B, %, 1 |
^ | Starts with | ^Hello matches "Hello world" |
$ | Ends with | world$ matches "Hello world" |
import re
text = "Contact: 9876543210"
match = re.search(r"[6-9]\d{9}", text)
print(match.group())
Output:
9876543210
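Explanation: This pattern looks for a digit from 6 to 9 followed by nine more digits, 10 digits in total, which is the format of Indian mobile numbers. Because search() finds the pattern anywhere in the text, it would also match the first 10 digits of a longer number; to validate a whole input, use re.fullmatch() or anchor the pattern with ^ and $. It's commonly used by e-commerce apps during signup or OTP validation.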
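For validating a single input field rather than finding a number inside a sentence, a stricter check is usually needed. Here is a minimal sketch using re.fullmatch(); the helper name is_valid_mobile is hypothetical.

```python
import re

MOBILE = re.compile(r"[6-9]\d{9}")

def is_valid_mobile(value):
    # fullmatch() succeeds only if the entire string fits the pattern.
    return MOBILE.fullmatch(value) is not None

print(is_valid_mobile("9876543210"))   # True
print(is_valid_mobile("5876543210"))   # False: starts with 5
print(is_valid_mobile("98765432101"))  # False: 11 digits

# search() is more lenient: it accepts the first 10 digits of the longer number.
partial = re.search(r"[6-9]\d{9}", "98765432101") is not None
print(partial)  # True
```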
Grouping in Python Regular Expressions allows you to isolate and extract specific parts of a match using parentheses (). It’s like filling multiple tiffin boxes from a big pot of biryani - you don't just take the whole pot, you take only the portions you want, neatly separated.
Each set of parentheses in a regex pattern defines a capture group, and when the regex matches a string, it stores each group’s result separately. This is particularly helpful when you want to extract structured data - like separating an STD code from a phone number, or a date into day, month, and year.
These groups can then be accessed using group(n) for a single group (group() or group(0) returns the entire match) or groups() to get all captured groups as a tuple.
import re
text = "STD Code: 080, Number: 23456789"
match = re.search(r"(\d{3}), Number: (\d{8})", text)
print(match.groups())
Output:
('080', '23456789')
Explanation: Here, groups() returns a tuple of matched groups. Handy when you need to separate the area code from the number—like BSNL used to do! It gives you organized access to data, like slicing laddoos into neat pieces.
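The same idea works for splitting a date into day, month, and year. This is a small sketch with a made-up sample string, showing group() with an index alongside groups().

```python
import re

text = "Admission date: 15-08-2024"
match = re.search(r"(\d{2})-(\d{2})-(\d{4})", text)

if match:
    print(match.group(0))  # 15-08-2024 (the whole match)
    print(match.group(1))  # 15
    print(match.group(2))  # 08
    print(match.group(3))  # 2024
    # groups() returns all captured groups, ready for unpacking.
    day, month, year = match.groups()
```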
Regex flags modify how patterns behave. They’re like extra filters added to your sunglasses - changing how you see the string.
Flag | Description |
re.IGNORECASE | Ignores case differences (A == a) |
re.MULTILINE | ^ and $ match start/end of each line |
re.DOTALL | Makes . match newline characters as well |
import re
text = "Welcome to Delhi"
match = re.search("delhi", text, re.IGNORECASE)
print(match.group())
Output:
Delhi
Explanation: Without the IGNORECASE flag, "delhi" wouldn’t match "Delhi". This flag is useful when dealing with user input in different capitalizations, like names or cities.
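The other two flags from the table can be tried out in the same way. The sketch below uses an illustrative three-line string to compare the results with and without re.MULTILINE and re.DOTALL.

```python
import re

text = "Delhi is the capital\nDelhi has a metro\nMumbai has local trains"

# Without MULTILINE, ^ matches only at the very start of the string.
default_starts = re.findall(r"^Delhi", text)
# With MULTILINE, ^ also matches right after each newline.
multiline_starts = re.findall(r"^Delhi", text, re.MULTILINE)
print(default_starts)    # ['Delhi']
print(multiline_starts)  # ['Delhi', 'Delhi']

# By default . stops at a newline, so this pattern cannot span two lines.
no_dotall = re.search(r"capital.*metro", text)
# With DOTALL, . matches newlines too.
with_dotall = re.search(r"capital.*metro", text, re.DOTALL)
print(no_dotall)                # None
print(with_dotall is not None)  # True
```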
What You Want to Match | Regex Pattern |
Indian Mobile Number | [6-9]\d{9} |
PAN Card | [A-Z]{5}[0-9]{4}[A-Z] |
PIN Code | \d{6} |
Email | \w+@\w+\.\w{2,3} |
Vehicle Number (MH12 XY 1234) | [A-Z]{2}\d{2} [A-Z]{2} \d{4} |
IFSC Code | [A-Z]{4}0[A-Z0-9]{6} |
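A few of the patterns from the table can be checked against sample values as sketched below. The sample values are invented for illustration, and re.fullmatch() is used so that the whole value must fit the pattern.

```python
import re

# Patterns copied from the table above.
PATTERNS = {
    "PAN": r"[A-Z]{5}[0-9]{4}[A-Z]",
    "IFSC": r"[A-Z]{4}0[A-Z0-9]{6}",
    "Vehicle": r"[A-Z]{2}\d{2} [A-Z]{2} \d{4}",
}

# Hypothetical sample values, not real identifiers.
samples = {
    "PAN": "ABCDE1234F",
    "IFSC": "ABCD0123456",
    "Vehicle": "MH12 XY 1234",
}

results = {}
for kind, value in samples.items():
    results[kind] = re.fullmatch(PATTERNS[kind], value) is not None
    print(kind, value, results[kind])

# Lowercase letters do not fit the PAN pattern.
lowercase_pan_ok = re.fullmatch(PATTERNS["PAN"], "abcde1234f") is not None
print(lowercase_pan_ok)  # False
```

These patterns only check the shape of a value; they do not confirm that a PAN, IFSC code, or vehicle number actually exists.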
Here are some of the common mistakes to avoid:
Note:
import re
print(re.search(r"[A-Z]{5}[0-9]{4}[A-Z]", "ABCDE1234F"))
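Explanation: This pattern matches the Indian PAN format: 5 uppercase letters, 4 digits, and 1 uppercase letter. Your code’s version of KYC! Because re.search() also succeeds when the pattern appears inside a longer string, use re.fullmatch() when validating a whole field. It’s essential for financial applications or backend KYC workflows.
Here are some real-world use cases of Python regular expressions: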
Python Regular Expressions are incredibly powerful tools for text processing. With the re module, you can search, match, and manipulate strings with precision and speed. From checking the format of a PIN code to scraping phone numbers from a messy webpage, regex proves its worth in countless real-world applications.
Sure, regex might seem intimidating at first—kind of like deciphering your grandmother’s secret masala blend—but once you understand the patterns, you’re on your way to automating everything from form validations to data cleanup. It’s like giving your code a sixth sense for spotting order in chaos.
Start small, test often, and soon you’ll be writing regex like a jugaadu pro! Next time you’re tangled in a text problem, think regex—because even the messiest data has a pattern. And hey, for practice, try building a form validator for a school admission form. Validate name, mobile number, email, and PIN code.
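As a starting point for that practice exercise, here is one possible sketch. The rules in RULES, the field names, and the validate_form helper are all assumptions made for illustration; a real admission form would likely need stricter checks.

```python
import re

# Assumed rules for illustration; adjust them to the actual form.
RULES = {
    "name": r"[A-Za-z ]+",
    "mobile": r"[6-9]\d{9}",
    "email": r"\w+@\w+\.\w{2,3}",
    "pin": r"\d{6}",
}

def validate_form(form):
    """Return the names of fields that are missing or do not match their rule."""
    errors = []
    for field, pattern in RULES.items():
        value = form.get(field, "")
        # fullmatch() requires the whole value to fit the pattern.
        if re.fullmatch(pattern, value) is None:
            errors.append(field)
    return errors

good = {"name": "Asha Verma", "mobile": "9876543210",
        "email": "asha@example.com", "pin": "110001"}
bad = {"name": "Asha123", "mobile": "12345",
       "email": "asha.example.com", "pin": "1100"}

print(validate_form(good))  # []
print(validate_form(bad))   # ['name', 'mobile', 'email', 'pin']
```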
Python Regular Expressions are used for pattern matching in strings. They help validate input data like emails, PIN codes, or phone numbers, extract specific patterns, and clean or modify large datasets efficiently.
The re module in Python supports all regular expression operations. You must import it before using functions like match(), search(), or findall() to perform regex-related tasks in your programs.
match() checks for a pattern only at the beginning of a string, while search() scans the entire string for a match. If the pattern appears later, match() will return None, but search() can still succeed.
Yes, regular expressions are ideal for validating PAN numbers using the pattern [A-Z]{5}[0-9]{4}[A-Z]. It ensures correct format: five letters, four digits, and one letter—common for income tax forms in India.
Use re.findall() to extract all matching values from a string. It returns a list of results. This is helpful when scanning for multiple phone numbers, PIN codes, or dates in a single document.
Yes, regex in Python is case-sensitive unless you use the re.IGNORECASE flag. This allows you to match uppercase and lowercase versions of text like names, cities, or states, regardless of user input format.
You can use re.sub() to find and replace patterns within a string. It’s useful for cleaning up dirty data, replacing slang in text, or masking sensitive information like Aadhaar or email addresses.
The \d pattern matches any digit from 0 to 9. It's used to locate numeric values like mobile numbers, OTPs, or invoice IDs. If you want 6-digit PIN codes, use \d{6} instead.
Yes, use [6-9]\d{9} to validate 10-digit Indian mobile numbers starting with digits 6 to 9. Apply it with re.fullmatch(), or anchor it with ^ and $, so that the whole input must fit the pattern; this rejects inputs that are too short, too long, or start with any other digit.
Groups are defined using parentheses () and are used to extract sub-patterns from a match. They're helpful when splitting a phone number into STD code and number, or separating date, month, and year fields.
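Raw strings (r"") prevent Python from interpreting backslashes in your pattern before the regex engine sees them. Without a raw string, "\b" becomes a backspace character instead of a word boundary, and unrecognized escapes such as "\d" trigger a warning in recent Python versions, leading to subtle bugs in pattern matching.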
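A quick way to see the effect is to run the same pattern with and without the r prefix. The sketch below uses the \b word boundary on an illustrative string.

```python
import re

text = "cat concatenate cat"

# In a normal string, "\b" is the backspace character,
# so this pattern looks for backspace + "cat" + backspace.
plain = re.findall("\bcat\b", text)

# In a raw string, \b reaches the regex engine as a word boundary.
raw = re.findall(r"\bcat\b", text)

print(plain)  # []
print(raw)    # ['cat', 'cat']

# The normal string holds one character, the raw string holds two.
print(len("\b"), len(r"\b"))  # 1 2
```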
Use the re.DOTALL flag to make the dot . match newline characters as well. This is useful when you want to match entire paragraphs or multiline texts without missing line breaks.
The ^ anchor asserts the start of a string, and $ asserts the end. They're used to check if a string starts or ends with specific words—ideal for validating headers or footer content in documents.
Yes, regex can extract specific patterns like product IDs, prices, or dates from raw HTML. But for structured data, it’s better to combine regex with libraries like BeautifulSoup or Scrapy for cleaner results.
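For a small, predictable snippet, a capture group with findall() is often enough. The HTML below is a made-up example; markup from a real page is usually messier, which is where a proper parser helps.

```python
import re

# Hypothetical HTML fragment for illustration.
html = (
    '<li class="item">Pen <span class="price">Rs. 25</span></li>'
    '<li class="item">Notebook <span class="price">Rs. 60</span></li>'
)

# With one capture group, findall() returns only the captured text.
prices = re.findall(r'<span class="price">Rs\. (\d+)</span>', html)
print(prices)  # ['25', '60']
```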