String Replace in Python | Python String Replace [2021]

Replacing characters and strings in Python is a crucial task in data cleaning and text processing. Your data might have formatting issues with garbage characters that need to be removed, category names might be misspelled, and so on. In text preprocessing for NLP problems, string replacement is also one of the most basic and important steps in preparing the textual data.

In this tutorial, we will go over multiple ways to replace different types of strings. By the end of this tutorial, you will know how to use the following:

  • Python replace() method
  • Regex sub() method
  • join() and filter()
  • Replacing numeric data in strings

Python replace()

The replace(old_str, new_str, count) method takes up to 3 arguments:

  • old_str: The substring that needs to be replaced
  • new_str: The string that replaces each occurrence of old_str
  • count (optional): The maximum number of occurrences to replace

Let’s go over a few examples to see how it works.

Single replace

Mystr = "This is a sample string"
Newstr = Mystr.replace('is', 'was')
print(Newstr)

#Output:
Thwas was a sample string

Recall that strings in Python are immutable. So when we call replace(), it returns a new string object with the modified data and leaves the original string unchanged. Moreover, we didn’t specify the count argument in the above example. If it is not specified, replace() replaces all occurrences of the substring.
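The two points above (immutability and the optional count) can be checked with a short sketch. The variable names here are illustrative and are not part of the examples in this tutorial:

```python
# Illustrative names; not part of the original tutorial.
original = "This is a sample string"

# No count given: every occurrence of "is" is replaced.
replaced_all = original.replace("is", "was")

# Third argument 1: only the first occurrence is replaced.
replaced_once = original.replace("is", "was", 1)

print(original)       # This is a sample string (unchanged)
print(replaced_all)   # Thwas was a sample string
print(replaced_once)  # Thwas is a sample string
```

The count is passed positionally here because older Python versions do not accept it as a keyword argument for str.replace().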

Multiple replace

Mystr = "This is a sample string"
Newstr = Mystr.replace("s", "X")
print(Newstr)

#Output:
ThiX iX a Xample Xtring

Multiple replace first n occurrences

If you only want to replace the first N occurrences, pass N as the third argument:

Mystr = "This is a sample string"
Newstr = Mystr.replace("s", "X", 3)
print(Newstr)

#Output:
ThiX iX a Xample string

Multiple strings replace

In the above examples, we replaced one substring a different number of times. Now what if you want to replace several different substrings in the same string? We can write a small function for it and get it done using the same method.

Consider the same example as above, but now we want to replace "h", "is" and "ng" with "X".

def MultipleStrings(mainStr, strReplaceList, newStr):
    # Iterating over the strings to be replaced
    for elem in strReplaceList:
        # Checking if string is in the main string
        if elem in mainStr:
            # Replace the string
            mainStr = mainStr.replace(elem, newStr)
    return mainStr

Mystr = "This is a sample string"
Newstr = MultipleStrings(Mystr, ['h', 'is', 'ng'], "X")
print(Newstr)

#Output:
TXX X a sample striX
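The loop above applies the replacements one after another, so a later replacement also sees the text inserted by an earlier one. As a possible alternative, here is a minimal sketch that replaces all targets in a single pass with a regular expression. The helper name replace_any is hypothetical, and the sketch assumes the list of targets is not empty:

```python
import re

def replace_any(main_str, targets, new_str):
    """Replace every occurrence of any target in one pass (hypothetical helper)."""
    # re.escape keeps characters such as "." or "+" literal in the pattern.
    pattern = "|".join(re.escape(t) for t in targets)
    # A function replacement inserts new_str as-is, without backslash processing.
    return re.sub(pattern, lambda match: new_str, main_str)

result = replace_any("This is a sample string", ["h", "is", "ng"], "X")
print(result)  # TXX X a sample striX
```

For this input the result is the same as with the loop. The two approaches differ only when the replacement text itself contains one of the targets.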


Replacing with regex

Python’s re module is built for working with text through regular expressions (regex), be it finding substrings, replacing strings or anything else. It provides the sub() function to find and substitute substrings easily. Let’s go over its syntax and a few use cases.

The re.sub(pattern, replacement, original_string) function takes 3 required arguments (it also accepts the optional count and flags arguments):

  • pattern: the regular expression describing the text that needs to be matched and replaced.
  • replacement: can be a string which needs to be put in place, or a callable which receives the match object and returns the string that needs to be put in place.
  • original_string: the main string in which the matches have to be replaced.

Like the replace method, re.sub() returns a new string object with the modified text and leaves the original unchanged. Let’s go over a few working examples.
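None of the examples that follow use a callable as the replacement, so here is a minimal sketch of that case, together with the optional count argument of re.sub(). The function name shout is illustrative:

```python
import re

def shout(match):
    # match.group(0) is the text matched by the whole pattern.
    return match.group(0).upper()

text = "This is a sample string"

# Callable replacement: each match is passed to shout().
upper_result = re.sub(r"is", shout, text)
print(upper_result)  # ThIS IS a sample string

# count=1 limits the substitution to the first match.
first_only = re.sub(r"is", "was", text, count=1)
print(first_only)    # Thwas is a sample string
```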

Removing whitespace

Whitespace characters can be matched with a pattern and replaced like any other text. In the below example, we intend to replace whitespace with "X".

import re
Mystr = "This is a sample string"
# Replace all whitespaces in Mystr with 'X'
Newstr = re.sub(r"\s+", 'X', Mystr)
print(Newstr)

#Output:
ThisXisXaXsampleXstring


As we see, all the whitespaces were replaced. The pattern r"\s+" matches one or more consecutive whitespace characters, so each run of whitespace is replaced by a single "X".
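The difference between \s and \s+ only shows up when whitespace characters appear next to each other. A small sketch, using an illustrative input string that is not from the example above:

```python
import re

# Two spaces after "This", then a tab after "is".
text = "This  is\ta sample"

# \s replaces every whitespace character individually.
per_char = re.sub(r"\s", "X", text)
print(per_char)  # ThisXXisXaXsample

# \s+ replaces each run of whitespace with a single X.
per_run = re.sub(r"\s+", "X", text)
print(per_run)   # ThisXisXaXsample
```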

Removing all special characters

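To replace all the special characters, we pass a pattern which matches any punctuation character. To remove them instead, pass an empty string as the replacement.

import re
import string
Mystr = "Tempo@@&[(000)]%%$@@66isit$$#$%-+Str"
# Build a character class from all ASCII punctuation; re.escape keeps
# characters such as ']' and '\' literal inside the class
pattern = r'[' + re.escape(string.punctuation) + ']'
# Replace all special characters in a string with X
Newstr = re.sub(pattern, 'X', Mystr)
print(Newstr)

#Output:
TempoXXXXX000XXXXXXX66isitXXXXXXXStr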

Removing substring as case insensitive

In real-life data, the same word often appears in several versions that differ only in upper and lower case. Listing every variant separately in the pattern is not practical. Instead, the re.sub() function accepts the flag re.IGNORECASE to match regardless of case. Let’s see how it works.

import re
Mystr = "This IS a sample Istring"
# Replace substring in a string with a case-insensitive approach
Newstr = re.sub(r'is', '**', Mystr, flags=re.IGNORECASE)
print(Newstr)

#Output:
Th** ** a sample **tring

Removing multiple characters using regex

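The re.sub() function can remove several different characters at once: list them in a character class and replace every match with an empty string. Below is an example.

import re
Mystr = "This is a sample string"
pattern = r'[hsa]'
# Remove characters 'h', 's' and 'a' from a string
Newstr = re.sub(pattern, '', Mystr)
print(Newstr)

#Output (two spaces remain where the word "a" was removed):
Ti i  mple tring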

Replacing using join()

Another way to remove or replace characters is to iterate through the string and check each character against some condition.

charList = ['h', 's', 'a']
Mystr = "This is a sample string"
# Remove all characters in list, from the string
Newstr = "".join(elem for elem in Mystr if elem not in charList)
print(Newstr)

#Output:
Ti i  mple tring

Replacing using join() and filter()

The above example can also be written using the filter() function.

Mystr = "This is a sample string"
charList = ['h', 's', 'a']
# Remove all characters in list, from the string
Newstr = "".join(filter(lambda k: k not in charList, Mystr))
print(Newstr)

#Output:
Ti i  mple tring
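The regex, join() and filter() versions are meant to be interchangeable. A quick sketch to check that they agree, using illustrative variable names and a set instead of a list for the characters to remove:

```python
import re

text = "This is a sample string"
chars_to_remove = {"h", "s", "a"}

via_regex = re.sub(r"[hsa]", "", text)
via_join = "".join(ch for ch in text if ch not in chars_to_remove)
via_filter = "".join(filter(lambda ch: ch not in chars_to_remove, text))

print(via_regex == via_join == via_filter)  # True
print(repr(via_regex))  # 'Ti i  mple tring'
```

Printing with repr() makes the two consecutive spaces visible, which are left behind where the standalone "a" was removed.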


Replacing numbers

Strings often contain numerical data that needs to be removed, or extracted and processed separately as a different feature. Let’s go over a few examples to see how this can be implemented.

Using regex

Consider the below string from which we need to remove the numeric data.

import re
Mystr = "Sample string9211 of year 20xx"
pattern = r'[0-9]'
# Match all digits in the string and replace them by empty string
Newstr = re.sub(pattern, "", Mystr)
print(Newstr)

#Output:
Sample string of year xx

In the above code, the pattern r'[0-9]' matches any single digit, so every digit in the string is replaced.
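If the numbers should be kept as a separate feature rather than discarded, they can be collected before the string is cleaned. A minimal sketch, assuming re.findall() with the pattern [0-9]+ is an acceptable way to pull out each run of digits (the variable names are illustrative):

```python
import re

text = "Sample string9211 of year 20xx"

# Collect each run of consecutive digits as one string.
numbers = re.findall(r"[0-9]+", text)
print(numbers)  # ['9211', '20']

# Then remove the digits from the text itself.
cleaned = re.sub(r"[0-9]", "", text)
print(cleaned)  # Sample string of year xx
```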

Using join() function

We can also iterate over the string and filter out the digits using the isdigit() method, which returns True for digit characters and False for letters and other characters.

Mystr = "Sample string9211 of year 20xx"
# Iterates over the chars in the string and joins all characters except digits
Newstr = "".join(item for item in Mystr if not item.isdigit())
print(Newstr)

#Output:
Sample string of year xx

Using join() and filter()

Similarly, we can put the condition in the filter() function, which keeps only the characters for which the condition returns True.

Mystr = "Sample string9211 of year 20xx"
# Filter all the digits from characters in string & join remaining chars
Newstr = "".join(filter(lambda item: not item.isdigit(), Mystr))
print(Newstr)

#Output:
Sample string of year xx

Before you go

We covered a lot of examples showing different ways to remove or replace characters, whitespace and numbers in a string. We highly recommend that you try other ways of solving the above examples, as well as examples of your own.

