How to replace a sub-string in Python
Posted in Python by Dirk - last update: Dec 20, 2023
In Python, string replacement is done with the str.replace() method. For more complex cases, you can use regular expressions through re.sub(). A special case is string formatting: here you don't really replace a sub-string, but provide a placeholder that is filled in when the code runs. Here's an overview of all three methods:
Using str.replace()
The str.replace() method replaces occurrences of a sub-string with another sub-string in a given string.
Syntax:
new_string = original_string.replace(old_substring, new_substring, count)
original_string: The original string in which replacement will occur.
old_substring: The sub-string to be replaced.
new_substring: The sub-string that will replace the old sub-string.
count (optional): Specifies the maximum number of occurrences to replace. If not specified, all occurrences are replaced.
Examples:
Simple replacement:
sentence = "I like ice cream."
new_sentence = sentence.replace("ice cream", "chocolate")
print(new_sentence)
Output:
I like chocolate.
Replacement with count:
Here we specify the maximum number of times the sub-string is replaced. The phrase 'ice cream' appears 3 times in the sentence but is only replaced 2 times; the last occurrence remains unchanged.
sentence = "I like ice cream and ice cream is delicious. I really like ice cream."
new_sentence = sentence.replace("ice cream", "chocolate", 2)
print(new_sentence)
Output:
I like chocolate and chocolate is delicious. I really like ice cream.
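Two details of str.replace() are easy to miss: it returns a new string rather than changing the original, and matching is case-sensitive. The following is a minimal sketch of both points; the sentences used are made up for illustration.

```python
sentence = "Ice cream is great. I like ice cream."

# replace() returns a new string; the original is left untouched.
new_sentence = sentence.replace("ice cream", "chocolate")

print(sentence)      # still contains both mentions of ice cream
print(new_sentence)  # only the lowercase "ice cream" was replaced

# If the sub-string does not occur, the result equals the original text.
unchanged = sentence.replace("vanilla", "chocolate")
print(unchanged == sentence)
```

Because "Ice cream" at the start of the sentence begins with a capital letter, it does not match "ice cream" and is kept as-is.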
Using re.sub()
Sub-strings can also be replaced using regular expressions, which Python supports through the re module. The re.sub() function replaces every sub-string that matches a specified pattern with a replacement string. It is more flexible than str.replace() because it matches patterns rather than fixed text, which makes it useful for tasks like data cleaning, parsing, and transformation. However, complex patterns can be error-prone and difficult to maintain, so use them with care.
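To make the difference between the two approaches concrete, here is a small sketch that replaces the word "cat" in a made-up sentence. It assumes you only want whole words replaced, which a fixed-text replacement cannot express but the \b (word boundary) pattern can.

```python
import re

text = "The cat sat on the concatenated catalog."

# str.replace() matches the literal characters anywhere, even inside words.
plain = text.replace("cat", "dog")

# re.sub() with \b (word boundary) only matches "cat" as a whole word.
whole_word = re.sub(r"\bcat\b", "dog", text)

print(plain)       # The dog sat on the condogenated dogalog.
print(whole_word)  # The dog sat on the concatenated catalog.
```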
Syntax:
replaced_string = re.sub(pattern, replacement, input_string, count=0, flags=0)
pattern: The regular expression pattern to search for in the input string.
replacement: The string to replace the matched pattern with.
input_string: The original string in which replacement will occur.
count (optional): Specifies the maximum number of occurrences to replace. If not specified, all occurrences are replaced.
flags (optional): Flags that modify the behavior of the regular expression. Common flags include re.IGNORECASE, re.MULTILINE, etc.
Example:
import re
text = "The price of the item is $20.99, but it's on sale for $15.49."
# Replace dollar amounts with "[PRICE]"
updated_text = re.sub(r'\$\d+(\.\d{2})?', '[PRICE]', text)
print(updated_text)
# Output: "The price of the item is [PRICE], but it's on sale for [PRICE]."
In this example, the regular expression r'\$\d+(\.\d{2})?' is used to match dollar amounts in the form of $XX.XX or $XX. The \$\d+ part matches the dollar sign followed by one or more digits, and (\.\d{2})? is an optional group that matches a dot followed by exactly two digits. Each match is replaced with the string [PRICE].
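The optional count and flags parameters are not used in the example above. The sketch below shows one way they might be combined, using a made-up sentence. Both are passed as keyword arguments, which keeps the call readable and avoids mixing up their positions.

```python
import re

text = "Python is fun. python is popular. PYTHON is everywhere."

# flags=re.IGNORECASE matches every capitalisation of "python".
all_replaced = re.sub(r"python", "Ruby", text, flags=re.IGNORECASE)

# count=2 stops after the first two matches; the third is left alone.
first_two = re.sub(r"python", "Ruby", text, count=2, flags=re.IGNORECASE)

print(all_replaced)  # Ruby is fun. Ruby is popular. Ruby is everywhere.
print(first_two)     # Ruby is fun. Ruby is popular. PYTHON is everywhere.
```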
Using string formatting
String formatting allows you to create a new string by inserting values into a template string. This is not quite the same use case as before, where you replace a sub-string in an existing string. It works more like a template: you mark a position in the string and fill it with a variable that can take different values.
String formatting comes in three variants:
- Original method (oldest), the % operator:
new_string = "template %s" % value
Example:
name = "John"
greeting = "Hello, %s!" % name
print(greeting)
# Output: "Hello, John!"
- Newer method, str.format():
new_string = "template {}".format(value)
Example:
item = "book"
sentence = "I have a {}.".format(item)
print(sentence)
# Output: "I have a book."
- Latest method, f-strings (Python 3.6+):
new_string = f"template {value}"
Example:
subject = "Python"
sentence = f"I love {subject}."
print(sentence)
# Output: "I love Python."
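To illustrate the template idea, the sketch below reuses one template with different values and then builds the same greeting with each of the three methods. The names and message counts are invented for the example.

```python
# One template, filled in with different values each time.
template = "Hello, {}! You have {} new messages."
greetings = [template.format(name, n) for name, n in [("John", 3), ("Ana", 0)]]

for greeting in greetings:
    print(greeting)

# The same greeting written with each of the three methods.
name, count = "John", 3
old_style = "Hello, %s! You have %d new messages." % (name, count)
format_style = "Hello, {}! You have {} new messages.".format(name, count)
f_string = f"Hello, {name}! You have {count} new messages."

print(old_style == format_style == f_string)  # True
```

An f-string is evaluated immediately where it is written, so a template that is stored and filled in later, like the one above, is usually written with str.format().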