Sling Academy

Python: Replace unwanted words in a string with asterisks

Last updated: June 03, 2023

When developing web and other kinds of applications with Python, you may want (or need) to replace bad words with asterisks (*). Typical use cases include:

  • Censorship: In cases where you need to filter sensitive or inappropriate content, replacing unwanted words with asterisks can help mask or censor offensive language.
  • Data anonymization: When dealing with personally identifiable information (PII), replacing names or other identifying words with asterisks can help protect privacy.
  • Profanity filtering: In text analysis or content moderation systems, replacing offensive or inappropriate words with asterisks can be part of a profanity filtering mechanism.

This easy-to-understand, step-by-step tutorial will show you how to replace bad words in a string with asterisk characters by using regular expressions.

The steps

  1. Create a list of bad words that you want to hide from your text.
  2. Import the re module.
  3. Define a regular expression pattern to match each element in the list of undesired words.
  4. Use the re.sub() function to replace each unwanted word with asterisks, one asterisk per character of the matched word. Passing flags=re.IGNORECASE to re.sub() makes the pattern match the unwanted words regardless of their case.

Code example

import re

# Define a reusable function to replace unwanted words with asterisks
def replace_unwanted_words(text, unwanted_words):
    pattern = r'\b(?:' + '|'.join(map(re.escape, unwanted_words)) + r')\b'
    modified_text = re.sub(
        pattern, 
        lambda match: '*' * len(match.group()), 
        text, 
        flags=re.IGNORECASE)
    
    return modified_text


# Define a list of unwanted words
unwanted_words = ['stupid', 'horrible', 'ugly']

# Test the function
text1 = 'This is a stupid, horrible, and ugly text.'
print(replace_unwanted_words(text1, unwanted_words))

# Text with both upper and lower case letters
text2 = "This IS a STUPID, HORRIBLe, and UGLY text."
print(replace_unwanted_words(text2, unwanted_words))

Output:

This is a ******, ********, and **** text.
This IS a ******, ********, and **** text.
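
If the same word list is applied to many strings, one option is to compile the pattern once and reuse it. The sketch below is a variation on the function above, not part of the original example; the helper name make_censor is hypothetical, and the early return for an empty word list is an added assumption.

```python
import re


def make_censor(unwanted_words):
    """Return a function that masks the given words (hypothetical helper)."""
    if not unwanted_words:
        # Nothing to mask: return the text unchanged
        return lambda text: text

    # Build and compile the pattern once instead of on every call
    pattern = re.compile(
        r'\b(?:' + '|'.join(map(re.escape, unwanted_words)) + r')\b',
        flags=re.IGNORECASE,
    )

    def censor(text):
        # Replace each match with as many asterisks as it has characters
        return pattern.sub(lambda match: '*' * len(match.group()), text)

    return censor


censor = make_censor(['stupid', 'horrible', 'ugly'])
print(censor('This is a stupid, horrible, and ugly text.'))
print(censor('Nothing to hide here.'))
```

The behavior matches replace_unwanted_words; the only difference is that the pattern is built a single time when make_censor is called.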

Pattern explained

If you’re not familiar with regular expressions in Python, you are likely to find it hard to understand the pattern used in the code snippet above. Let me explain it.

The regular expression pattern is constructed dynamically based on the unwanted_words input:

r'\b(?:' + '|'.join(map(re.escape, unwanted_words)) + r')\b'

Its components are:

  • r'...': The r prefix denotes a raw string literal, so backslashes such as the one in \b reach the regex engine unchanged.
  • \b: A word boundary. One at each end of the pattern ensures that only whole words are matched.
  • (?: and ): These create a non-capturing group, so the word boundaries apply to every alternative inside it.
  • '|'.join(...): This joins the unwanted words with the | (OR) operator, so the group matches any one of them.
  • re.escape: This escapes special characters in the unwanted words so that they are treated literally.
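
To make these pieces concrete, the short sketch below prints the pattern built from a sample list and checks the whole-word and escaping behavior. The sample words and strings are illustrative assumptions, not taken from the example above.

```python
import re

unwanted_words = ['stupid', 'ugly']

# Build the pattern the same way as in the function above
pattern = r'\b(?:' + '|'.join(map(re.escape, unwanted_words)) + r')\b'
print(pattern)  # \b(?:stupid|ugly)\b

# \b limits matches to whole words: 'smugly' contains 'ugly' but is not matched
print(re.findall(pattern, 'An ugly, smugly written line', flags=re.IGNORECASE))
# ['ugly']

# re.escape makes regex metacharacters literal: the dot no longer matches any character
print(re.escape('a.b'))  # a\.b
print(re.fullmatch(re.escape('a.b'), 'axb'))  # None
```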

