When developing web applications, we often need to remove HTML tags from a string, such as:
- When working with content that will be used in an email or other message format where HTML tags are not allowed or not desired.
- When displaying user-generated content on a web page and we want to ensure that no HTML tags are included in the output.
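- When cleaning up user input from web forms, where we want to strip any HTML tags as one layer of defense against cross-site scripting (XSS) attacks. Stripping tags alone does not make input safe, so the output should still be escaped or sanitized before it is rendered.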
This article will show you two different approaches to removing HTML tags from a string in JavaScript.
Using regular expressions
You can use a regular expression that matches everything from an opening angle bracket (<) up to the next closing angle bracket (>), and replace every match with an empty string by calling the replace() method with the g (global) flag.
Example:
const stringWithTags =
"<p>Welcome to <strong>Sling Academy</strong>. Link to the homepage <a href='https://www.slingacademy.com'>here</a></p>";
const stringWithoutTags = stringWithTags.replace(/<[^>]*>/g, '');
console.log(stringWithoutTags);
Output:
Welcome to Sling Academy. Link to the homepage here
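The same pattern can be wrapped in a small reusable function. The sketch below is only an illustration: the name stripTags is hypothetical and not part of any library. It also shows one input where this simple pattern falls short, because a > character inside an attribute value ends the match too early.

```javascript
// Hypothetical helper (the name stripTags is an assumption, not a built-in).
function stripTags(input) {
  // Coerce to a string so null, undefined, or numbers do not throw.
  return String(input ?? '').replace(/<[^>]*>/g, '');
}

const simple = stripTags('<p>Hello <em>world</em></p>');
console.log(simple); // Hello world

// Limitation: the ">" inside the title attribute closes the match early,
// so the rest of the opening tag leaks into the result.
const tricky = stripTags('<a title="a > b">link</a>');
console.log(tricky); // prints: b">link (with a leading space)
```

For markup we control, this is usually enough. For arbitrary or user-supplied HTML, inputs like the second one are a reason to prefer a real HTML parser.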
Using a DOM parser
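In the browser, we can create a temporary DOM element and assign the string to its innerHTML property, which makes the browser parse the markup. We then read the element's text content, which contains the text without any HTML tags. This approach needs the document object, so it does not work in plain Node.js. Use it only with trusted content: assigning untrusted input to innerHTML can still run event-handler attributes, such as onerror on an <img> tag.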
Example:
const stringWithTags =
"<p>Welcome to <strong>Sling Academy</strong>. Link to the homepage <a href='https://www.slingacademy.com'>here</a></p>";
const tempElement = document.createElement('div');
tempElement.innerHTML = stringWithTags;
const stringWithoutTags =
tempElement.textContent || tempElement.innerText || '';
console.log(stringWithoutTags);
Output:
Welcome to Sling Academy. Link to the homepage here
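For this input, this approach gives the same result as the previous one, but it’s harder to make mistakes because we don’t have to write a regular expression pattern. The two approaches can differ on other inputs: the DOM approach also decodes HTML entities (for example, &amp; becomes &), while the regular expression leaves them unchanged.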
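A related option is the browser's built-in DOMParser API, which parses the string into a separate document instead of a live element. Scripts in a document created this way are not executed. The sketch below is an illustration under a few assumptions: the function name stripTagsWithParser is hypothetical, and the regular-expression fallback for environments without DOMParser (such as plain Node.js) is our own addition.

```javascript
// Hypothetical helper (the name stripTagsWithParser is an assumption).
function stripTagsWithParser(html) {
  const source = String(html ?? '');

  // DOMParser is available in browsers but not in plain Node.js.
  if (typeof DOMParser === 'undefined') {
    // Fallback for non-browser environments: the same regex as before.
    // Unlike the DOM path, this does not decode HTML entities.
    return source.replace(/<[^>]*>/g, '');
  }

  // Parse into a standalone document rather than a live page element.
  const doc = new DOMParser().parseFromString(source, 'text/html');
  return doc.body.textContent || '';
}

console.log(
  stripTagsWithParser('<p>Welcome to <strong>Sling Academy</strong></p>')
);
// Welcome to Sling Academy
```

Because the fallback does not decode entities, the two code paths can return different strings for input such as &amp;. If that matters, pick one path for your environment rather than mixing them.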