When developing web applications, we often need to remove HTML tags from a string, for example:
- When working with content that will be used in an email or other message format where HTML tags are not allowed or not desired.
- When displaying user-generated content on a web page and we want to ensure that no HTML tags are included in the output.
- When cleaning up user input in web forms, where we want to remove any HTML tags included in the input to reduce the risk of cross-site scripting (XSS) attacks.
This article shows two different approaches to removing HTML tags from a string in JavaScript.
Using regular expressions
You can use a regular expression that matches any text between angle brackets (<>) and replaces it with an empty string by using the replace() method.
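To make the pattern concrete, here is a minimal sketch of how `/<[^>]*>/g` behaves on a few inputs. The helper name `stripTagsWithRegex` and the sample strings are assumptions made for illustration; they are not part of any library.

```javascript
// Hypothetical helper wrapping the pattern described above.
function stripTagsWithRegex(input) {
  // "<", then any run of characters other than ">", then ">"
  return input.replace(/<[^>]*>/g, '');
}

// Plain tags are removed and the text between them is kept.
const simple = stripTagsWithRegex('<em>Hello</em> <b>world</b>');

// Character entities are not tags, so they stay encoded.
const withEntity = stripTagsWithRegex('<p>Fish &amp; chips</p>');

// A ">" inside an attribute value ends the match early,
// so part of the attribute leaks into the result.
const withQuotedBracket = stripTagsWithRegex('<a title="a > b">link</a>');

console.log(simple);            // Hello world
console.log(withEntity);        // Fish &amp; chips
console.log(withQuotedBracket); // b">link (preceded by one space)
```

The last two cases suggest that the pattern works best on simple, well-formed markup where attribute values contain no `>` and entities do not need decoding.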
Example:
const stringWithTags =
"<p>Welcome to <strong>Sling Academy</strong>. Link to the homepage <a href='https://www.slingacademy.com'>here</a></p>";
const stringWithoutTags = stringWithTags.replace(/<[^>]*>/g, '');
console.log(stringWithoutTags);
Output:
Welcome to Sling Academy. Link to the homepage here
Using a DOM parser
We can create a temporary DOM element with the string set as its innerHTML property. The text content of the element is then extracted, which removes all HTML tags from the string.
Example:
const stringWithTags =
"<p>Welcome to <strong>Sling Academy</strong>. Link to the homepage <a href='https://www.slingacademy.com'>here</a></p>";
const tempElement = document.createElement('div');
tempElement.innerHTML = stringWithTags;
const stringWithoutTags =
tempElement.textContent || tempElement.innerText || '';
console.log(stringWithoutTags);
Output:
Welcome to Sling Academy. Link to the homepage here
For this input, this approach gives the same result as the previous one, but it’s harder to make mistakes because we don’t have to write a regular expression pattern.
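A closely related variant uses the browser's built-in `DOMParser`, which parses the string into a separate document instead of assigning it to an element created by the live page. The sketch below assumes a browser environment; the helper name `stripTagsWithDomParser` and the sample string are hypothetical.

```javascript
// Hypothetical helper; relies on the browser's built-in DOMParser.
function stripTagsWithDomParser(html) {
  // Parse the string into a separate HTML document.
  const doc = new DOMParser().parseFromString(html, 'text/html');
  // The textContent of <body> is the text with all markup removed.
  return doc.body.textContent || '';
}

// DOMParser is a browser API, so only run the demo where it exists.
if (typeof DOMParser !== 'undefined') {
  console.log(stripTagsWithDomParser('<p>Fish &amp; chips, <b>please</b></p>'));
  // Fish & chips, please
}
```

Because the browser's HTML parser does the work, character entities such as `&amp;` are decoded in the result, which the regular expression approach does not do.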