August 26, 2021

How to Remove HTML Tags from a String in JavaScript

By Krishnaa, in JavaScript Questions

In this tutorial, you will learn how to remove HTML tags from a string in JavaScript. I am assuming your goal is to remove all HTML tags present in a string and get back only the plain text.

Removing HTML tags from a string is harder than it looks, because the string can contain multiple nested tags, attributes, and entities. JavaScript also has no built-in string method that strips tags for you.

There are plenty of solutions available online for this problem. Most of them rely on regular expressions, and if you look closely, the pattern differs from one solution to the next, because no single regular expression handles every piece of valid HTML. That is why a regular expression is not an ideal approach.
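
To make the limitation concrete, here is a minimal sketch of a typical regex-based helper. The function name stripTagsWithRegex and the sample strings are hypothetical and chosen only for illustration. The pattern copes with simple markup, but it breaks as soon as an attribute value contains a > character, and it leaves HTML entities undecoded.

```javascript
// Hypothetical helper: a naive regex that deletes anything shaped like <...>.
function stripTagsWithRegex(html) {
  return html.replace(/<[^>]*>/g, '');
}

// Simple markup works as expected.
const simple = stripTagsWithRegex('<p>Hello <b>world</b></p>');
console.log(simple); // Hello world

// A ">" inside an attribute value ends the match too early,
// so part of the attribute leaks into the result.
const broken = stripTagsWithRegex('<a title="a > b">link</a>');
console.log(JSON.stringify(broken)); // " b\">link"

// Entities are not decoded, because the regex knows nothing about HTML.
const entity = stripTagsWithRegex('<p>Tom &amp; Jerry</p>');
console.log(entity); // Tom &amp; Jerry
```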

I believe the best approach in this scenario is to use the DOMParser() constructor, because a DOMParser can parse HTML or XML from a string. The constructor returns a DOMParser object, and this object has a parseFromString() method. This method takes 2 parameters: the string to parse and a MIME type.

For an HTML string parsed with the text/html MIME type, parseFromString() returns an HTML document. Because it is a regular document, we can read or manipulate its content with standard DOM properties and methods.
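
The approach described above can be sketched as a small reusable function. The name htmlToText is hypothetical, and the sketch assumes a browser environment where DOMParser is available as a global (it is not defined in plain Node.js).

```javascript
// Hypothetical helper; assumes a browser, where DOMParser is a global.
function htmlToText(html) {
  // Create a parser and parse the string as an HTML document.
  const parser = new DOMParser();
  const doc = parser.parseFromString(html, 'text/html');

  // textContent of <body> is the concatenated text of its descendants,
  // with the tags themselves left out.
  return doc.body.textContent;
}

// Usage (in a browser console):
// htmlToText('<p>Tom &amp; <em>Jerry</em></p>'); // "Tom & Jerry"
```

Because the parsing is done by the browser's own HTML parser, nested tags and entities such as &amp; are treated the same way they would be in a real page.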

In the following example, we have one global HTML string. I simply want to remove all HTML tags from it, grab the text content, and display that text in a paragraph element. Please have a look at the code example and the steps given below.

HTML & CSS

We have 4 elements in the HTML file (a div, a button, and two p elements). The div element is just a wrapper for the rest of the elements.
The inner text for the button element is “Remove Tags”.
Both paragraph elements are empty because we will populate them with JavaScript. We have given them the unique ids p1 and p2.
We have done some basic styling using CSS and added the link to our style.css stylesheet inside the head element.
We have also included our JavaScript file script.js with a script tag at the bottom.

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <link rel="stylesheet" href="style.css">
    <title>Document</title>
</head>
<body>
    
    <div>        
        <button>Remove Tags</button>
        <p id="p1"></p>
        <p id="p2"></p>
    </div>
    
    <script src="script.js"></script>
</body>
</html>

div {
  text-align: center;
}

button {
  padding: 10px 20px;
}

p {
  margin-top: 20px;
}

JavaScript

We have selected the button element using the document.querySelector() method and stored it in the btnRemove variable.
In the same way, we have selected both paragraph elements by their ids and stored them in the p1 and p2 variables.
We have one global variable, htmlString, which contains our HTML string.
We have attached a DOMContentLoaded event listener to the window. In it, we set the innerHTML property of the first paragraph to htmlString so that we can see how the string looks when the browser renders its tags.
We have attached a click event listener to the button element.
In the event handler function, we call the DOMParser constructor and store the resulting DOMParser object in the parser variable.
As mentioned above, this object has a parseFromString() method. We pass it 2 parameters: htmlString and the MIME type, which in our case is text/html. We store the document returned by this method in the doc variable.
Finally, we extract the text content from the document’s body and display it in the second paragraph. We use the logical OR (||) operator as a fallback: if the textContent property of the body element is an empty string, we display “No Content” instead.

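// Grab the button and the two output paragraphs.
const btnRemove = document.querySelector('button');
const p1 = document.querySelector('#p1');
const p2 = document.querySelector('#p2');

// The HTML string whose tags we want to remove.
const htmlString = `<p>Please <u>remove</u> all <strong>HTML</strong> tags <em>from this</em> string.</p>`;

// Once the DOM is ready, render the string as HTML in the first paragraph.
window.addEventListener('DOMContentLoaded', () => {
   p1.innerHTML = htmlString;
});

// On click, parse the string and show only its text in the second paragraph.
btnRemove.addEventListener('click', () => {
   const parser = new DOMParser();
   const doc = parser.parseFromString(htmlString, "text/html");
   p2.textContent = doc.body.textContent || "No Content";
});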
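
The fallback in the click handler relies on an empty string being falsy in JavaScript. The short sketch below isolates that behavior; the function name textOrFallback is hypothetical. Note that a string containing only whitespace is still truthy, so it would be displayed as is rather than replaced.

```javascript
// Hypothetical helper mirroring the `textContent || "No Content"` expression.
function textOrFallback(text) {
  return text || 'No Content';
}

// Non-empty text is returned unchanged.
console.log(textOrFallback('Please remove all HTML tags from this string.'));

// An empty string is falsy, so the fallback is used.
console.log(textOrFallback('')); // No Content

// Whitespace is not empty, so no fallback is applied here.
console.log(JSON.stringify(textOrFallback('   '))); // "   "
```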
