Convert Data From HTML To Plain Text In Power BI

Introduction

 
What should we do if we get an HTML tag in our data and we need only plain text? Confused? Don’t worry here is the solution. Today I am writing this article to solve this issue.
 
In Power BI we are able to transform our data using Power query editor. Below is the list of steps that you need to follow to remove HTML tags and get plain data.
 
Step 1
 
I have Excel data in the ‘InternetData’ table and column two i.e.’ Web’ has data in HTML form. Our task is getting the plain text of this column. See the below image.
 
Convert Data From HTML To Plain Text In Power BI
 
Step 2
 
Now I am going to import this Excel data into Power BI. To do so, click on ‘Get data’ on the home tab and select ‘Excel’ as the source. Provide the file path, select the file and click ‘Ok’.
 
Convert Data From HTML To Plain Text In Power BI
 
A new Navigator windows will be opened. Select the check box of the table and then click on ‘Load’.
 
Convert Data From HTML To Plain Text In Power BI
 
The table has been loaded successfully and is showing on the right side of the BI tool. You can view the data of this table using table visualization. You can also change the format of the table visualization to show your data interactive.
 
Convert Data From HTML To Plain Text In Power BI
 
Step 3
 
Now our task is to remove HTML tags from ‘Web’ column data. So we need to open the Power Query editor to do this task. Click on the ‘Transform data’ available on the ‘Home’ Tab.
 
Convert Data From HTML To Plain Text In Power BI
 
Convert Data From HTML To Plain Text In Power BI
 
Step 4
 
Now add a custom column in this table. Go to the ‘Add Column’ table and click on ‘Custom Column’. A new ‘Custom Column’ window will be opened.
 
Convert Data From HTML To Plain Text In Power BI
 
Provide the custom column name in the ‘New Column Name’ box and also provide the DAX formula for removing HTML tags. The formula is given below.
  1. = Html.Table([ColumnName] , {{"text",":root"}})  
Put this formula in the ‘Custom Column Formula’ box and then click on ‘OK’. A custom column has been added now.
 
Convert Data From HTML To Plain Text In Power BI
 
Convert Data From HTML To Plain Text In Power BI
 
Click on the ‘Plain_Web_Text’ extend button (highlighted in above image).
 
Convert Data From HTML To Plain Text In Power BI
 
Convert Data From HTML To Plain Text In Power BI
 
To save changes, go to the file tab and click on ‘Close & Apply’.
 
Convert Data From HTML To Plain Text In Power BI
 
A new column has been added and showing in Power BI.
 
Convert Data From HTML To Plain Text In Power BI
 

Summary

 
These are the simple steps to remove HTML tags from the data. I hope you enjoy the article and that it helps you to solve your problem. Follow these steps and get plain text from the HTML tag. Thanks for reading, and have a nice day.


Similar Articles