Data Scraping Using UiPath

Introduction to Data Scraping

Data scraping is the process of extracting data from a web browser, document, or application and storing it in a medium such as a database, CSV file, or spreadsheet. For this to work, the data must be organized in a structured pattern, as in the search results of an e-commerce website or a job portal. For example, the name, description, and price of each product can be extracted and stored in an Excel workbook, which can later be used for market surveys and analysis. Similarly, a job search for a certain title can yield job titles and company names that can be used to generate leads for your business.
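
To make the idea of a "structured pattern" concrete, here is a minimal Python sketch, separate from UiPath, that pulls the name and price out of a repeated HTML structure using only the standard library. The markup, the class names, and the product values are all made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical listing markup: every product repeats the same structure.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Desk Lamp</span><span class="price">19.99</span></li>
  <li class="product"><span class="name">Office Chair</span><span class="price">89.50</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collects the text of <span class="name"> and <span class="price">."""

    def __init__(self):
        super().__init__()
        self.rows = []       # one dict per product
        self._field = None   # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "product":
            self.rows.append({})          # start a new record
        elif tag == "span" and css_class in ("name", "price"):
            self._field = css_class       # remember which column the text belongs to

    def handle_data(self, data):
        if self._field and self.rows:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


parser = ProductParser()
parser.feed(SAMPLE_HTML)
rows = parser.rows
print(rows)
```

Because every product uses the same tags and classes, one small parser can turn the whole list into rows. Unstructured pages offer no such repetition, which is why scraping depends on a regular layout.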

Data Scraping Steps

This article walks through scraping data from a job portal. The steps are as follows:

Step 1

Open a new process with a meaningful name in UiPath Studio

Step 2

Open a browser and navigate to indeed.com

Step 3

Search for any job title you prefer

N.B. You can perform the two steps above manually or automate them using the Open Browser activity

Step 4

Select Data Scraping from the Design ribbon

Step 5

Click Next

Step 6

Indicate the job title of the first job post

Step 7

As soon as you indicate the first job title, a pop-up appears asking whether to extract the whole table.

If the web page contains a single table, selecting Yes would work, as in the example image.

[Image: a web page containing a single table]

Our job portal, however, lists around 10 job posts on each page. In this scenario, select No and move to the next step.

Step 8

Click Next and indicate a second job title, such as that of the second or the last job post, to create the pattern for the bot to extract
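
One way to picture why two examples are enough to create a pattern is the Python sketch below. It is a simplified illustration and not UiPath's actual algorithm: it compares the paths of two indicated elements and replaces the index that differs with a wildcard. The element paths are hypothetical.

```python
def generalize(path_a, path_b):
    """Return a pattern matching both paths, with '[*]' where their indices differ."""
    steps_a = path_a.split("/")
    steps_b = path_b.split("/")
    if len(steps_a) != len(steps_b):
        raise ValueError("paths must have the same depth")

    pattern = []
    for a, b in zip(steps_a, steps_b):
        if a == b:
            pattern.append(a)            # identical step: keep it as-is
            continue
        tag_a, tag_b = a.split("[")[0], b.split("[")[0]
        if tag_a != tag_b:
            raise ValueError("paths differ in more than an index")
        pattern.append(tag_a + "[*]")    # same tag, different index: wildcard it
    return "/".join(pattern)


# Hypothetical paths of the first and second job titles on the results page.
first = "html/body/div[1]/ul/li[1]/h2/a"
second = "html/body/div[1]/ul/li[2]/h2/a"

pattern = generalize(first, second)
print(pattern)  # html/body/div[1]/ul/li[*]/h2/a
```

The wildcard position is where the list repeats, so the same pattern also covers the third, fourth, and every later job post on the page.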

Step 9

Configure the column name, in this case “Job Title”. We can also extract the URL by checking the Extract URL box and naming its column.

Step 10

The extracted data looks like this:

[Image: preview of the extracted job titles]

To extract more data, select “Extract Correlated Data”. Let’s try extracting the company name.

Step 11

Similarly, indicate the company name of the first job post, then indicate the company name of the second job post to create the pattern

Step 12

The extracted data will look like this. To complete the extraction, click Finish.

[Image: preview of the extracted job titles and company names]
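
As a rough illustration of what correlated data means, the Python sketch below combines two separately extracted columns into one row per job post. The titles and company names are made-up sample values, and pairing by position is a simplification of how a scraper ties fields to the same post.

```python
from itertools import zip_longest

# Hypothetical values, one list per extracted column.
job_titles = ["Data Analyst", "RPA Developer", "QA Engineer"]
companies = ["Acme Corp", "Globex", "Initech"]

# Pair the columns by position; a missing value becomes an empty string
# instead of silently dropping the row.
rows = [
    {"Job Title": title, "Company": company}
    for title, company in zip_longest(job_titles, companies, fillvalue="")
]

for row in rows:
    print(row)
```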

Step 13

If the results span multiple pages, indicate the Next button so that data is extracted from all the pages
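
The effect of indicating the Next button can be sketched in Python as a loop that keeps following the next link until there is none. The pages below are a hypothetical in-memory stand-in for a real site, and the `max_pages` limit is an assumed safeguard rather than a documented UiPath setting.

```python
# Hypothetical in-memory "site": each page lists job titles and names the next page.
PAGES = {
    "page1": {"titles": ["Data Analyst", "RPA Developer"], "next": "page2"},
    "page2": {"titles": ["QA Engineer"], "next": "page3"},
    "page3": {"titles": ["Business Analyst"], "next": None},
}


def scrape_all(start, max_pages=100):
    """Collect titles from `start` onward, following 'next' links until none remain."""
    results = []
    current = start
    visited = 0
    while current is not None and visited < max_pages:
        page = PAGES[current]
        results.extend(page["titles"])   # extract the rows on this page
        current = page["next"]           # the equivalent of clicking Next
        visited += 1
    return results


all_titles = scrape_all("page1")
print(all_titles)
```

The loop stops either when a page has no next link or when the page limit is reached, which guards against sites whose pagination never ends.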

We can now store the extracted data in an Excel workbook. By default, the extracted DataTable is saved in the variable ExtractDataTable. Our workflow will look something like this:

[Image: the completed workflow]
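
Outside UiPath, a comparable storage step can be sketched in Python by writing the rows to CSV, a format Excel can open. The rows and the `to_csv` helper are hypothetical; this is not how UiPath writes workbooks.

```python
import csv
import io

# Hypothetical rows standing in for the contents of ExtractDataTable.
rows = [
    {"Job Title": "Data Analyst", "Company": "Acme Corp"},
    {"Job Title": "RPA Developer", "Company": "Globex"},
]


def to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(rows, ["Job Title", "Company"])
print(csv_text)
```

To save the result to disk, the same writer can target a file opened with `open("jobs.csv", "w", newline="", encoding="utf-8")`, where the file name is only an example.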

Conclusion

In this article, we learned how to extract data from a web page. Try it yourself and extract data from any medium.

