Web Scraping Using Scrapy

What is Web Scraping?

 
Web scraping is the automated extraction of large amounts of data from websites. The extracted data can then be saved to a local file on your computer or to a database table.
 
We can then use this data for analysis. For example, we can scrape the prices of products from e-commerce websites and analyze them.
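
As a rough illustration of the "save, then analyze" idea, the sketch below takes a few scraped price records, writes them out as CSV, and computes simple statistics. The product names, the prices, and the helper names parse_price and save_as_csv are all made up for this example; a real scraper would supply the rows.

```python
import csv
import io
import statistics

# Hypothetical scraped records; in practice these would come from a scraper.
scraped_rows = [
    {"product": "Keyboard", "price": "$49.99"},
    {"product": "Mouse", "price": "$19.50"},
    {"product": "Monitor", "price": "$1,199.00"},
]


def parse_price(text):
    """Convert a price string such as '$1,199.00' to a float."""
    return float(text.replace("$", "").replace(",", "").strip())


def save_as_csv(rows, file_obj):
    """Write the scraped rows to an open text file in CSV format."""
    writer = csv.DictWriter(file_obj, fieldnames=["product", "price"])
    writer.writeheader()
    writer.writerows(rows)


# Analyze: average price and cheapest product.
prices = [parse_price(row["price"]) for row in scraped_rows]
average_price = statistics.mean(prices)
cheapest = min(scraped_rows, key=lambda row: parse_price(row["price"]))

# Save: an in-memory buffer is used here so the sketch leaves no files behind.
# A file opened with open("prices.csv", "w", newline="") works the same way.
buffer = io.StringIO()
save_as_csv(scraped_rows, buffer)

print(f"Average price: {average_price:.2f}")
print(f"Cheapest product: {cheapest['product']}")
```

Prices are kept as text in the CSV and converted to numbers only for the analysis, so the saved file still matches what was shown on the page.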
 
Why Web Scraping? 
 
Data displayed on a website can normally only be viewed in the browser; there is usually no built-in way to save it. To keep the information, we would have to copy and paste it by hand, page by page, which is tedious and error-prone. Instead, we can use a scraper to collect the same information in a matter of minutes.
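
To make this concrete, here is a minimal sketch of what a scraper automates: pulling specific pieces of text out of a page's HTML instead of copying them by hand. It uses Python's standard html.parser module rather than Scrapy, and the HTML fragment, the "product" class name, and the ProductParser class are assumptions made up for the example.

```python
from html.parser import HTMLParser

# Hypothetical page fragment standing in for HTML downloaded from a website.
SAMPLE_HTML = """
<ul>
  <li class="product">Keyboard</li>
  <li class="product">Mouse</li>
  <li class="note">Free shipping</li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collect the text of every <li class="product"> element."""

    def __init__(self):
        super().__init__()
        self.products = []
        self._inside_product = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "li" and ("class", "product") in attrs:
            self._inside_product = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._inside_product = False

    def handle_data(self, data):
        # Keep only non-empty text found inside a product element.
        if self._inside_product and data.strip():
            self.products.append(data.strip())


parser = ProductParser()
parser.feed(SAMPLE_HTML)
print(parser.products)
```

A framework such as Scrapy handles the downloading and offers more convenient selectors, but the underlying idea is the same.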
 
Scrapy Framework 
 
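Scrapy is a web scraping framework written in Python. It can be used for purposes such as data mining, monitoring, and automated testing. Scrapy is open source and, at the time of writing, available for Python 2.7 and Python 3.4 and above (newer Scrapy releases support Python 3 only, so check the installation guide for the current requirements).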
 
Here, we will see how easily we can scrape websites using the Scrapy framework.
 
Steps 
  1. Requirement
     
    Install Python 2.7 or Python 3.4 or above. Python can be downloaded from the official Python website (python.org).
     
  2. Install Scrapy
     
    Open your command prompt or terminal and type:
     
    pip install scrapy
     
  3. Scrapy Shell
     
    Scrapy includes an interactive shell, the Scrapy shell, which can be used to test and debug your scraping code, and you can also scrape URLs directly from it. Once you have successfully installed Scrapy, start the shell from your command prompt or terminal:
     
    scrapy shell 

  4. Fetch
     
    Once the Scrapy shell has started, we can begin scraping. The fetch function sends a request to the given URL and stores the downloaded page in the response object. For this example, I am going to use my friend's website, "ugentertainment.in".
     
    fetch("http://ugentertainment.in/")
     
  5. View
     
    The view function opens the downloaded response in your default browser.
     
    view(response)
     
    The scraped copy of the website opens in your default browser, so you can compare it with the original website.
[Screenshot: scraped website]
 
[Screenshot: original website]
 
You have now scraped your first website using Scrapy.