HTML Parser In Xamarin.Android Using jsoup

Introduction

In this article, we will learn how to parse an HTML page using jsoup in Xamarin.Android, just as we would use jsoup in native Java Android development.

jsoup

jsoup is a Java library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods.

jsoup for Xamarin.Android

In this article, we will use the Xamarin.Android binding (port) of the Java jsoup library and see how to use it in an application. You can get the binding DLL of jsoup from GitHub or install its package from NuGet.

Coding Part

I have split this article into 3 steps.

Step 1

Creating a new Xamarin.Android project.

Step 2

Setting up the plug-in for Xamarin.Android application.

Step 3

Implementing jsoup in Xamarin.Android application.

Step 1 - Creating a new Xamarin.Android project

Create a new project by selecting New >> Project, then select Android App and click OK.

Step 2 - Setting up the plug-in for Xamarin.Android application

In this step, we will add the jsoup plug-in to the Xamarin.Android project. Open the NuGet Package Manager for the project and search for jsoup. Select the jsoup package from the list and click "Install" to add the library, or paste the following into the Package Manager Console to install the NuGet package.

Install-Package Jsoup -Version 1.0.0

Step 3 - Implementing jsoup in Xamarin.Android application

In this part, we will see how to implement Jsoup to parse an HTML page or link.

  • Open your MainActivity.cs and paste the following code.
        Document doc = Jsoup.Connect("https://androidmads.blogspot.in/").Get();
        Element link = doc.Select("img").First();
        string src = link.AbsUrl("src");
  • Here, a Document represents the whole web page as a single object.
  • A Document consists of Elements and TextNodes.
  • Document extends Element, which extends Node; TextNode also extends Node.
  • In the above example, the page at https://androidmads.blogspot.in is retrieved and assigned to a Document.
  • Then the image elements (<img />) are selected from the document, and the first "img" tag is taken using Select("img").First(). On its own, doc.Select("img") returns the list of matching elements (an Elements object).
  • Then the source of the image element is retrieved as an absolute URL using link.AbsUrl("src").
  • jsoup connections perform network I/O, so they need to run on a separate thread rather than the UI thread. So, I make the call with AsyncTask. You can find the full code below.
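
To see what Select, First and AbsUrl do without touching the network, the same calls can be tried on an HTML string. The following is a minimal sketch, assuming the binding exposes jsoup's Jsoup.parse(String, String) overload as Jsoup.Parse and uses the Org.Jsoup namespaces; the JsoupParseSketch class and FirstImageUrl helper are hypothetical names introduced only for this illustration.

```csharp
// Assumption: namespaces follow the Java packages org.jsoup and org.jsoup.nodes.
using Org.Jsoup;
using Org.Jsoup.Nodes;

public static class JsoupParseSketch
{
    // Hypothetical helper: returns the absolute URL of the first <img> that
    // has a src attribute, or null when the markup contains no such element.
    public static string FirstImageUrl(string html, string baseUri)
    {
        // Parse the markup in memory; baseUri is what relative links resolve against.
        Document doc = Jsoup.Parse(html, baseUri);

        // "img[src]" matches only <img> elements that carry a src attribute.
        Element img = doc.Select("img[src]").First();

        // First() returns null when nothing matched, so guard before using it.
        return img != null ? img.AbsUrl("src") : null;
    }
}
```

Called with the markup "<img src=\"/logo.png\">" and the base URI "https://example.com/", this helper should resolve the relative path against the base URI, which is the difference between AbsUrl("src") and reading the raw attribute with Attr("src").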

Full code of the MainActivity.cs

The following code shows how to implement jsoup in Xamarin.Android with AsyncTask.

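    using System;
    using Android.App;
    using Android.OS;
    using Android.Util;
    using Android.Widget;
    // Assumption: the binding's namespaces follow the Java packages org.jsoup and org.jsoup.nodes.
    using Org.Jsoup;
    using Org.Jsoup.Nodes;

    // Assumption: replace with your project's default namespace so that Resource resolves.
    namespace JsoupSample
    {
        [Activity(Label = "JsoupSample", MainLauncher = true)]
        public class MainActivity : Activity
        {
            TextView textView;

            protected override void OnCreate(Bundle savedInstanceState)
            {
                try
                {
                    base.OnCreate(savedInstanceState);

                    // Set our view from the "main" layout resource
                    SetContentView(Resource.Layout.Main);

                    textView = FindViewById<TextView>(Resource.Id.HtmlTextView);

                    // Start the network call off the UI thread
                    new JsoupServerCall(this).Execute();
                }
                catch (Exception ex)
                {
                    // Log the failure instead of silently swallowing it
                    Log.Error("JsoupSample", ex.ToString());
                }
            }

            private class JsoupServerCall : AsyncTask
            {
                MainActivity activity;

                public JsoupServerCall(MainActivity activity)
                {
                    this.activity = activity;
                }

                // Runs on a background thread
                protected override Java.Lang.Object DoInBackground(params Java.Lang.Object[] @params)
                {
                    try
                    {
                        Document doc = Jsoup.Connect("https://androidmads.blogspot.in/").Get();
                        Element link = doc.Select("img").First();

                        // First() returns null when the page has no <img> element
                        return link != null ? link.AbsUrl("src") : "";
                    }
                    catch (Java.IO.IOException ex)
                    {
                        // Network or HTTP failure: show the message instead of crashing
                        return "Error: " + ex.Message;
                    }
                }

                // Runs on the UI thread once DoInBackground has finished
                protected override void OnPostExecute(Java.Lang.Object result)
                {
                    base.OnPostExecute(result);
                    activity.textView.Text = result + "";
                }
            }
        }
    }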
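
AsyncTask has been deprecated since Android API level 30, so on newer targets the same call can be made with the .NET Task API instead. The following is a sketch of that alternative, not the approach used in this article's sample; it reuses the Resource.Layout.Main and Resource.Id.HtmlTextView names from the code above, and the MainActivityAsync class name and the Org.Jsoup namespaces are assumptions.

```csharp
using System;
using System.Threading.Tasks;
using Android.App;
using Android.OS;
using Android.Widget;
// Assumption: namespaces follow the Java packages org.jsoup and org.jsoup.nodes.
using Org.Jsoup;
using Org.Jsoup.Nodes;

// Hypothetical activity name, used only for this sketch.
[Activity(Label = "JsoupSample")]
public class MainActivityAsync : Activity
{
    protected override async void OnCreate(Bundle savedInstanceState)
    {
        base.OnCreate(savedInstanceState);
        SetContentView(Resource.Layout.Main);
        TextView textView = FindViewById<TextView>(Resource.Id.HtmlTextView);

        try
        {
            // Task.Run moves the blocking jsoup call onto a thread-pool thread.
            string src = await Task.Run(() =>
            {
                Document doc = Jsoup.Connect("https://androidmads.blogspot.in/").Get();
                Element link = doc.Select("img").First();
                return link != null ? link.AbsUrl("src") : "";
            });

            // Execution resumes on the UI thread after the await,
            // so the view can be updated directly here.
            textView.Text = src;
        }
        catch (Exception ex)
        {
            // Network and parsing failures surface here when the task is awaited.
            textView.Text = "Could not load page: " + ex.Message;
        }
    }
}
```

With either approach, the app also needs the android.permission.INTERNET permission declared in its manifest before Jsoup.Connect can reach the network.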

Download Code

You can download the full source code from the top of the article or from GitHub.

