Integrate Through Web Interfaces with C#

Introduction

In today's enterprise environments, a company of any size runs many applications, and quite often some of them need to work together. That is where integration comes into play. People often reach for Web Services, CORBA, COM, and similar technologies to build the middleware. For some problems these are needed; others can be solved with the idea proposed in this article: integrate through Web interfaces. This approach may prove to be widely applicable, low cost, and time saving, and it places little responsibility on the owners of the applications being integrated (often the biggest headache within an enterprise).

In the past several years, most enterprise applications have been web enabled, which means they have web interfaces for users and almost surely for administrators. For instance, some old mainframe applications have been web enabled through IBM WebSphere. For newer applications, having no web interface is almost unthinkable in this World Wide Web era. At the same time, many of these applications either provide no APIs or provide APIs that are hard for others to use because of differences in platform, programming language, and so on. If your project asks other projects for help, they may have no time, may be unwilling, or may simply be unable to help (for example, when the system comes from a third party). This is where integrating through web interfaces comes to the rescue. The author has been a technical lead on a project at AT&T Wireless Services, Inc. for the past two years and has found this approach to be fast, low cost, and in need of little help from third parties.

An Example

Consider an example the author has been working on for the past two years. A legal division needs billing information, service account information, and call details from several 2G and 3G systems, some on mainframes and others in relational databases, in order to answer requests from legal agencies. Originally, staff had to log in to six systems (with more to come in the future). Many times they had to scrape the data from the screens and put it together by hand. It is a painful and tedious process. That is why a new integrated system that can handle requests in a uniform way is badly needed.

The new system must collect all the information needed to generate the final reports from a single user request. Each of the systems involved has its own presentation layer. Accessing these systems directly would demand a lot from other divisions (creating APIs and providing details of the original architectures), and the business logic would be even harder to deal with. Fortunately, most of these systems have been web enabled, so using their web interfaces becomes a necessary (and probably the easiest) way to integrate them.

Where We Start

As for concrete technical details, this article covers only the essential part: using C# to do the integration work. The author has also used Java technologies to do the same thing in some cases, with less satisfactory results, but will discuss that elsewhere since this forum is not focused on Java technologies.

Many may wonder whether to use the System.Web namespace in .NET. The author tried it but found it is probably not feasible, since it is hard to get at the tabular data and to handle the scripting in the HTML pages. (The author hopes readers will explore whether the idea of this article can be realized with this namespace; if so, it would be a great improvement on this article.) What the author used instead are microsoft.mshtml.dll and interop.SHDocVw.dll. Their documentation can be found on the MSDN web site.

The following code performs the first step of the integration: it creates an instance of IE and interacts with it programmatically, submitting a query and retrieving data. To interact with different systems, the code would navigate to different URLs and handle the data from each system differently. Here we provide only the first conceptual step.

using System;
using System.Threading;
using mshtml;
using SHDocVw;

public static void OpenBrowser(string url)
{
    object o = null;
    InternetExplorer ie = null;
    try
    {
        ie = new InternetExplorerClass();
        IWebBrowserApp wb = (IWebBrowserApp) ie;
        wb.Navigate(url, ref o, ref o, ref o, ref o);
        wb.Visible = true;
        // There should be a better way to tell whether the page is ready.
        while (wb.Busy)
            Thread.Sleep(100); // pause between checks instead of spinning
        HTMLDocument wd = (HTMLDocument)wb.Document;
        HTMLTable myTable;
        // Fill in the input field named "para1".
        Object input = wd.all.item("para1", 0);
        if (input == null)
        {
            Console.WriteLine("no para1");
            return; // nothing to fill in; the finally block still closes IE
        }
        HTMLInputElement e = (HTMLInputElement) input;
        e.value = " the test data submitted";
        HTMLInputElement e2 = (HTMLInputElement)wd.all.item("Submit", 0);
        // Submit the form named "mytest".
        mshtml.HTMLFormElement he = (HTMLFormElement)wd.forms.item("mytest", 0);
        he.submit();
        ......
    }
    finally
    {
        if (ie != null)
            ie.Quit();
    }
}
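
The wait loop in the code above polls the Busy flag with no upper bound. One possible alternative is a small polling helper with a timeout. The following is only a minimal sketch and not part of the author's code: the Polling class, the WaitUntil method, and its parameters are hypothetical names, and the 100 ms default poll interval is an arbitrary assumption.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class Polling
{
    // Hypothetical helper: calls `condition` repeatedly until it returns true
    // or until `timeout` has elapsed. Returns true if the condition was met,
    // false if the wait timed out.
    public static bool WaitUntil(Func<bool> condition, TimeSpan timeout, int pollMilliseconds = 100)
    {
        Stopwatch clock = Stopwatch.StartNew();
        while (!condition())
        {
            if (clock.Elapsed >= timeout)
                return false;
            Thread.Sleep(pollMilliseconds); // yield the CPU between checks
        }
        return true;
    }
}
```

Assuming the wb variable from the code above, the wait might then be written as Polling.WaitUntil(() => !wb.Busy, TimeSpan.FromSeconds(30)), with the caller deciding what to do when false is returned. The 30 second limit is an illustrative value only.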

The following code shows how to retrieve the data in a table, which is used in almost any presentation. Other data can be retrieved in a similar way; consult the documentation. Note the trick of casting the cell object to the IHTMLElement type so that its text can be easily retrieved.

// Two ways to get the same table: by id, or as the first <table> element.
myTable = (HTMLTable)wd.getElementById("tid");
IHTMLElementCollection ic = wd.getElementsByTagName("table");
System.Collections.IEnumerator ice = ic.GetEnumerator();
ice.MoveNext();
HTMLTable theSameTable = (HTMLTable)ice.Current;
Console.WriteLine("There are two ways to get a table:");
Console.WriteLine(" Are tables myTable and theSameTable the same, myTable == theSameTable? " + (myTable == theSameTable) + ".\n");
// Print the table data to the console.
Console.WriteLine("Here is the data from a table: it will be used by your application.");
System.Collections.IEnumerator ee = myTable.rows.GetEnumerator();
while (ee.MoveNext())
{
    HTMLTableRow row = (HTMLTableRow) ee.Current;
    IHTMLElementCollection cells = row.cells;
    System.Collections.IEnumerator ee2 = cells.GetEnumerator();
    while (ee2.MoveNext())
    {
        IHTMLTableCell aCell = (IHTMLTableCell) ee2.Current;
        // The trick: cast the cell to IHTMLElement to read its text.
        IHTMLElement el = (IHTMLElement)aCell;
        Console.WriteLine("Cell=" + row.rowIndex + "," + aCell.cellIndex + "..." + el.innerText);
    }
}

One additional nice thing is that all session state is handled by the IE instance. As long as the browser is configured properly, one need not worry about proxies, HTTPS, and so on.

The actual integration is, of course, much more complicated because of the inter-dependencies among the systems. The data must also be massaged: unnecessary information removed, the data formatted, and the results arranged in the proper order. These things are beyond the scope of this article; only the first step that makes the integration possible is provided here.
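
To give a feel for the kind of massaging mentioned above, here is a minimal sketch that works on cell text after it has been copied out of the browser. It is not the author's actual processing: the TableCleaner class, the Clean method, and the choice to sort by the first column are all assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TableCleaner
{
    // Hypothetical clean-up step: trims every cell (treating null as empty),
    // drops rows in which every cell is blank, and orders the remaining rows
    // by the text of their first column.
    public static List<string[]> Clean(IEnumerable<string[]> rows)
    {
        return rows
            .Select(row => row.Select(cell => (cell ?? "").Trim()).ToArray())
            .Where(row => row.Any(cell => cell.Length > 0))
            .OrderBy(row => row[0], StringComparer.Ordinal)
            .ToList();
    }
}
```

Each string array stands for one table row, for example the innerText values collected from the cells of one row. Keeping this step free of browser types means it can be exercised without an IE instance.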

Summary

In summary, this article proposes a way to integrate enterprise applications through their web interfaces. Since web interfaces are widely available for existing applications, the method may well reduce the cost and time of building an integration system and get rid of the dependency on other resources.