Reader Level:
ARTICLE

Viewing and Writing XML Data using ADO.NET DataSets in VB.NET

Posted by Martin Kropp Articles | Visual Basic .NET November 10, 2012
This article presents step-by-step how to bind, read and view XML data and generate graphical reports with the build-in Crystal Reports engine. The article puts special emphasize on XML standardization and interoperability issues experienced during the project, and presents workarounds for existing limitations.
  • 0
  • 0
  • 7315

Abstract.

Based on a real world project this article shows how to handle XML data in .NET using VB.NET DataSets and DataViews. The article presents step-by-step how to bind, read and view XML data and generate graphical reports with the build-in Crystal Reports engine. The article puts special emphasize on XML standardization and interoperability issues experienced during the project, and presents workarounds for existing limitations.

Introduction.

This project was part of a larger project in the context of training acquisition and analysis. A Palm PDA is used for mobile data acquisition; these data are transferred onto a PC into a well-formed and valid XML data file. The task of this project was to develop a sophisticated data analysis and reporting application on the PC using the given XML data file.

The .NET framework was chosen for the PC application because of its great data handling capabilities, its powerful graphic functionality and the built-in reporting engine Crystal Reports. We selected the programming language VB.NET because we believe, that this is the language of the future under .NET - and because there is no major difference between the supported languages in VS.NET with respect to their offered functionality.

The idea was that both, the existing Palm application and the new VB.NET application can share one common XML file as its data source. 

XML Access Methods in .NET.

.NET offers numerous XML-related classes for reading and writing XML documents. Although the documentation states, that any class conforms to the W3C XML 1.0 standard, there are trade-offs in respect to performance, efficiency, productivity, and XML compliance between the different implementations.
So we first investigated into these different access methods and then selected the most appropriate for our application. This step also addresses important issues that must be considered when dealing with legacy valid and well-formed XML data in .NET. 

The VB.NET developer can use any of the following five methods to access XML data:

  • XmlTextReader
  • XmlValidatingReader
  • XmlDocument (and the rest of the DOM API),
  • XPathNavigator.
  • ADO.NET DataSets 

The detailed description of these methods would go beyond the scope of this article. Thus the following table summarizes very briefly the characteristics of each method and shows when to use which technology. 

 

XmlTextReader
  • You need only read access.
  • Performance is your highest priority.
  • You don't need XSD/DTD validation.
  • You don't need XSD type information at runtime.
  • You don't need XPath/XSLT services.
XmlValidatingReader
  • You need only read access.
  • You need XSD/DTD validation.
  • You need XSD type information at runtime.
XmlDocument
  • You need to update the document (read/write).
  • Productivity is your highest priority.
  • You need XPath services.
XPathNavigator
  • You need only read access.
  • You need to execute an XSLT transformation.
  • You want to leverage an implementation (like XPathDocument).
ADO.NET DataSets
  • You need to update the document (read/write).
  • You need greatest flexibility and functionality.
  • You want greatest tool support.
  • You need interop-support.

 

Table 1. XML Data Access Methods in .NET.

When dealing with complex XML Schemas, then ADO.NET DataSet offers greatest support and flexibility. The DataSet, which can be used either relational databases or XML data files, is an in-memory cache of retrieved data. It is the major component for data handling in the ADO.NET architecture.

The DataSet provides a consistent relational programming model regardless of the data source; thus it consists of a collection of DataTable objects that you can relate to each other with DataRelation objects. A DataSet reads and writes data and schema as XML documents. The data and schema can be transported across HTTP and used by any application, on any platform that is XML-enabled.

Considering all these features and the fact that this VB.NET application will need to be WebService interoperable with a server application in future we decided to go for DataSet's

XML Conformance in .NET.

Going with DataSets, ReadXML and ReadXMLSchema are the methods of choice to read the XML data, and the corresponding XML schema.

However doing so with the given valid and well-formed XML data file, raised the following runtime error: 

"The same table {description} cannot be the child table in two nested relations".

Looking deeper into the file shows that the valid and well-formed (and thus fully W3C compliant) XML file had a child table called "description" which had more than one parent table. This part of the XML Schema is shown in Figure 1 . 

ViewWr4.gif

Figure 1. Original XML Schema.

This is a valid and allowed design according to the W3C XML schema specification. (Refer: W3C definition of Parent child Relationship). Also the XMLSpy tool validated this XML schema as valid and well-formed. 

More investigation showed that the .NET XML framework does not (yet?) support the full XML 1.0 standard. It does not support recursive schemas or designs where a child node type is beneath more than one parent node type. Of course, also all classes based on DataSets, like the System.Windows.Forms.DataGrid, which we were going to use, have the same restriction. In other words: 

XML Design Issue 1:  ".NET Datasets require that a table can only be a child of one other table.

Microsoft has identified this as a bug (Q325695) and their knowledge base article [MS-Q325695] gives more information.
There are essentially three ways to solve this problem:

  • Write custom functions 
  • Write a converter 
  • Change your XML Schema 

If you can't break change the existing XML Schema (for example because you are not the owner of the file), there are two ways to solve the problem: you customize all your ode, or you write a converter.

Customizing all your code means you write special methods for the XmlReader class and add records to a table with a hierarchical type design. You also have to implement a custom filtering mechanism if you want to display this data in bound controls, and a custom writing mechanism to write to the type of hierarchical format you are reading from. Even worse, any change in the schema file will cause changes in all of your

custom functions - so this isn't really an.

The second option, if you can't change the given XML Schema, is to write a converter. That means you define a new .NET compliant XML Schema for the given XML file and make use of the given DataSet capabilities. Then you write a bi-directional converter to convert the data from one XML Schema into the other before you use it. Although this still causes additional effort, it causes the least additional effort.

If you are the owner of the XML Schema file, then solution is to change the given the given XML Schema and make it also .NET compliant. In our situation, this meant, to define three different description type structures (Idescription, Ddecscription, Fdescription), for the three occurrences of the description.

To avoid multiple definitions of the same structure (which can cause maintenance errors in the future), we made use of the inheritance capabilities of XML Schema. We derived the three description types from the still existing base structure "description". The details are shown in figure 2. 

  ViewWr5.gif

Figure 2. Avoid one child with several parents.

Reading XML Data with DataSets.

VB.NET developers have several approaches of using a DataSet in the applications. This section explains the chosen approach to create a DataSet from the modified XML Schema, the tools that were used and how we populated the DataSet from the XML source file.

A DataSet can be typed or untyped. A typed DataSet is a dataset, which is created based a given XML Schema definition file (XSD file). Information from the schema (tables, columns, etc.) is generated and compiled into the new DataSet class as a set of first-class objects and properties.

Because a typed DataSet class inherits from the base DataSet class, the typed class assumes all of the functionality of the DataSet class and can be used with methods that take an instance of a DataSet class as a parameter.

An untyped DataSet, in contrast, has no corresponding built-in schema. As in a typed dataset, an untyped DataSet contains tables, columns - but those are exposed only as collections. (However, after manually creating the tables and other data elements in an untyped DataSet, you can export the DataSet's structure as a schema using the DataSet 's WriteXmlSchema method.).

You can use either type of DataSet in your application. However, Visual Studio has more tool support for typed DataSets, and they make programming with the DataSet much easier and less error-prone. So having considered all these options the decision was to go with ADO.NET typed DataSets.

Typed Datasets can be generated with the XSD.EXE tool, which is part of the VS.NET environment. The tool accepts a valid XML Schema file as input, as well as the language to use (C#, VB).  The following line shows a typical command line of the tool which uses the XML Schema file XSDSchemaFileName.xsd. 

xsd.exe /d /l:VB.NET XSDSchemaFileName.xsd /n:XSDSchema.Namespace.

The /d directive tells the tool to generate DataSets, /l specifies the language to use, the optional /n defines the namespace to generate. The generated DataSet classes will be saved in the source file XSDSchemaFileName.vb.

Once the Typed DataSet classes are generated, the further procedure is almost a child's play. The provided methods and properties guarantee data access in a type safe manner.

So the next step was to populate the Typed DataSet at runtime from the XML file. The ReadXml() and the WriteXml() methods of the typed DataSet class do this very easily  without any difficulty as the following to lines of code show: 

Dim myDS As New DataSet.
myDS.ReadXml("input.xml", XmlReadMode.ReadSchema).

Viewing XML Data.

Having created and populated the DataSet the next step was to bind the data to the windows controls for user interaction. Since we were going to implement a grid view, this involved binding DataTables to DataGrids with user navigation facilities and providing parent-child relationships in the grid view so when a user selects a parent row, the corresponding child rows are to be shown automatically for editing purposes.

The DataTable is the primary building block of ADO.NET. The DataTable is made up of a DataColumn and DataRow collections. The columns define the schema of the DataTable and the rows make up the actual data in the DataTable. A DataView is a bindable, customized view of a DataTable. You can create multiple DataView's of the same DataTable, each one can contain different data sorted in different order. Additionally, you can add, delete, or edit information in each DataView.

DataTable and DataView use the same instances of DataColumns, i.e. they share the common structure. However DataTable and DataView each have its own row collections. The DataTable is consists of DataRow's while the DataView is made up DataRowView's. Figure 3 shows the relationship between DataSet, DataTable and DataView.

  ViewWr6.jpg

Figure 3. Tables and Views.

The data binding of typed DataSets is very smooth in VB.NET. You can bind the data sources to the WindowsForms controls at design time or runtime without any difficulty. For example, you can specify which data column to be bound to a ComboBox  in a WindowsForm  by setting  the DataSource and DataMember property of that control so that it gets automatically bound at runtime.

We used the .NET DataView class to bind DataTables to the DataGrid. DataView supports data filtering and data sorting at design time or at runtime. For convenience a default DataView is generated for the developer for customization.

In the analysis part, the application needs to display some statistical information, which is calculated from several columns and which is shown in some dynamically generated extra columns as shown in Figure 4 .

  ViewWr7.jpg

Figure 4. The Data View.

Some of the data columns needed for these calculations were in different parent and child tables. Unfortunately it was not possible join two tables in the dataset  to show up in a single DataGrid.(for example, by using the DataSet.Relations property).To further clarify the situation  say I have two Parent & child tables Table1(id,columnA,columnB) and Table2(id,columnC,columnD)  and I want resulting datagrid columns to be view like (columnA , columnB , columnC , columnD). It is similar to the situation if I had wrote a sql query like

select a.id , a.columnA , a.columnB , b.[columnC] , b.[columnD]  
From Table1 a , Table2 b 
where a.id =b.id.

It is important  to understand that DataView  is not equivalent of a SQL VIEW . A Dataset does not contain a query processor and has no mechanism to perform a join. 

The other way to do it  (most developers seems to suggest) is to simply create a new datatable

That is typed the way you want it and manually moved the data from both  datatables  to the new one. Also any data manipulation has to copy over to the original tables using a nested loops. This is a brute force method. It would be nice to here if anyone come up with something else.

Finally more references were added to the  schema file. But that is not the best workaround. The reason that the columns could not be merged is founded in the relational programming model of the DataSets. The existing XML file uses nested child tables instead of references. This is not (yet?) resolved by the .NET DataSets, so that references are missing to cross-reference tables and to establish relationships. 

XML Design Issue 2:  "Use references in XML data files instead of nested child tables to show up a joined table  in a single DataGrid".

So the next change was to introduce references where needed.

  ViewWr8.gif

Figure 5. Adding references to tables.

In our application we use the Date column as a filter criteria for the generated graphs in the Crystal Report Engine. Thus in Figure 5 a Date column has been added to the session table, which is a direct reference of the date column of the day table. 

Reporting XML Data.

Having completed the data binding to Windows forms the next task was to generate the summarized reports in both graphical and tabular forms. We have chosen the Crystal Reports Engine (which is delivered with the VS.NET environment) with its powerful dynamic reporting capabilities.

Crystal Reports for Visual Studio .NET offers the following three object models:

  • The Engine Object Model.
  • The Windows Forms Viewer Object Model.
  • The Web Forms Viewer Object Model.

The top level Engine object model is the report document object. The report document object offers sophisticated control over your object. It contains all the properties and methods needed to interface with and customize a report. How ever it does not provide the display of the reports.

This is where viewer objects and the report document objects come together. The Windows Forms viewer allows you to view a report in a Windows application and the Web Forms viewer does it for a Web application.

So we selected the Engine Object model together with the Windows Object Model to accomplish the requirements. The Vb.NET CrystalDecisions.Windows.Forms namespace in the CrystalDecisions.Windows.Forms.dll assembly provides support for the Windows Forms Viewer control and its associated classes. It can also dynamically update the report being hosted and interact with controls within a windows application.

When using the Crystal Report Engine you basically have two options for the selection of data source. The first one is passing the raw XML file, and the second is passing the populated DataSet to the Crystal Report Engine.

When using the first approach you have to verify on every usage, that the XML file contains correct data types (data validation). Otherwise Crystal Report Engine would interpret the data types in the XML files differently (e.g. float are used as strings), which prevents the data from being processed further within Crystal Reports.

The second, recommended option is passing the already populated typed DataSets to the Crystal Report Engine. This saves the runtime verification of data types. The content and the scope of the report is determined at runtime according to the user selection of the required report format. Crystal Reports provides template files for this. Figure 6 shows a sample report.

ViewWr9.jpg

Figure 6. Crystal Report View.

Summary.

Visual Studio .NET has a very rich set of tightly integrated tools that work together to support C# application developer. For XML based applications it provides schema generation and validation tools. It offers the methods of choice for XML handling. Especially the DataSet classes make reading and writing of XML data extremely easy. The DataSet classes provide easy data binding to the GUI and to even third-party tools like Crystal Reports for advanced reporting.

Other very nice new features are the new source code documentation support and application installer production. The code documentation tool generates a very nice HTML documentation (they improved the JavaDoc style), which is completely XML based. The integrated application installer allows easy creation of a setup program for your application.

Despite the currently existing limitations of DataSets concerning its XML standard support, it is the tool of choice for XML data handling with .NET.

Once your XML data sources confirm to the ".NET XML standard" the rest can be handled without much difficulty. It offers lot of built-in powerful classes and methods for reading, writing and displaying data. This allows the developer to concentrate on the more important application logic rather than having to worry about data handling issues or framework specifics.

References.

[MSVSNET] Visual Studio .NET Help, Microsoft Visual Studio IDE v7.0.

[W3CXML] Extensible Markup Language (XML) 1.0, 2nd Ed., http://www.w3.org/TR/2000/REC-xml-20001006, Oct 2000.

 [W3CDOM] Document Object Model Technical Reports http://www.w3.org/DOM/DOMTR.

[W3CCHI] W3C definition of Parent child Relationship, http://www.w3.org/TR/2000/sec-well-formed.

[MS-Q325695] "Error Message When You View Multi-Dimensional XSD Schemas in the Visual Studio .NET XML Schema Designer" http://support.microsoft.com/default.aspx?scid=kb;en-us;Q325695.

COMMENT USING

Trending up