In Focus

Validation Of Excel Data In C#

This article demonstrates how to validate an Excel sheet before uploading or storing its data into a database.

Introduction

This article demonstrates how to validate an Excel sheet before uploading or storing its data into a database.

I have seen many developers implementing various types of logic to achieve the same. Some developers fetch the Excel data and validate (data type, length of data etc.) each data one by one.

This is a time-consuming process if you have a large amount of data (more than 1000 records) in Excel. This article demonstrates how to validate an Excel data without consuming a lot of time.

Getting Started

Before starting a demonstration let’s understand about this demonstration. This demonstration is useful while validating an Excel sheet before uploading it to the server in both web and Windows-based applications. But here, this demonstration uses Windows-based application and it does not contain Upload coding part, it contains only validation part of coding.

It first,reads an Excel sheet's data using OLDB and stores its data into a Dataset, then validates the Excel sheet and displays validation message in error panel if the error exists.

It does two types of validation - one is table structure or schema and another one is data validation. Validation happens with the help of XmlReaderSettings class which comes under System.Xml namespace.

Validating schema means validating the structure of the table, the number of columns should be same as a schema of the Excel sheet. Let’s talk about schema.

Schema

It is the recommendation of the World Wide Web Consortium (W3C) that specifies how to formally describe the elements in an Extensible Markup Language (XML) document. It can be used by programmers to verify each piece of item content in a document. They can check if it adheres to the description of the element it is placed in.

  1. The purpose of an XML Schema is to define the legal building blocks of an XML document:
  2. the elements and attributes that can appear in a document
  3. the number of (and order of) child elements
  4. data types for elements and attributes
  5. default and fixed values for elements and attributes

For validating the schema of  Excel sheet, the ValidationType property of class XmlReaderSettings needs to be set ValidationType.Schema. The below code is sample example for validating the schema of Excel sheet.

  1. //XmlSchemaSet schemas;  
  2.            schemas = new XmlSchemaSet();  
  3.            schemas.Add("http://www.deitel.com/booklist", schema + "\\StudentsSchema.xsd");  
  4.   
  5.            settings = new XmlReaderSettings();  
  6.            settings.ProhibitDtd = false;  
  7.            settings.CheckCharacters = true;  
  8.            settings.ValidationType = ValidationType.DTD;  
  9.            settings.Schemas = schemas;  
  10.            settings.ValidationEventHandler += new ValidationEventHandler(settings_ValidationEventHandler);  
  11.            Uri uri = new Uri(schema + "\\StudentDTD.dtd");  
  12.            //settings.XmlResolver.ResolveUri(uri, null);  
  13.   
  14.            XmlReader reader = XmlReader.Create(schema + "\\Students.xml", settings);  
  15.            while (reader.Read()) ;  
  16.            if (valid)  
  17.            {  
  18.                writertbox("Document SCHEMA is valid");  
  19.            }  // end if  
  20.            valid = true;  
  21.            reader.Close();  

Same as above for validating, you need to set the ValidationType property to ValidationType.DTD. The below code is a sample example for validating the data of Excel sheet. Data validaton means validation of data, such as - type of data and length of data should exist etc.

  1. XmlReaderSettings settings = new XmlReaderSettings();  
  2.   
  3.             //settings.ProhibitDtd = false;  
  4.             settings.DtdProcessing = DtdProcessing.Parse;  
  5.             settings.ValidationType = ValidationType.DTD;  
  6.   
  7.             settings.ValidationEventHandler += new ValidationEventHandler(ValidationCallBack);  
  8.             XmlReader reader = XmlReader.Create(Path + "\\Students.xml", settings);  
  9.   
  10.   
  11.   
  12.             // Parse the file.  
  13.   
  14.             try  
  15.             {  
  16.   
  17.                 while (reader.Read()) ;  
  18.   
  19.                 if (valid)  
  20.                 {  
  21.                     writertbox("Document contains valid data");  
  22.                 }  // end if  
  23.             }  
  24.             catch  
  25.             {  
  26.   
  27.   
  28.             }  

Demonstration

Open whatever version  of Visual Studio you have, but the framework should be minimum 3.0 and VS2010 for smoothly running this demonstration.

Select WPF as a project template, then give the name of the project as “Validate Excel”.

Take controls like textbox, textblock, button, and list box like below image. The textbox and button are taken in this demonstration to browse the Excel file and list box whose border color is red for displaying the messages.



Create a function named ‘ReadExcelFile’ and use the below code to read the Excel data.
  1. private DataSet ReadExcelFile(string excelPath) {  
  2.     DataSet ds = new DataSet();  
  3.     string connectionString = GetConnectionString(excelPath);  
  4.     using(OleDbConnection conn = new OleDbConnection(connectionString)) {  
  5.         conn.Open();  
  6.         OleDbCommand cmd = new OleDbCommand();  
  7.         cmd.Connection = conn;  
  8.         // Get all Sheets in Excel File  
  9.         // DataTable dtSheet = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);  
  10.         // Loop through all Sheets to get data  
  11.         // foreach (DataRow dr in dtSheet.Rows[2])  
  12.         // {  
  13.         string sheetName = "Student$";  
  14.         //if (!sheetName.EndsWith("$"))  
  15.         // continue;  
  16.         // Get all rows from the Sheet  
  17.         cmd.CommandText = "SELECT * FROM [" + sheetName + "]";  
  18.         DataTable dt = new DataTable();  
  19.         dt.TableName = sheetName.Replace("$", string.Empty);  
  20.         OleDbDataAdapter da = new OleDbDataAdapter(cmd);  
  21.         da.Fill(dt);  
  22.         ds.Tables.Add(dt);  
  23.         //}  
  24.         //string schema = ds.GetXmlSchema();  
  25.         cmd = null;  
  26.         conn.Close();  
  27.     }  
  28.     return ds;  
  29. }  
  30. private string GetConnectionString(string excelPath) {  
  31.     Dictionary < string, string > props = new Dictionary < string, string > ();  
  32.     // XLSX - Excel 2007, 2010, 2012, 2013  
  33.     props["Provider"] = "Microsoft.ACE.OLEDB.12.0;";  
  34.     props["Extended Properties"] = "Excel 12.0 XML";  
  35.     props["Data Source"] = excelPath;  
  36.     // XLS - Excel 2003 and Older  
  37.     //props["Provider"] = "Microsoft.Jet.OLEDB.4.0";  
  38.     //props["Extended Properties"] = "Excel 8.0";  
  39.     //props["Data Source"] = "C:\\MyExcel.xls";  
  40.     StringBuilder sb = new StringBuilder();  
  41.     foreach(KeyValuePair < string, string > prop in props) {  
  42.         sb.Append(prop.Key);  
  43.         sb.Append('=');  
  44.         sb.Append(prop.Value);  
  45.         sb.Append(';');  
  46.     }  
  47.     return sb.ToString();  
  48. }  

Create another function with name ‘CreateXML’ to convert the Excel data into XML file for validating the data. Use the below code to create an XML file.

  1. private void CreateXML(string path,DataSet ds)  
  2.         {  
  3.               StringBuilder sb = new StringBuilder();  
  4.                 sb.Append("<?xml version='1.0' standalone='no'?>");  
  5.                 sb.Append(Environment.NewLine);  
  6.                 sb.Append("<!DOCTYPE NewDataSet SYSTEM '"+path+"\\StudentDTD1.dtd'>");  
  7.                 sb.Append(Environment.NewLine);  
  8.                 sb.Append(ds.GetXml());  
  9.               string ab=sb.ToString();  
  10.               XmlDocument doc = new XmlDocument();  
  11.               doc.LoadXml(ab);  
  12.               doc.Save(path + "\\Students.xml");   
  13.         }  

The above functions are related to the Excel sheet. Now, we will create an XML schema file. In this demonstration, I have created a schema file with the help of DataSet’s WriteXMLSchema function. You can generate in the same way or you can create manually.

Below schema is for validation structure of demonstrational Excel sheet.

  1. <?xml version="1.0" encoding="utf-8"?>  
  2. <xs:schema id="NewDataSet" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">  
  3.   <xs:element name="NewDataSet" msdata:IsDataSet="true" msdata:UseCurrentLocale="true">  
  4.     <xs:complexType>  
  5.       <xs:choice minOccurs="0" maxOccurs="unbounded">  
  6.         <xs:element name="Student">  
  7.           <xs:complexType>  
  8.             <xs:sequence>  
  9.               <xs:element name="RollNo" type="xs:double" minOccurs="0" />  
  10.               <xs:element name="StudentName" type="xs:string" minOccurs="0" />  
  11.               <xs:element name="DOB" type="xs:dateTime" minOccurs="0" />  
  12.               <xs:element name="Class" type="xs:string" minOccurs="0" />  
  13.             </xs:sequence>  
  14.           </xs:complexType>  
  15.         </xs:element>  
  16.       </xs:choice>  
  17.     </xs:complexType>  
  18.   </xs:element>  
  19. </xs:schema>  
Given below is the DTD for validating the data of Excel sheet.
  1. <!ELEMENT NewDataSet (Student)*>   
  2. <!ELEMENT Student (RollNo,StudentName, DOB, Class)>  
  3. <!ELEMENT RollNo (#PCDATA)>  
  4. <!ELEMENT StudentName (#PCDATA)>  
  5. <!ELEMENT DOB (#PCDATA)>  
  6. <!ELEMENT Class (#PCDATA)>  
Create two more functions with the name ‘ValidateExcelSchema’ and ‘ValidateExcelData’ for writing the codes for validating schema and data of the Excel sheet. I have already written in the upper part of this article for validating the schema and data. Use the above code for the same or see the below code.
  1. private bool ValidateExcelSchema(string schema)  
  2.        {  
  3.   
  4.            //XmlSchemaSet schemas;  
  5.            schemas = new XmlSchemaSet();  
  6.            schemas.Add("http://www.deitel.com/booklist", schema + "\\StudentsSchema.xsd");  
  7.   
  8.            settings = new XmlReaderSettings();  
  9.            settings.ProhibitDtd = false;  
  10.            settings.CheckCharacters = true;  
  11.            settings.ValidationType = ValidationType.DTD;  
  12.            settings.Schemas = schemas;  
  13.            settings.ValidationEventHandler += new ValidationEventHandler(ValidationCallBack);  
  14.            Uri uri = new Uri(schema + "\\StudentDTD.dtd");  
  15.            //settings.XmlResolver.ResolveUri(uri, null);  
  16.   
  17.            XmlReader reader = XmlReader.Create(schema + "\\Students.xml", settings);  
  18.            while (reader.Read()) ;  
  19.            if (valid)  
  20.            {  
  21.                writertbox("Document SCHEMA is valid");  
  22.            }  // end if  
  23.            valid = true;  
  24.            reader.Close();  
  25.   
  26.            return false;  
  27.        }  
  28.        private void ValidateExcelData(string Path)  
  29.        {  
  30.            XmlReaderSettings settings = new XmlReaderSettings();  
  31.   
  32.            //settings.ProhibitDtd = false;  
  33.            settings.DtdProcessing = DtdProcessing.Parse;  
  34.            settings.ValidationType = ValidationType.DTD;  
  35.   
  36.            settings.ValidationEventHandler += new ValidationEventHandler(ValidationCallBack);  
  37.            XmlReader reader = XmlReader.Create(Path + "\\Students.xml", settings);  
  38.   
  39.   
  40.   
  41.            // Parse the file.  
  42.   
  43.            try  
  44.            {  
  45.   
  46.                while (reader.Read()) ;  
  47.   
  48.                if (valid)  
  49.                {  
  50.                    writertbox("Document contains valid data");  
  51.                }  // end if  
  52.            }  
  53.            catch  
  54.            {  
  55.   
  56.   
  57.            }  
  58.        }  
  59.        private void ValidationCallBack(Object sender, ValidationEventArgs args)  
  60.        {  
  61.            //Display the validation error.  This is only called on error  
  62.            //bool m_Success = false; //Validation failed  
  63.            valid = false; // validation failed  
  64.            writertbox("Validation error: " + args.Message);  
  65.        }  

I have attached the Excel sheet which is taken in this demonstration and the above details demonstrated how to create a project in VS for validating the Excel sheet. Now, we will discuss the steps to validate an Excel sheet.

First, we will validate the schema or table structure of Excel, remove or add one table of attached Excel sheet, and browse in your demo application.

You will get an error message regarding the schema like on the below image.

 

For validation of the data set of the number of columns of an Excel sheet according to the schema give alphabet or string values in the cell of the of roll number column and browse the Excel sheet.

 
If now there is an error in the Excel sheet, you will get a positive message like the below image.