Linq To Excel In Action

Introduction

In this post, we will demonstrate how we will use LinqToExcel library to query data from Excel.

Previously, I didn’t have any idea about creating queries using LINQ from excel spreadsheets, but when I searched, I found LinqToExcel library which is available at paulyoder GitHub repository. This library, as we will see later, presents so many mechanisms that enable us to use LINQ in order to build the queries and get the data from Excel spreadsheets.

I'd like to point out that LinqToExcel library is available in both a 32bits and 64 bits. You can select one of them that matches with your application settings. I hope you will like it.

LinqToExcel library

In this post, we are going to,

  • Create Console application.
  • Install LinqToExcel Packages.
  • Create ConnexionExcel class.
  • Queries with LinqToExcel.

Create Console application

Open Visual Studio and select File >> New Project.

The "New Project" window will pop up. Select Console App (.NET Framework), name your project, and click OK.

LinqToExcel library

Install LinqToExcel Packages

At this level, we have 3 possibilities to install Linq to Excel packages.

  • Using NuGet Manager
    In Solution Explorer, right-click on References >> Manage NuGet Packages.

    LinqToExcel library

    Next, type LinqToExcel in search text box, select the first line as shown above, and click on "Install" button.
  • Using Package Manager Console
    Enter the following command in the package manager console.

    LinqToExcel library
  • The last solution is to add LinqToExcel references directly in your project. You can download them from the following here.

Let’s start discovering some features of LinqToExcel library.

Before of all, create a new Excel file in your local machine, and name it as you like. 

Next, you can add the following data rows into the first sheet as shared below.

LinqToExcel library

Create ConnexionExcel class

Here, we created ConnexionExcel class which contains two properties.

  • PathExcelFile: contains location path of excel file.
  • UrlConnexion: gets an instance of ExcelQueryFactory.

  1. public class ConnexionExcel {  
  2.     public string _pathExcelFile;  
  3.     public ExcelQueryFactory _urlConnexion;  
  4.     public ConnexionExcel(string path) {  
  5.         this._pathExcelFile = path;  
  6.         this._urlConnexion = new ExcelQueryFactory(_pathExcelFile);  
  7.     }  
  8.     public string PathExcelFile {  
  9.         get {  
  10.             return _pathExcelFile;  
  11.         }  
  12.     }  
  13.     public ExcelQueryFactory UrlConnexion {  
  14.         get {  
  15.             return _urlConnexion;  
  16.         }  
  17.     }  
  18. }  

 

Query a worksheet with a header row

We are creating a query that returns data from the worksheet by using header row. We should be using a class which has the following properties that match with column names of the worksheet. The compiler expects the worksheet must be named “Sheet1”.

So let’s create product class with the following properties.

  1. public class Product {  
  2.     public int ProductId {  
  3.         get;  
  4.         set;  
  5.     }  
  6.     public string ProductName {  
  7.         get;  
  8.         set;  
  9.     }  
  10.     public string CategoryName {  
  11.         get;  
  12.         set;  
  13.     }  
  14. }  

Query 1

  1. string pathToExcelFile = @ "D:\ExcelData.xlsx";  
  2. ConnexionExcel ConxObject = new ConnexionExcel(pathToExcelFile);  
  3. //Query a worksheet with a header row  
  4. var query1 = from a in ConxObject.UrlConnexion.Worksheet < Product > ()  
  5. select a;  
  6. foreach(var result in query1) {  
  7.     string products = "ProductId : {0}, ProductName: {1}";  
  8.     Console.WriteLine(string.Format(products, result.ProductId, result.ProductName));  
  9. }  
  10. Console.ReadKey();  

Result

LinqToExcel library

Query a specific worksheet by name

By default, the worksheet is named “Sheet1” but if you want to name it with a different name, you can proceed as mentioned below.

 In this example, our worksheet is named “Products” and we want to select all data from Products sheet using LINQ.

Query 2

  1. var query2 = from a in ConxObject.UrlConnexion.Worksheet < Product > ("Products"// Products worksheet  
  2. select a;  
  3. foreach(var result in query2) {  
  4.     string products = "ProductId : {0}, ProductName: {1}";  
  5.     Console.WriteLine(string.Format(products, result.ProductId, result.ProductName));  
  6. }  
  7. Console.ReadKey();  

Result

LinqToExcel library

Another query, if we want to get products which their productName start with J letter

Query 3

  1. var query3 = from p in ConxObject.UrlConnexion.Worksheet < Product > ("Products")  
  2. where p.ProductName.StartsWith("J")  
  3. select new {  
  4.     p.ProductName  
  5. };  
  6. foreach(var result in query3) {  
  7.     string products = "ProductName : {0}";  
  8.     Console.WriteLine(string.Format(products, result.ProductName));  
  9. }  

 Result

LinqToExcel library

Now, we want to create a query that selects all products which lie either in the category of “Cars” or “Clothing”.

Query 4

  1. var query4 = from p in ConxObject.UrlConnexion.Worksheet < Product > ("Products")  
  2. where p.CategoryName.Equals("Cars") || p.CategoryName.Equals("Clothing")  
  3. select new {  
  4.     p.ProductName, p.CategoryName  
  5. };  
  6. foreach(var result in query4) {  
  7.     string products = "ProductName : {0}, CategoryName : {1}";  
  8.     Console.WriteLine(string.Format(products, result.ProductName, result.CategoryName));  
  9. }  
  10. Console.ReadKey();  

 Result

LinqToExcel library

Property to column mapping

To map column names of the worksheet, we have two possibilities, either using AddMapping() method or using the ExcelColumn attribute on the properties.

So, let’s see an example.

AddMapping() Method

  1. ConxObject.UrlConnexion.AddMapping < Product > (x => x.CategoryName, "CategoryName");  
  2. var query5 = from p in ConxObject.UrlConnexion.Worksheet < Product > ("Products")  
  3. where p.CategoryName.Equals("Electronics")  
  4. select p;  
  5. foreach(var result in query5) {  
  6.     string products = "ProductName : {0}, CategoryName : {1}";  
  7.     Console.WriteLine(string.Format(products, result.ProductName, result.CategoryName));  
  8. }  
  9. Console.ReadKey();  

As you can see above, we have mapped CategoryName property of product class with Category Name column of Products worksheet. The query returns all products that have Electronics as the category name.

ExcelColumn Attribute

We can also decorate CategoryName property by ExcelColumn attribute and pass column name which we want to map as an argument.

  1. [ExcelColumn("CategoryName")]  
  2. public string CategoryName {  
  3.     get;  
  4.     set;  
  5. }  

Result

LinqToExcel library

Query a worksheet without a header row

A worksheet that does not have a header row can also be queried using WorksheetNoHeader() method.

Note that the cell values will be referenced by index.

  1. var query6 = from p in ConxObject.UrlConnexion.WorksheetNoHeader()  
  2. where p[2].Equals("Clothing")  
  3. select p[1];  
  4. foreach(var result in query6) {  
  5.     Console.WriteLine(result);  
  6. }  
  7. Console.ReadKey();  

Result

LinqToExcel library

Query a specific range within a worksheet

Here, we can select data between range by using WorksheetRange(startRange, endRange) method

Query 7

  1. var query7 = from c in ConxObject.UrlConnexion.WorksheetRange < Product > ("B1""C8")  
  2. where c.CategoryName == "Clothing"  
  3. select new {  
  4.     c.ProductName, c.CategoryName  
  5. };  
  6. foreach(var result in query7) {  
  7.     string products = "ProductName : {0}, CategoryName : {1}";  
  8.     Console.WriteLine(string.Format(products, result.ProductName, result.CategoryName));  
  9. }  
  10. Console.ReadKey(); 

Result

LinqToExcel library

Query a specific worksheet by index

The worksheets index order is based on their name alphabetically, that means if a spreadsheet contains two worksheets (Product, Category). Even though the Category is the second worksheet in Excel, it is considered the first index.

Query 8

  1. var query8 = from c in ConxObject.UrlConnexion.Worksheet < Product > (1)  
  2. where c.ProductId > 5  
  3. select c;  
  4. foreach(var result in query8) {  
  5.     string products = "Product Id : {0}, Product Name : {1}";  
  6.     Console.WriteLine(string.Format(products, result.ProductId, result.ProductName));  
  7. }  
  8. Console.ReadKey();  

Result

LinqToExcel library

Query WorkSheet Names

To get all worksheet names in a spreadsheet, we will use the GetWorksheetNames() method

  1. var worksheetNames = ConxObject.UrlConnexion.GetWorksheetNames();  
  2. foreach(var result in worksheetNames) {  
  3.     Console.WriteLine(result);  
  4. }  
  5. Console.ReadKey();  

Result

LinqToExcel library

Query Column Names

To select all column names in a worksheet. We will use the GetColumnNames() method.

  1. var columnNames = ConxObject.UrlConnexion.GetColumnNames("Products");  
  2. foreach(var result in columnNames) {  
  3.     Console.WriteLine(result);  
  4. }  

Result

LinqToExcel library

Persistent Connection

If you want to use the same connection on all queries performed by the IExcelQueryFactory, then set the UsePersistentConnection property to true.

Make sure to dispose of the ExcelQueryFactory if you use a persistent connection.

  1. ConxObject.UrlConnexion.UsePersistentConnection = true;  
  2. try {  
  3.     var query9 = from c in ConxObject.UrlConnexion.Worksheet < Product > (1)  
  4.     where c.ProductId > 5  
  5.     select c;  
  6. catch (Exception) {  
  7.     throw;  
  8. finally {  
  9.     ConxObject.UrlConnexion.Dispose();  
  10. }  

That’s all. Please share your feedback and queries in the comments box.