Linq To Excel In Action

Introduction

 
In this post, we will demonstrate how we will use the LinqToExcel library to query data from Excel.
 
Previously, I didn’t have any idea about creating queries using LINQ from excel spreadsheets, but when I searched, I found LinqToExcel library which is available at paulyoder GitHub repository. This library, as we will see later, presents so many mechanisms that enable us to use LINQ in order to build the queries and get the data from Excel spreadsheets.
 
I'd like to point out that the LinqToExcel library is available in both 32bits and 64 bits. You can select one of them that matches your application settings. I hope you will like it.
 
LinqToExcel library
 
In this post, we are going to,
  • Create a Console application.
  • Install LinqToExcel Packages.
  • Create ConnexionExcel class.
  • Queries with LinqToExcel.
Create a Console application
 
Open Visual Studio and select File >> New Project.
 
The "New Project" window will pop up. Select Console App (.NET Framework), name your project, and click OK.
 
LinqToExcel library
 
Install LinqToExcel Packages
 
At this level, we have 3 possibilities to install Linq to Excel packages.
  • Using NuGet Manager
     
    In Solution Explorer, right-click on References >> Manage NuGet Packages.
     
    LinqToExcel library
     
    Next, type LinqToExcel in the search text box, select the first line as shown above, and click on the "Install" button.
  • Using the Package Manager Console
     
    Enter the following command in the package manager console.
     
    LinqToExcel library
  • The last solution is to add LinqToExcel references directly in your project. You can download them from the following here.
Let’s start discovering some features of the LinqToExcel library.
 
Before of all, create a new Excel file in your local machine, and name it as you like. 
 
Next, you can add the following data rows into the first sheet as shared below.
 
LinqToExcel library
 
Create ConnexionExcel class
 
Here, we created ConnexionExcel class which contains two properties.
  • PathExcelFile: contains the location path of the excel file.
  • UrlConnexion: gets an instance of ExcelQueryFactory.
  1. public class ConnexionExcel {  
  2.     public string _pathExcelFile;  
  3.     public ExcelQueryFactory _urlConnexion;  
  4.     public ConnexionExcel(string path) {  
  5.         this._pathExcelFile = path;  
  6.         this._urlConnexion = new ExcelQueryFactory(_pathExcelFile);  
  7.     }  
  8.     public string PathExcelFile {  
  9.         get {  
  10.             return _pathExcelFile;  
  11.         }  
  12.     }  
  13.     public ExcelQueryFactory UrlConnexion {  
  14.         get {  
  15.             return _urlConnexion;  
  16.         }  
  17.     }  
  18. }  
Query a worksheet with a header row
 
We are creating a query that returns data from the worksheet by using the header row. We should be using a class that has the following properties that match with column names of the worksheet. The compiler expects the worksheet must be named “Sheet1”.
 
So let’s create product class with the following properties.
  1. public class Product {  
  2.     public int ProductId {  
  3.         get;  
  4.         set;  
  5.     }  
  6.     public string ProductName {  
  7.         get;  
  8.         set;  
  9.     }  
  10.     public string CategoryName {  
  11.         get;  
  12.         set;  
  13.     }  
  14. }  
Query 1
  1. string pathToExcelFile = @ "D:\ExcelData.xlsx";  
  2. ConnexionExcel ConxObject = new ConnexionExcel(pathToExcelFile);  
  3. //Query a worksheet with a header row  
  4. var query1 = from a in ConxObject.UrlConnexion.Worksheet < Product > ()  
  5. select a;  
  6. foreach(var result in query1) {  
  7.     string products = "ProductId : {0}, ProductName: {1}";  
  8.     Console.WriteLine(string.Format(products, result.ProductId, result.ProductName));  
  9. }  
  10. Console.ReadKey();  
Result
 
LinqToExcel library
 
Query a specific worksheet by name
 
By default, the worksheet is named “Sheet1” but if you want to name it with a different name, you can proceed as mentioned below.
 
In this example, our worksheet is named “Products” and we want to select all data from the Products sheet using LINQ.
 
Query 2
  1. var query2 = from a in ConxObject.UrlConnexion.Worksheet < Product > ("Products"// Products worksheet  
  2. select a;  
  3. foreach(var result in query2) {  
  4.     string products = "ProductId : {0}, ProductName: {1}";  
  5.     Console.WriteLine(string.Format(products, result.ProductId, result.ProductName));  
  6. }  
  7. Console.ReadKey();  
Result
 
LinqToExcel library
 
Another query, if we want to get products which their productName start with J letter
 
Query 3
  1. var query3 = from p in ConxObject.UrlConnexion.Worksheet < Product > ("Products")  
  2. where p.ProductName.StartsWith("J")  
  3. select new {  
  4.     p.ProductName  
  5. };  
  6. foreach(var result in query3) {  
  7.     string products = "ProductName : {0}";  
  8.     Console.WriteLine(string.Format(products, result.ProductName));  
  9. }  
 Result
 
LinqToExcel library
 
Now, we want to create a query that selects all products which lie either in the category of “Cars” or “Clothing”.
 
Query 4
  1. var query4 = from p in ConxObject.UrlConnexion.Worksheet < Product > ("Products")  
  2. where p.CategoryName.Equals("Cars") || p.CategoryName.Equals("Clothing")  
  3. select new {  
  4.     p.ProductName, p.CategoryName  
  5. };  
  6. foreach(var result in query4) {  
  7.     string products = "ProductName : {0}, CategoryName : {1}";  
  8.     Console.WriteLine(string.Format(products, result.ProductName, result.CategoryName));  
  9. }  
  10. Console.ReadKey();  
Result
 
LinqToExcel library
 
Property to column mapping
 
To map column names of the worksheet, we have two possibilities, either using the AddMapping() method or using the ExcelColumn attribute on the properties.
 
So, let’s see an example.
 
AddMapping() Method
  1. ConxObject.UrlConnexion.AddMapping < Product > (x => x.CategoryName, "CategoryName");  
  2. var query5 = from p in ConxObject.UrlConnexion.Worksheet < Product > ("Products")  
  3. where p.CategoryName.Equals("Electronics")  
  4. select p;  
  5. foreach(var result in query5) {  
  6.     string products = "ProductName : {0}, CategoryName : {1}";  
  7.     Console.WriteLine(string.Format(products, result.ProductName, result.CategoryName));  
  8. }  
  9. Console.ReadKey();  
As you can see above, we have mapped the CategoryName property of the product class with the Category Name column of the Products worksheet. The query returns all products that have Electronics as the category name.
 
ExcelColumn Attribute
 
We can also decorate CategoryName property by ExcelColumn attribute and pass column name which we want to map as an argument.
  1. [ExcelColumn("CategoryName")]  
  2. public string CategoryName {  
  3.     get;  
  4.     set;  
  5. }  
Result
 
LinqToExcel library
 
Query a worksheet without a header row
 
A worksheet that does not have a header row can also be queried using WorksheetNoHeader() method.
 
Note that the cell values will be referenced by the index.
  1. var query6 = from p in ConxObject.UrlConnexion.WorksheetNoHeader()  
  2. where p[2].Equals("Clothing")  
  3. select p[1];  
  4. foreach(var result in query6) {  
  5.     Console.WriteLine(result);  
  6. }  
  7. Console.ReadKey();  
Result
 
LinqToExcel library
 
Query a specific range within a worksheet
 
Here, we can select data between range by using WorksheetRange(startRange, endRange) method
 
Query 7
  1. var query7 = from c in ConxObject.UrlConnexion.WorksheetRange < Product > ("B1""C8")  
  2. where c.CategoryName == "Clothing"  
  3. select new {  
  4.     c.ProductName, c.CategoryName  
  5. };  
  6. foreach(var result in query7) {  
  7.     string products = "ProductName : {0}, CategoryName : {1}";  
  8.     Console.WriteLine(string.Format(products, result.ProductName, result.CategoryName));  
  9. }  
  10. Console.ReadKey(); 
Result
 
LinqToExcel library
 
Query a specific worksheet by index
 
The worksheets index order is based on their name alphabetically, which means if a spreadsheet contains two worksheets (Product, Category). Even though Category is the second worksheet in Excel, it is considered the first index.
 
Query 8
  1. var query8 = from c in ConxObject.UrlConnexion.Worksheet < Product > (1)  
  2. where c.ProductId > 5  
  3. select c;  
  4. foreach(var result in query8) {  
  5.     string products = "Product Id : {0}, Product Name : {1}";  
  6.     Console.WriteLine(string.Format(products, result.ProductId, result.ProductName));  
  7. }  
  8. Console.ReadKey();  
Result
 
LinqToExcel library
 
Query WorkSheet Names
 
To get all worksheet names in a spreadsheet, we will use the GetWorksheetNames() method
  1. var worksheetNames = ConxObject.UrlConnexion.GetWorksheetNames();  
  2. foreach(var result in worksheetNames) {  
  3.     Console.WriteLine(result);  
  4. }  
  5. Console.ReadKey();  
Result
 
LinqToExcel library
 
Query Column Names
 
To select all column names in a worksheet. We will use the GetColumnNames() method.
  1. var columnNames = ConxObject.UrlConnexion.GetColumnNames("Products");  
  2. foreach(var result in columnNames) {  
  3.     Console.WriteLine(result);  
  4. }  
Result
 
LinqToExcel library
 
Persistent Connection
 
If you want to use the same connection on all queries performed by the IExcelQueryFactory, then set the UsePersistentConnection property to true.
 
Make sure to dispose of the ExcelQueryFactory if you use a persistent connection.
  1. ConxObject.UrlConnexion.UsePersistentConnection = true;  
  2. try {  
  3.     var query9 = from c in ConxObject.UrlConnexion.Worksheet < Product > (1)  
  4.     where c.ProductId > 5  
  5.     select c;  
  6. catch (Exception) {  
  7.     throw;  
  8. finally {  
  9.     ConxObject.UrlConnexion.Dispose();  
  10. }  
That’s all. Please share your feedback and queries in the comments box.