TransposeBy - Extending Excel With C# And Excel-DNA

Intro

At some point, I think all of us have received an Excel workbook filled with badly formatted data that we somehow have to transform into a neatly formatted report. Several times, I've gotten a column of data that contains records comprising multiple fields, like this.

C#

 

When what I really need is this,

C#

Excel does contain an inbuilt function called Transpose but it’s quite limited in that it just flips a row to a column and vice versa.  As I don’t want to have to reformat the data by hand, preferring an automated solution, I have a couple of options open to me.  I can create a VBA based macro or an Office Add-In, but I don’t really want to go that route.  Macros can be difficult to add to every workbook I use, and the Add-In will require me to add a button to the tool ribbon which I want to avoid.  The last option is to create a custom User Defined Function that will be available to me whenever I use Excel, which as it turns out, is very easy to do in a C# project thanks to an open source framework called Excel-DNA.

Excel-DNA makes the creation of a .xll, the Excel add-in format (It’s similar to a .dll but specific to Excel) extremely simple.  There are commercial solutions available, but you’ll probably only ever need this framework.

Set-up

The function I ended up creating is called TransposeBy and I’ll be using the project code to illustrate this article.  That source code can be found on GitHub in this repository.  Creation of the project is pretty easy so I’m going to assume you can handle the following steps:

  1. Create a C# class library project in Visual Studio.
  2. Using NuGet, install the following libraries:

 

    • ExcelDNA.AddIn
    • ExcelDNA.Integration
    • ExcelDNA.Intellisense (Optional)

 

And now, you’re ready to go.

The Solution

So what does this function need to do to be able to accomplish the outcome I need"?  It’ll have to:

  • Accept an Excel Range, either a column or a row, specifying the source data I wish to transpose.
  • Optionally, accept a Boolean flag to switch the transposition from being by column to being by row.
  • Exit if the function was not called as an Array Formula in Excel (If you’re unfamiliar with Array Formula, you can find a primer on them here)
  • Perform validation on the source data.
  • Initialize and fill an output object array.

The Code

In my sample project, you’ll see a file called TransposeByUDF.cs.  This is just a standard class file which inherits the XlCall class in ExcelDNA.Integration and references the Namespace ExcelDna.Integration.  I’ve also referenced the Namespace ExcelDna.Intellisense so that users can get some additional information on the function in Excel but it’s not necessary for a simple function.

  1. usingIntegration;    
  2. usingIntelliSense;    
  3. namespaceTransposeBy    
  4. {    
  5.    public class TransposeByUDF : XlCall    
  6.    {   

 The next step is to create a static function that returns an object and to attach the following directive to the function.  The ExcelFunction directive contains a number of options but I’ve only used the description option to give Excel users some information on the function.  I’m returning an object as my function will be filling in a cell range but you can return strings, numbers etc. as appropriate.  In the function declaration, I’ve also added a directive to the two parameters to give readable parameter names and descriptions to the Excel users.

  1. [ExcelFunction(Description = "Transpose a range of values breaking every n number of rows/columns.")]    
  2. publicstatic object TransposeBy(    
  3.    [ExcelArgument(Name = "SourceData", Description = "The range of cells to be transposed.")] objectoSource,    
  4. [ExcelArgument(Name = "ByRow", Description = "Optional flag to force transposing vertically insted of the horizontal default.")] [Optional] boolbByRow    
  5. )    
  6. {   

Now, that the function is all set up, you can use the many objects that Excel-DNA exposes to interact with the parent Excel application.  Documentation in the Excel-DNA project isn’t perfect but there are extensive example projects and an archive of the Google Groups forum that can be searched.  For my function, I first needed to get the range of cells that was selected when the function was called and this was made available to me thru the XlCall.Excel object.  This object accepts an Enum as a parameter that allows you to define what type of information you want to retrieve.  In my case, I’ve used the value xlfCaller to reference the destination cell range. 

  1. var oCaller = Excel(xlfCaller) asExcelReference;    
  2. if(oCaller == null)    
  3. {    
  4.    returnnew object[0, 0];    
  5. }   

 If the range is null then I just return a blank object but I can also return Excel specific errors like #REF or #VALUE.  In the next couple of lines in the function I test to see if the destination range that was passed is just a single row or column of data and that the function was called as an array function.  If it fails either of those tests then I throw back some Excel specific errors.  I also wrapped the entire function in a try..catch block that if triggered throws back a #REF error to Excel.

  1. // Test that the source is a single column or row of values, the destination is an array function etc.  
  2. if(oCaller.RowFirst == oCaller.RowLast && oCaller.ColumnFirst == oCaller.ColumnLast) { returnExcelErrorRef; }  // Formula has not been entered as an Array formula    
  3. if(((System.Array)oSource).GetLength(0) > 1 && ((System.Array)oSource).GetLength(1) > 1) { returnExcelErrorValue; } // Source data is not a single column or row    

 As the function is called from Excel as an array formula I have to return an array object that matches the dimensions of the range of cells that were selected when the function was called.  Whilst the exact dimensions aren’t available in the ExcelReference object I created, the top and bottom row and column references are so a simple calculation can be done to get the required sizes.

  1. // Initialise the output result array  
  2. object[, ] oResult = newobject[(oCaller.RowLast - oCaller.RowFirst) + 1, (oCaller.ColumnLast - oCaller.ColumnFirst) + 1];  
  3. Fill("");  

 To make life easier for me I added a separate class file to the project so I could extend object arrays to have a method .Fill() that auto-initializes each element in the array with a specified parameter.  You should auto-initialize with a blank value to avoid null values being displayed in your cell range and you can find that extension in the MyArrayExtentions.cs file in the project.

The rest of the code in my sample function just deals with transposing the single row or column of data so that each element is filled in from left to right, line by line into the output array.  It’s pretty simple code and doesn’t relate directly to Excel-DNA so I’m not going to go into detail of it here.  The only important line is the one that returns, in my case, the object array at the end.  As mentioned, you can return a string and a numeric as appropriate to your function.

Compiling and Testing

Adding the ExcelDNA.AddIn library from NuGet to your project results in a few changes to the solution.  Firstly, you’ll notice a file with the extension .dna in your project.  In my sample project, it’s called TransposeBy-AddIn.dna and it acts as a configuration file when distributing your custom function.  You probably won’t need to make any changes to it unless you’re using the ExcelDNA.Intellisense library in which case you’ll have to add a line like I did at line 4 in the following code,

  1. <DnaLibrary Name="TransposeBy Add-In" RuntimeVersion="v4.0">    
  2.   <ExternalLibrary Path="TransposeBy.dll" ExplicitExports="false" LoadFromBytes="true" Pack="true" />    
  3.     
  4.   <Reference Path="ExcelDna.IntelliSense.dll" Pack="true" />    
  5.       
  6.   <!--     
  7.        The RuntimeVersion attribute above allows two settings:    
  8.        * RuntimeVersion="v2.0" - for .NET 2.0, 3.0 and 3.5    
  9.        * RuntimeVersion="v4.0" - for .NET 4 and 4.5    
  10.     
  11.        Additional referenced assemblies can be specified by adding 'Reference' tags.     
  12.        These libraries will not be examined and registered with Excel as add-in libraries,     
  13.        but will be packed into the -packed.xll file and loaded at runtime as needed.    
  14.        For example:    
  15.            
  16.        <Reference Path="Another.Library.dll" Pack="true" />    
  17.       
  18.        Excel-DNA also allows the xml for ribbon UI extensions to be specified in the .dna file.    
  19.        See the main Excel-DNA site at http://excel-dna.net for downloads of the full distribution.    
  20.   -->    
  21.     
  22. </DnaLibrary>    

 The second major change you’ll probably see is that the Debug property page in the project's properties now references your installation of Excel (if present) with the .xll of your project referenced as a command line parameter. 

C#

If you have Excel installed, and I recommend that you do if you’re developing add-ins, you’ll be able to debug your custom function just by hitting F5.  Excel will start with your custom function loaded and by using breakpoints you can step thru your code as you develop and debug. 

Deploying

Distributing your custom function can be done by creating an installer and the Excel-DNA project contains a template for creating a WiX-based installer.  This can be found here.  If you just want to quickly install it into your own installation of Excel you can do this by opening the Excel Add-Ins section in Excel Options and browsing to where you stored the .xll file. 

C#