Word automation using C#

1. Development Tools Used

  • Microsoft Visual Studio 2005
  • Microsoft Word 2003
  • Programming Language: C#

2.  Word Automation using C#

Word automation through C# means generating Word documents programmatically from C# code. Working in Word interactively is straightforward, but doing the same things from code is a little more intricate. Word automation consists almost entirely of working with objects and reference types. Almost every task that can be performed in Word 2003 can also be done programmatically using C# or VB: inserting a table of contents, linking documents, mail merge, inserting documents, embedding documents, inserting pictures, adding watermarks, and so on.

3. Setting Up Work Environment

The first step is to add references to the Word libraries to the project. Right-click the References folder in the project's Solution Explorer and select Add Reference.

Figure 1.

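Browse through the available COM objects and select Microsoft Office 11.0 Object Library and Microsoft Word 11.0 Object Library. These libraries contain all the types and methods needed to perform the automation.

Note: These libraries are present only if Microsoft Office is installed on the machine.

Also add the required using directives at the top of the source file. The samples below refer to the Word types through the prefix Word (for example, via the alias "using Word = Microsoft.Office.Interop.Word;" when the primary interop assemblies are referenced) and to the shared Office types through Microsoft.Office.Core.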

Figure 2.

Figure 3.

4. Objects Used in Automation

All the methods used in Word automation are reached through either the Word.Application or the Word.Document class.

Suppose we want to create a document using the Word application. We would go through the following steps:

  1. Open the Word application. (Opening Word interactively creates a new document by default, but in automation we need to add a document explicitly.)
  2. Add a New document.
  3. Edit the document.
  4. Save it.

The same steps need to be done programmatically. Word.Application and Word.Document are used to open Word and add a new document to it.

4.1 Word.Application

This represents the Word application itself, without any document loaded in it. It acts as the root object that is needed before a new document can be created. Creating a new instance of Word.Application can be visualized as below.

Figure 4.

4.2 Word.Document

To add a new document, we declare a Word.Document variable and assign it the document returned by the Documents.Add method of the Word.Application object.

// Missing.Value is a static field (not a method); it stands for an omitted optional argument.
Object oMissing = System.Reflection.Missing.Value;
Object oTrue = true;
Object oFalse = false;
Word.Application oWord = new Word.Application();
// The document is created by Documents.Add, so the variable is only declared here.
Word.Document oWordDoc;
oWord.Visible = true;
oWordDoc = oWord.Documents.Add(ref oMissing, ref oMissing, ref oMissing, ref oMissing);

This triggers the following operation in the Word Application


Figure 5.

Approaches to Perform Automation

  1. We can either have a base template (.dot) file and open the base template file and work on it.
  2. We can otherwise build a word document from scratch. 

4.3 Standard Input Parameters

Most of the methods take their input parameters by reference (ref object), and the values passed are mostly true, false, or missing (System.Reflection.Missing.Value, which tells Word to use the parameter's default). Many of the methods have a large number of parameters (some have more than 10), and in most calls nearly all of them are true, false, or missing. So instead of creating a separate argument for each parameter, we can declare one variable for each of these values and pass the same variable wherever that value is needed.
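The idea of reusing the same few variables for every by-reference argument can be tried without Word installed. The sketch below is only an illustration: CountSupplied is a hypothetical stand-in for a Word method with many optional parameters, not part of the Word object model.

```csharp
using System;
using System.Reflection;

public static class SharedArguments
{
    // Hypothetical stand-in for a Word method that takes several optional
    // arguments by reference. It reports how many were actually supplied.
    public static int CountSupplied(ref object a, ref object b, ref object c, ref object d)
    {
        int supplied = 0;
        foreach (object arg in new object[] { a, b, c, d })
        {
            // Missing.Value marks an argument the caller chose to omit.
            if (!(arg is Missing))
            {
                supplied++;
            }
        }
        return supplied;
    }

    public static int Demo()
    {
        object oMissing = Missing.Value;
        object oTrue = true;
        // The same oMissing variable is passed for every omitted argument.
        return CountSupplied(ref oTrue, ref oMissing, ref oMissing, ref oMissing);
    }
}
```

Here Demo() supplies one real value and reuses oMissing for the other three arguments, which is the same pattern the Word calls in this article follow.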

4.3.1 Range Object

When working in the Word application, if we want to type some text on the 11th line, we move the cursor to that line, click, and start typing. To do the same in code, we use a Word.Range object. A range obtained from the Word.Document object identifies a location (or a span of content) in the document, much as the cursor does.

There are many ways to point to a specific location in a document. I have used bookmarks extensively, as I work on automation using a base template. In this approach, we insert bookmarks in the base template, locate them programmatically, set the range on them, and insert text or documents at that location. There are also many other ways to set the range.

Object oBookMarkName = "My_Inserted_Bookmark_On_Template";
// Range.Select() returns nothing, so take the range first and then select it.
Word.Range wrdRange = oWordDoc.Bookmarks.get_Item(ref oBookMarkName).Range;
wrdRange.Select();

4.3.2 Selection Object

When working in Word, we select content by clicking and dragging the mouse pointer across it. The content can be text, formatted text, tables, or any other item in the document. In code, the same thing is represented by the Selection object (Word.Selection). In the previous range example, we locate a bookmark, set the range on it, and select it. The Selection object now represents that location, as if the cursor had been placed on the bookmark. A span of text can be selected by building a range between two positions. The selected range can then be copied, deleted, or formatted.

4.3.3 Selecting Between Bookmarks

// Build a range from the start of the first bookmark to the start of the second one.
Object oBookmarkStart = "BookMark__Start";
Object oRngoBookMarkStart = oWordDoc.Bookmarks.get_Item(ref oBookmarkStart).Range.Start;
Object oBookmarkEnd = "BookMark__End";
Object oRngoBookMarkEnd = oWordDoc.Bookmarks.get_Item(ref oBookmarkEnd).Range.Start;
Word.Range rngBKMarkSelection = oWordDoc.Range(ref oRngoBookMarkStart, ref oRngoBookMarkEnd);
// Delete everything between the two bookmarks.
rngBKMarkSelection.Delete(ref oMissing, ref oMissing);

5. Automation using a Base Template

The base template method is preferable, as it gives us much more flexibility in performing the automation and is very handy for mail merge.

In the base template method, we pass the path of the .dot file when we call the Documents.Add method of the Application object.

Object oTemplatePath = "C:\\Program Files\\MyTemplate.dot";  
oWordDoc = oWord.Documents.Add(ref oTemplatePath, ref oMissing, ref oMissing, ref oMissing);  

A new document based on the .dot file is now open, and when we save the generated document, we save it as a new file.

6. Mail Merge

Mail merge is useful in scenarios where we want to generate many similar documents in which only a few fields change. For instance, a pay slip has a base template, and only the employee name, number, and pay details change for each employee. For this we can use a base template, which is a Word file saved as a Document Template (.dot) file.

In the .dot file, insert a merge field manually by placing the cursor at the required position, choosing Insert -> Field, and selecting "MergeField" under Field names. The merge field is then displayed as <<FieldName>>. The template can look like this:

Contact Information

For further information and discussions, please contact:

Name: <<CIFLName>>
Address: <<CIAddress>>
Phone:  <<CIPhW>> (Work)
Fax:      <<CIFax>>
Email    <<CIMail>>

When replacing the merge fields from code, note that the document contains many fields by default. The fields inserted by the user, however, have a field code with a known prefix (MERGEFIELD) and suffix (\* MERGEFORMAT), which can be used to identify them and replace them.

Object oMissing = System.Reflection.Missing.Value;
Object oTrue = true;
Object oFalse = false;
Word.Application oWord = new Word.Application();
Word.Document oWordDoc;
oWord.Visible = true;
Object oTemplatePath = "C:\\Program Files\\MyTemplate.dot";
oWordDoc = oWord.Documents.Add(ref oTemplatePath, ref oMissing, ref oMissing, ref oMissing);
foreach (Word.Field myMergeField in oWordDoc.Fields)
{
    Word.Range rngFieldCode = myMergeField.Code;
    String fieldText = rngFieldCode.Text;
    // A merge field code looks like: " MERGEFIELD  MyFieldName  \* MERGEFORMAT "
    if (fieldText.StartsWith(" MERGEFIELD"))
    {
        // The field name sits between the keyword and the first backslash switch.
        Int32 endMerge = fieldText.IndexOf("\\");
        if (endMerge < 0) endMerge = fieldText.Length; // no switch present
        String fieldName = fieldText.Substring(11, endMerge - 11);
        fieldName = fieldName.Trim();
        if (fieldName == "MyField")
        {
            // Select the field so that the typed text replaces it.
            myMergeField.Select();
            oWord.Selection.TypeText("This Text Replaces the Field in the Template");
        }
    }
}
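The field-name extraction used in the loop above is plain string handling, so it can be isolated and checked without Word. The following is a minimal sketch; the class and method names (MergeFieldCode, GetFieldName) are made up for this illustration, and the input is assumed to be the text of a field's Code range.

```csharp
using System;

public static class MergeFieldCode
{
    // Returns the field name from a code such as
    // " MERGEFIELD  MyFieldName  \* MERGEFORMAT ",
    // or null when the code is not a merge field.
    public static string GetFieldName(string fieldCode)
    {
        if (fieldCode == null)
        {
            return null;
        }

        const string keyword = "MERGEFIELD";
        string text = fieldCode.Trim();
        if (!text.StartsWith(keyword, StringComparison.OrdinalIgnoreCase))
        {
            return null;
        }

        // Drop the keyword, then cut at the first backslash switch, if any.
        text = text.Substring(keyword.Length);
        int switchStart = text.IndexOf('\\');
        if (switchStart >= 0)
        {
            text = text.Substring(0, switchStart);
        }

        return text.Trim();
    }
}
```

With such a helper, the body of the loop reduces to comparing GetFieldName(myMergeField.Code.Text) against the names used in the template, such as CIFLName or CIFax.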

MSDN describes another method for replacing the merge fields, which is rather memory hungry. In that method, a separate document is opened and a table is inserted into it, with the merge field names in the first row and the replacement values in the second row. The values from the table are then matched against the fields in the original document, the replacement takes place, and the second document is discarded.

7. Embedding a Document

In the application, a document is embedded by choosing Insert -> Object -> Create from File, selecting the file, and checking Display as icon. This embeds the file at the selected location as an icon, and the user can double-click the icon to open the file. The same can be done through automation.

The range has to be set at the required place (by any of the means mentioned above). The file can then be embedded at that range.

Object oIconLabel = "File Name";  
Object oIconFileName = "C:\\Document and Settings\\IconFile.ico";  
Object oBookMark = "My_Custom_BookMark";  
Object oFileDesignInfo = "C:\\Document and Settings\\somefile.doc";  
Object oClassType = "Word.Document.8";  
Object oTrue = true;  
Object oFalse = false;  
Object oMissing = System.Reflection.Missing.Value;  
oWordDoc.Bookmarks.get_Item(ref oBookMark).Range.InlineShapes.AddOLEObject(  
    ref oClassType,ref oFileDesignInfo,ref oFalse, ref oTrue, ref oIconFileName,  
    ref oMissing,ref oIconLabel, ref oMissing);

8. Inserting a Document File

The contents of a Word document can also be inserted into the current document. In the application, this is done as follows.

Insert -> File -> Select the File. This extracts the contents of the selected file and inserts them into the current document.

In automation, we follow a similar approach: we set the range at the required point and then insert the file there.

String oFilePath = "C:\\Document and Settings\\somefile.doc";  
oWordDoc.Bookmarks.get_Item(ref oBookMark).Range.InsertFile(oFilePath,ref oMissing, ref oFalse, ref oFalse, ref oFalse);  

9. Including Water Marks/Pictures in the Document Background

Watermarks are another important feature for official documents, as a watermark may carry the company's logo, a draft mark, or any other picture or text. They are useful when we want a picture or some text to appear in the background throughout the document.

We insert a watermark in the application by performing the following tasks.

Format -> Background -> Printed Watermarks

The same can also be done programmatically. Moreover, since we set values such as the angle of tilt and the position of the watermark ourselves, we have more flexibility in defining its exact location.

9.1 Embedding Pictures in Document Header

oWord.ActiveWindow.ActivePane.View.SeekView = Word.WdSeekView.wdSeekCurrentPageHeader;  
Word.Shape logoCustom = null;  
String logoPath = "C:\\Document and Settings\\MyLogo.jpg";  
logoCustom = oWord.Selection.HeaderFooter.Shapes.AddPicture(logoPath,  
    ref oFalse, ref oTrue, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing);  
logoCustom.Select(ref oMissing);  
logoCustom.Name = "CustomLogo";  
logoCustom.Left = (float)Word.WdShapePosition.wdShapeLeft;  
oWord.ActiveWindow.ActivePane.View.SeekView = Word.WdSeekView.wdSeekMainDocument;   

9.2 Inserting Text in the Centre of the Document as Water Mark

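// Move the selection into the page header so that Selection.HeaderFooter is valid.
oWord.ActiveWindow.ActivePane.View.SeekView = Word.WdSeekView.wdSeekCurrentPageHeader;
Word.Shape logoWatermark = null;
// AddTextEffect also needs a preset effect and bold/italic flags; the values used
// here (msoTextEffect1, bold on, italic off) are assumed for illustration.
logoWatermark = oWord.Selection.HeaderFooter.Shapes.AddTextEffect(
    Microsoft.Office.Core.MsoPresetTextEffect.msoTextEffect1,
    "Enter The Text Here", "Arial", (float)60,
    Microsoft.Office.Core.MsoTriState.msoTrue,
    Microsoft.Office.Core.MsoTriState.msoFalse,
    0, 0, ref oMissing);
logoWatermark.Select(ref oMissing);
logoWatermark.Fill.Visible = Microsoft.Office.Core.MsoTriState.msoTrue;
logoWatermark.Line.Visible = Microsoft.Office.Core.MsoTriState.msoFalse;
logoWatermark.Fill.ForeColor.RGB = (Int32)Word.WdColor.wdColorGray30;
// Position the shape relative to the page margins and centre it.
logoWatermark.RelativeHorizontalPosition = Word.WdRelativeHorizontalPosition.wdRelativeHorizontalPositionMargin;
logoWatermark.RelativeVerticalPosition = Word.WdRelativeVerticalPosition.wdRelativeVerticalPositionMargin;
logoWatermark.Left = (float)Word.WdShapePosition.wdShapeCenter;
logoWatermark.Top = (float)Word.WdShapePosition.wdShapeCenter;
logoWatermark.Height = oWord.InchesToPoints(2.4f);
logoWatermark.Width = oWord.InchesToPoints(6f);
// Return the selection to the main document.
oWord.ActiveWindow.ActivePane.View.SeekView = Word.WdSeekView.wdSeekMainDocument;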

9.3 Inserting Text in the Centre of Page, and rotating it by 90 Degrees

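// Assumes the selection is currently inside a page header (see SeekView in 9.1).
Word.Shape midRightText;
// The preset effect and bold/italic flags are assumed values for illustration.
midRightText = oWord.Selection.HeaderFooter.Shapes.AddTextEffect(
    Microsoft.Office.Core.MsoPresetTextEffect.msoTextEffect1,
    "Text Goes Here", "Arial", (float)10,
    Microsoft.Office.Core.MsoTriState.msoTrue,
    Microsoft.Office.Core.MsoTriState.msoFalse,
    0, 0, ref oMissing);
midRightText.Select(ref oMissing);
midRightText.Name = "PowerPlusWaterMarkObject2";
midRightText.Fill.Visible = Microsoft.Office.Core.MsoTriState.msoTrue;
midRightText.Line.Visible = Microsoft.Office.Core.MsoTriState.msoFalse;
midRightText.Fill.ForeColor.RGB = (int)Word.WdColor.wdColorGray375;
midRightText.Rotation = (float)90;
// The right-hand sides of the next two assignments were missing in the listing;
// margin-relative positioning (as in 9.2) is assumed.
midRightText.RelativeHorizontalPosition = Word.WdRelativeHorizontalPosition.wdRelativeHorizontalPositionMargin;
midRightText.RelativeVerticalPosition = Word.WdRelativeVerticalPosition.wdRelativeVerticalPositionMargin;
midRightText.Top = (float)Word.WdShapePosition.wdShapeCenter;
midRightText.Left = (float)480;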

10. Including Page Numbers in Page Footer

Including auto-generated page numbers in the Footer is yet another useful feature which can be simulated in the code.

oWord.ActiveWindow.ActivePane.View.SeekView = Word.WdSeekView.wdSeekCurrentPageFooter;  
String docNumber = "1";  
String revisionNumber = "0";  
oWord.Selection.Paragraphs.Alignment = Word.WdParagraphAlignment.wdAlignParagraphLeft;  
oWord.ActiveWindow.Selection.Font.Name = "Arial";  
oWord.ActiveWindow.Selection.Font.Size = 8;  
oWord.ActiveWindow.Selection.TypeText("Document #: " + docNumber + " - Revision #: " + revisionNumber);  
oWord.ActiveWindow.Selection.TypeText("Page ");  
Object CurrentPage = Word.WdFieldType.wdFieldPage;  
oWord.ActiveWindow.Selection.Fields.Add(oWord.Selection.Range, ref CurrentPage, ref oMissing, ref oMissing);  
oWord.ActiveWindow.Selection.TypeText(" of ");  
Object TotalPages = Word.WdFieldType.wdFieldNumPages;  
oWord.ActiveWindow.Selection.Fields.Add(oWord.Selection.Range, ref TotalPages, ref oMissing, ref oMissing);  
oWord.ActiveWindow.ActivePane.View.SeekView = Word.WdSeekView.wdSeekMainDocument;  

11. Basic Text Formatting Options

11.1 Paragraph Break

This is equivalent to pressing the Enter key in the document.
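The listing for this step is not shown here. As a minimal sketch, a paragraph break can be produced with the TypeParagraph method of the Selection object. The wrapper class and method names below are made up for this illustration, and the Word alias assumes the primary interop assemblies are referenced.

```csharp
using Word = Microsoft.Office.Interop.Word;

public static class ParagraphBreakSketch
{
    // Types two paragraphs at the current selection of a running Word instance.
    public static void TypeTwoParagraphs(Word.Application oWord)
    {
        oWord.Selection.TypeText("First paragraph.");
        // Same effect as pressing Enter at the cursor position.
        oWord.Selection.TypeParagraph();
        oWord.Selection.TypeText("Second paragraph.");
    }
}
```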


11.2 Text Formatting Option

All the text formatting options available in the Word Application can also be replicated through automation.

oWord.Selection.Font.Bold = 1;  
oWord.Selection.Font.Color = Word.WdColor.wdColorAqua;  
oWord.Selection.Font.Italic = 1;  
oWord.Selection.Font.Underline = Word.WdUnderline.wdUnderlineDashHeavy;  

11.3 Clear Formatting

When formatting is applied to a selection, the same formatting is carried over to the following lines. To clear it, the next line needs to be selected and the ClearFormatting() method called.
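The listing for this step is not shown here. The following sketch shows one way the sequence could look, assuming the text is typed through the Selection object as in the earlier examples. The wrapper class and method names are made up for this illustration.

```csharp
using Word = Microsoft.Office.Interop.Word;

public static class ClearFormattingSketch
{
    // Types a bold line followed by a line with the formatting cleared.
    public static void TypeBoldThenPlain(Word.Application oWord)
    {
        oWord.Selection.Font.Bold = 1;
        oWord.Selection.TypeText("Bold heading text");
        oWord.Selection.TypeParagraph();
        // Without this call the new line would inherit the bold formatting.
        oWord.Selection.ClearFormatting();
        oWord.Selection.TypeText("Plain body text");
    }
}
```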


12. Table of Contents

Table of Contents is very handy when it comes to official documents or some technical papers which span across many pages. Table of contents can be inserted and updated on the fly as the document gets built.

For the Table of Contents to be generated automatically without any trouble, it is vital that the headings, sub-headings, and body text have their respective attributes set. When we work in the application, these values are set automatically, and we only need to edit them if required. When programming, however, it is mandatory to set the values in code, to prevent anomalies when the Table of Contents is updated.

Below is an example of a document which was programmatically generated.

Figure 6.

It is apparent that the Header 2 and Header 3 and Body are formatted differently and even in the Table of Contents the Header 2 is slightly offset from the Header 1.

Open the above document and display the Outlining toolbar (View -> Toolbars -> Outlining). On moving the cursor to Sample Header 2, we can see that the format is Heading 2 and the outline level is Level 2.

Figure 7.

And for Body, the Format is Normal + Arial, 10 pt and Outlining Level is Body text.

Figure 8.

The same values need to be set programmatically for the Table of Contents to be generated.

12.1 Section Format

To set the format of the selection, select the entire text (for example, by selecting between bookmarks, as described in the Selection section) and set the style to the required value:

Object styleHeading2 = "Heading 2";  
Object styleHeading3 = "Heading 3";  
oWord.Selection.Range.set_Style(ref styleHeading2);  
oWord.Selection.Range.set_Style(ref styleHeading3);  

12.2 Outline Level

To set the outline level, select the contents and set it to one of the values below:

oWord.Selection.Paragraphs.OutlineLevel = Word.WdOutlineLevel.wdOutlineLevel2;
oWord.Selection.Paragraphs.OutlineLevel = Word.WdOutlineLevel.wdOutlineLevel3;
oWord.Selection.Paragraphs.OutlineLevel = Word.WdOutlineLevel.wdOutlineLevelBodyText;

12.3 Inserting Table of Contents

Once the outline levels and section styles are set, the Table of Contents can be inserted programmatically, and the page numbers are populated automatically based on the outline levels and section styles set in the code. (Also refer to this MSDN link.)

Object oBookmarkTOC = "Bookmark_TOC";  
Word.Range rngTOC = oWordDoc.Bookmarks.get_Item(ref oBookmarkTOC).Range;  
Object oUpperHeadingLevel = "1";  
Object oLowerHeadingLevel = "3";  
Object oTOCTableID = "TableOfContents";  
oWordDoc.TablesOfContents.Add(rngTOC, ref oTrue, ref oUpperHeadingLevel,  
    ref oLowerHeadingLevel,ref oMissing, ref oTOCTableID, ref oTrue,  
    ref oTrue, ref oMissing, ref oTrue, ref oTrue, ref oTrue);  

12.4 Updating Table of Contents

Usually the Table of Contents is inserted at the beginning of the document generation, and once all the contents are populated, the locations of the headings and sub-headings tend to change. If the Table of Contents is not updated, its entries point to the wrong pages. To avoid this, the Table of Contents needs to be updated at the end of the automation.
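The listing for this step is not shown here. As a minimal sketch, every table of contents in the document can be refreshed through its Update method once all the content is in place. The wrapper class and method names are made up for this illustration.

```csharp
using Word = Microsoft.Office.Interop.Word;

public static class TocUpdateSketch
{
    // Refreshes the entries and page numbers of each table of contents.
    public static void UpdateAllTablesOfContents(Word.Document oWordDoc)
    {
        foreach (Word.TableOfContents toc in oWordDoc.TablesOfContents)
        {
            toc.Update();
        }
    }
}
```

Calling this just before saving the document keeps the entries in step with the final page layout.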


13. Saving/Closing & Re-Opening the File

13.1 Saving the File

Object oSaveAsFile = (Object)"C:\\SampleDoc.doc";  
oWordDoc.SaveAs(ref oSaveAsFile, ref oMissing, ref oMissing, ref oMissing,  
    ref oMissing, ref oMissing,ref oMissing, ref oMissing, ref oMissing,  
    ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing,  
    ref oMissing, ref oMissing);  

13.2 Closing the File

oWordDoc.Close(ref oFalse, ref oMissing, ref oMissing);  
oWord.Quit(ref oMissing, ref oMissing, ref oMissing);  

13.3 Re-Opening the File

The Open() method in the Word 2003 library might throw an exception if the client has a different version of Word installed on the machine. Each version has its own variant of the method: Open2000() for Word 2000, Open2002() for Word 2002, and so on. If the client has Word 2002, the file has to be opened with Open2002(), because the Word 2003 Open() might throw an exception in a Word 2002 environment. So it is wise to put the call to Open() in a try-catch block, as shown below.

Figure 10.
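The fallback described above might be sketched as follows. This is an illustration under stated assumptions, not the code from the figure: the wrapper class and method names are made up, the argument counts (16 for Open, 15 for Open2002) are those of the Word 11.0 interop library, and because the article does not say which exception type is thrown, the sketch catches the general Exception.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

public static class ReopenSketch
{
    // Tries the Word 2003 Open() first and falls back to Open2002().
    public static Word.Document OpenDocument(Word.Application oWord, string path)
    {
        object oMissing = System.Reflection.Missing.Value;
        object oFileName = path;
        try
        {
            // Word 2003 variant: 16 arguments.
            return oWord.Documents.Open(
                ref oFileName, ref oMissing, ref oMissing, ref oMissing,
                ref oMissing, ref oMissing, ref oMissing, ref oMissing,
                ref oMissing, ref oMissing, ref oMissing, ref oMissing,
                ref oMissing, ref oMissing, ref oMissing, ref oMissing);
        }
        catch (Exception)
        {
            // Word 2002 variant: 15 arguments (exception type assumed, see above).
            return oWord.Documents.Open2002(
                ref oFileName, ref oMissing, ref oMissing, ref oMissing, ref oMissing,
                ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing,
                ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing);
        }
    }
}
```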

14. Tips for Word Automation to Create New Document (Non-Base Template Approach)

(Refer to this MSDN link)

When we create a new document without using a base template, the most useful entity is the predefined bookmark \endofdoc. This is a build-from-scratch approach: the programmer starts the automation by inserting the first section of contents, then sets the range to the \endofdoc bookmark and inserts the next section there. After that, \endofdoc points to the new end of the document, after both sections, and the process is repeated.
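A minimal sketch of this append-at-the-end pattern is shown below. The wrapper class and method names are made up for this illustration; "\endofdoc" is Word's predefined bookmark for the end of the document.

```csharp
using Word = Microsoft.Office.Interop.Word;

public static class EndOfDocSketch
{
    // Appends a block of text as a new paragraph at the end of the document.
    public static void AppendSection(Word.Document oWordDoc, string text)
    {
        // Predefined bookmark; the backslash is part of its name.
        object oEndOfDoc = "\\endofdoc";
        Word.Range rng = oWordDoc.Bookmarks.get_Item(ref oEndOfDoc).Range;
        rng.InsertAfter(text);
        rng.InsertParagraphAfter();
    }
}
```

Each call looks the bookmark up again, so successive sections are always added after the content inserted so far.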
