How to manipulate Word images programmatically
Working with images in Microsoft Word documents programmatically used to be awkward. With the Aspose.Words .NET component, manipulating the images in a Word document has become much easier.
In one of my projects, the user supplied a Word DOC file, and I had to extract every image in it and save each one as a file in a folder. The goal was to give the user instant access to all the images in the document without opening it in Microsoft Word, so that the images could be opened and edited in any image viewer.
The first problem was extracting all the images contained in the document. I had to use the component from C#, so all the code discussed here is in C#.
Step 1: Create a new C# project
To create a new project, choose File --> New --> Project from the main menu.
A dialog appears with several options. First select a project type on the left side of the dialog: either Visual Basic Projects or Visual C# Projects, depending on the language you plan to use. This tutorial uses C#, so choose Visual C# Projects.
After selecting a type, choose a template on the right side. You can pick Windows Application, ASP.NET Web Application or any other template that suits the application you want to build. This tutorial uses the ConsoleApplication template, so select that one.
When you create a project from the ConsoleApplication template, VS.NET adds a sample source file by default, so you can build the new project right away.
Step 2: Add a reference to the Aspose.Words assembly in the project
The Add Reference dialog box can be used to add project references. This dialog box can be accessed from the Project menu.
To add the Aspose.Words reference to the project:
- In Solution Explorer, select the project.
- On the Project menu, choose Add Reference.
The Add Reference dialog box opens.
- On the .NET tab, select the Aspose.Words component.
- Click OK once the Aspose.Words component is selected.
The Aspose.Words reference now appears under the References node of the project.
Step 3: Open an existing DOC document
The Document class is central to the Aspose.Words library and can load documents in many formats. The format I had to read was Microsoft Office DOC. I kept the folder path in a string variable, ImageFilePath, appended the file name to it, and passed the resulting full path to the Document constructor. The following lines of code read the file.
//open an existing DOC document using the Document object class
string ImageFilePath = "c:\\imagefolder";
Document doc = new Document(ImageFilePath + "\\ImageFile.doc");
Step 4: Access the images in the document
Next I had to access the images contained in the doc object. The Document object follows a Document Object Model (DOM), so getting at the images was fairly easy: call the GetChildNodes method on the document tree and ask it for the nodes of type Shape. Passing true as the second argument makes the search cover the whole tree rather than only the direct children. The NodeCollection class represents a collection of nodes of a specific type.
//It gives a collection of all shape nodes in the tree
NodeCollection shapes = doc.GetChildNodes(NodeType.Shape, true, false);
Step 5: Iterate through Node Collections
Finally I had to iterate through the node collection and save each image to a file. Here is the code for doing it.
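int imageIndex = 0;
foreach (Shape shape in shapes)
{
    if (!shape.HasImage) continue; // skip shapes without a picture, such as text boxes
    String name = "DocumentImage" + "_" + imageIndex.ToString() + ".bmp";
    shape.ImageData.Save(ImageFilePath + "\\" + name);
    imageIndex++; // give each image a unique file name
}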
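Putting steps 3 to 5 together, the whole extraction could look roughly like the sketch below. It is only a sketch under a few assumptions: the folder c:\imagefolder and the file ImageFile.doc are the sample names used above, the class name ExtractImages is made up for illustration, and the two-argument GetChildNodes overload is the one found in recent Aspose.Words releases (older releases take a third argument, as in step 4). Because ImageData.Save writes the image bytes in the format they are stored in, the sketch asks FileFormatUtil.ImageTypeToExtension for a matching extension rather than always using .bmp; check that this helper is available in the version you reference.

```csharp
using System;
using System.IO;
using Aspose.Words;
using Aspose.Words.Drawing;

class ExtractImages // hypothetical class name, for illustration only
{
    static void Main()
    {
        // Sample folder and file name taken from the steps above.
        string imageFolder = @"c:\imagefolder";
        Document doc = new Document(Path.Combine(imageFolder, "ImageFile.doc"));

        // Collect every Shape node in the document tree (true = search recursively).
        NodeCollection shapes = doc.GetChildNodes(NodeType.Shape, true);

        int imageIndex = 0;
        foreach (Shape shape in shapes)
        {
            // Shapes such as text boxes carry no picture, so skip them.
            if (!shape.HasImage)
                continue;

            // Derive the extension from the stored image type instead of assuming .bmp.
            string extension = FileFormatUtil.ImageTypeToExtension(shape.ImageData.ImageType);
            string name = "DocumentImage_" + imageIndex + extension;

            shape.ImageData.Save(Path.Combine(imageFolder, name));
            imageIndex++;
        }

        Console.WriteLine("Saved " + imageIndex + " image(s) to " + imageFolder);
    }
}
```

Path.Combine is used here in place of manual string concatenation so that a missing or doubled backslash in the folder path does not break the file names.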