Udai Mathur

Read Word File Content and Its Form Field Values + Interop

Jul 3 2019 4:08 AM
I need to read the content of a Word file and, based on some specific requirements, insert that content into a SQL Server database.

I am using the Microsoft.Office.Interop.Word DLL. I have updated it through Manage NuGet Packages in Visual Studio.

It reads fine when a line contains only plain text. The problem comes when there is a form control inside a table or paragraph.

How can I read the value of a form control (checkbox, textbox, dropdown, date picker) inside a paragraph or table?

I also need a way to read the Word file in document order, like this:

Paragraph1
Paragraph2
Table1
Table2
Paragraph1

Once inside a paragraph or table, I can read the specific control value if one exists, or else read the plain string.
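
A rough sketch of the kind of in-order traversal I have in mind is below. It is untested and rests on assumptions: DocumentWalker and Blocks are hypothetical names, the document has no nested tables, and the indexed-property syntax for Range.Information needs C# 4 or later. It relies on every table cell being a paragraph in doc.Paragraphs, so the first paragraph found inside a table is used to report the whole table once.

```csharp
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

static class DocumentWalker
{
    // Yields ("Paragraph", range) or ("Table", range) pairs in document order.
    public static IEnumerable<KeyValuePair<string, Word.Range>> Blocks(Word.Document doc)
    {
        int lastTableEnd = -1; // end position of the most recently reported table

        foreach (Word.Paragraph p in doc.Paragraphs)
        {
            Word.Range r = p.Range;

            // Paragraphs that belong to a table already reported are skipped.
            if (r.Start < lastTableEnd)
                continue;

            bool inTable = (bool)r.Information[Word.WdInformation.wdWithInTable];
            if (inTable)
            {
                // First paragraph of a new table: report the whole table once.
                Word.Table t = r.Tables[1];
                lastTableEnd = t.Range.End;
                yield return new KeyValuePair<string, Word.Range>("Table", t.Range);
            }
            else
            {
                yield return new KeyValuePair<string, Word.Range>("Paragraph", r);
            }
        }
    }
}
```

Each yielded range could then be checked for controls first, falling back to its plain text when none are found.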
 
Currently I am using the code below:

StringBuilder text = new StringBuilder();
List<Range> TablesRanges = new List<Range>();
string celltext = string.Empty;

Microsoft.Office.Interop.Word.Application wordApp = new Application();
object file = @"D:\Test.docx";

object nullobj = System.Reflection.Missing.Value;

Microsoft.Office.Interop.Word.Document doc = wordApp.Documents.Open(
    ref file, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj);

text.Append("Paragraph Start\n");
foreach (Microsoft.Office.Interop.Word.Paragraph paragraph in doc.Paragraphs)
{
    if (!string.IsNullOrEmpty(paragraph.Range.Text) && !paragraph.Range.Text.Contains("\r\a"))
    {
        string toWrite = paragraph.Range.Text;
        text.Append(toWrite.Replace('\r', ' ').Replace('\n', ' ').Replace('\a', ' ') + " \n"); 

        foreach (Field field in doc.Fields)
        {
            var j = field.Result.ContentControls.GetType();

            // Here I want to check the control type first, then get its value.
            if (field.Result != null)
            {
                var m = field.Result.Text;
            }
        }
    }
}
text.Append("Paragraph End\n");

text.Append("Table Start\n");
foreach (Microsoft.Office.Interop.Word.Table tb in doc.Tables)
{
    for (int row = 1; row <= tb.Rows.Count; row++)
    {
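        // Probe up to 20 columns and stop at the first cell that does not exist
        // (tb.Cell throws for a missing cell).
        for (int index = 1; index <= 20; index++)
        {
            try
            {
                var cell = tb.Cell(row, index);
                celltext += cell.Range.Text + " ";
            }
            catch (Exception)
            {
                break;
            }
        }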

        text.Append(celltext.Replace('\r', ' ').Replace('\n', ' ').Replace('\a', ' ') + " \n");
        celltext = "";               
    }

    text.Append("\n");
}
text.Append("Table End");

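ReadContent(text.ToString());
text.Clear();

// Close the document without saving and quit Word so that no WINWORD.EXE process is left running.
object saveChanges = WdSaveOptions.wdDoNotSaveChanges;
doc.Close(ref saveChanges, ref nullobj, ref nullobj);
wordApp.Quit(ref nullobj, ref nullobj, ref nullobj);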
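
For the control values themselves, this is a minimal, untested sketch of the direction I think the check could take. It assumes the controls are either content controls (Range.ContentControls) or legacy form fields (Range.FormFields); ActiveX controls are not covered. ControlReader, Clean and ReadControls are hypothetical names, not part of the Interop library.

```csharp
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

static class ControlReader
{
    // Replaces Word's paragraph (\r), line (\n) and end-of-cell (\a) markers with spaces, then trims.
    public static string Clean(string s)
    {
        return (s ?? string.Empty)
            .Replace('\r', ' ').Replace('\n', ' ').Replace('\a', ' ').Trim();
    }

    // Returns one "name = value" line per control found inside the given range.
    public static List<string> ReadControls(Word.Range range)
    {
        var values = new List<string>();

        // Content controls: check the type first, then read the value.
        foreach (Word.ContentControl cc in range.ContentControls)
        {
            string value;
            if (cc.Type == Word.WdContentControlType.wdContentControlCheckBox)
                value = cc.Checked.ToString();
            else if (cc.ShowingPlaceholderText)
                value = string.Empty; // nothing entered yet
            else
                value = Clean(cc.Range.Text); // text, date picker, combo box, drop-down list

            string name = string.IsNullOrEmpty(cc.Title) ? cc.Tag : cc.Title;
            values.Add(name + " = " + value);
        }

        // Legacy form fields: text inputs and drop-downs expose their value via Result.
        foreach (Word.FormField ff in range.FormFields)
        {
            string value = ff.Type == Word.WdFieldType.wdFieldFormCheckBox
                ? ff.CheckBox.Value.ToString()
                : ff.Result;
            values.Add(ff.Name + " = " + value);
        }

        return values;
    }
}
```

It could be called with paragraph.Range in the paragraph loop or cell.Range in the table loop, appending the plain range text only when the returned list is empty.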