Spire.Doc is a professional Word .NET library specifically designed for developers to create, read, write, convert and print Word document files. Get free and professional technical support for Spire.Doc for .NET, Java, Android, C++, Python.

Mon Aug 09, 2021 9:31 am

Hi Team,

I am not able to extract the table content from the attached document file. The table is skipped when the content is parsed line by line.

Please have a look and advise.

Thanks in advance.

pr20080798
 
Posts: 159
Joined: Wed Jan 20, 2021 1:15 pm

Mon Aug 09, 2021 10:30 am

Hello,

Thanks for your inquiry.
Please refer to the following code to extract text from the table in your Word file.
Code:
            // Requires: using System.IO; using System.Text;
            //           using Spire.Doc; using Spire.Doc.Documents;
            StringBuilder text = new StringBuilder();
            Document doc = new Document();
            doc.LoadFromFile(@"test_spire.docx");
            Body body = doc.Sections[0].Body;
            // In this file the table is not a direct child of the body;
            // it sits one level down, inside a child of the body.
            foreach (DocumentObject child in body.ChildObjects)
            {
                foreach (DocumentObject child2 in child.ChildObjects)
                {
                    if (child2 is Table table)
                    {
                        foreach (TableRow tableRow in table.Rows)
                        {
                            foreach (TableCell cell in tableRow.Cells)
                            {
                                // A cell can hold several paragraphs; each one is
                                // written out followed by a tab.
                                foreach (DocumentObject documentObject in cell.ChildObjects)
                                {
                                    if (documentObject is Paragraph paragraph)
                                    {
                                        text.Append(paragraph.Text).Append('\t');
                                    }
                                }
                            }
                            // One output line per table row.
                            text.Append('\n');
                        }
                    }
                }
            }
            File.WriteAllText("Extract.txt", text.ToString());

Sincerely,
Brian
E-iceblue support team

Brian.Li
 
Posts: 1271
Joined: Mon Oct 19, 2020 3:04 am

Tue Aug 10, 2021 3:42 am

Thanks Brian for your response.

The code works for this document, but I would like to know how the table ended up among the paragraph's children. Normally the table sits at the same level as the paragraphs, as it does in my other Word documents. Is there a specific way to create a table inside a paragraph in MS Word?

pr20080798
 
Posts: 159
Joined: Wed Jan 20, 2021 1:15 pm

Tue Aug 10, 2021 11:16 am

Hello,

Thanks for your feedback.
In the document you provided, the table is inside a structured document tag (content control), and that tag is a child object of the body (body.ChildObjects[0]). The table is not inside a paragraph (as shown below).
screenshot.png


Besides, tables and paragraphs in Word documents are objects of the same level, so I am sorry to say that there is no way to create a table inside a paragraph.
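To check this kind of nesting without a screenshot, one option is to print the object type of each direct child of the body together with the types of its own children. This is a minimal sketch that uses only members already seen in this thread (Body.ChildObjects, the indexer, DocumentObjectType); the class name BodyStructureDump is made up for illustration.

```csharp
using System;
using Spire.Doc;

static class BodyStructureDump
{
    static void Main()
    {
        Document doc = new Document();
        doc.LoadFromFile(@"test_spire.docx");
        Body body = doc.Sections[0].Body;

        for (int i = 0; i < body.ChildObjects.Count; i++)
        {
            DocumentObject child = body.ChildObjects[i];
            // Type of the i-th direct child of the body.
            Console.WriteLine("body.ChildObjects[" + i + "]: " + child.DocumentObjectType);

            // Types of the objects one level below that child.
            foreach (DocumentObject grandChild in child.ChildObjects)
            {
                Console.WriteLine("    " + grandChild.DocumentObjectType);
            }
        }
    }
}
```

If the first entry is reported as a structured document tag with a table listed beneath it, that agrees with the explanation above: the table is wrapped in a content control, not in a paragraph.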

Sincerely,
Brian
E-iceblue support team

Brian.Li
 
Posts: 1271
Joined: Mon Oct 19, 2020 3:04 am
