Spire.Doc is a professional Word .NET library specifically designed for developers to create, read, write, convert and print Word document files. Get free and professional technical support for Spire.Doc for .NET, Java, Android, C++, Python.

Thu Mar 07, 2019 10:48 pm

Hi,
I have a Word document with a TOC, and I would like to get the content of a section (along with its text styles) by section title or ID.

Could you please help with this?

For example, from the attached document I want to get the entire content of a particular section by its name, in HTML format, with the text styles as they appear in the Word document.
That is, get the content of the section titled "1 Title".
Thank you.

portal.support@minterellison.co.nz
 
Posts: 8
Joined: Mon Feb 25, 2019 8:13 pm

Fri Mar 08, 2019 2:16 am

Hello,

Thanks for your inquiry.
I couldn't find the attached document in this post. Could you please share it again, or send it to our email (support@e-iceblue.com), so that we can look into your case and provide a solution accordingly? Thanks in advance.

Sincerely,
Nina
E-iceblue support team

Nina.Tang
 
Posts: 1187
Joined: Tue Sep 27, 2016 1:06 am

Fri Mar 08, 2019 3:05 am

Hi,
Please find the attached file.
What I am looking for is this: based on a TOC title (which I know), get all the content of that section along with its font styles (rich text), so that I can store it in my SQL database. Thank you.

portal.support@minterellison.co.nz
 
Posts: 8
Joined: Mon Feb 25, 2019 8:13 pm

Fri Mar 08, 2019 3:24 am

Hi Nina,

To be clear about what I need from the attached document:
1. I should be able to look up the document by the TOC title: 'Earnings as an employee: payments to spouse or partner'
2. I should get the entire content of this section in the same format as it appears in the document on page 37

Input : 'Earnings as an employee: payments to spouse or partner'
Expected Output:
(1) Earnings as an employee, in relation to any person and any tax year, does not include any amount paid to the person (person A) for services he or she performs for his or her spouse or partner (person B), as person B’s employee or otherwise.

(2) However, subsection (1) does not apply if person B, in order to calculate his or her income for the purposes of the Income Tax Act 2007, has made a written application for, and obtained, the Commissioner’s consent to a deduction being made for any amounts paid by person B to person A for the services person A performs.

(3) If subsection (2) applies, account must be taken of the following in determining person A’s weekly earnings for as long as the Commissioner’s consent relates to the services and to the amounts paid:

(a) the services performed by person A after the date on which the Commissioner receives person B’s application; and

(b) any amounts paid after the date on which the Commissioner receives person B’s application.

(4) The Corporation may accept that there has been sufficient compliance with subsection (2), and levies are payable accordingly, if—

(a) person A provides services to person B; and

(b) person B submits or has submitted a return of income to the Commissioner; and

(c) person B shows the amounts paid to person A for such services in the return as an expense incurred in the production of income for the purposes of the Income Tax Act 2007; and

(d) person A includes the amounts paid to him or her by person B for such services in a return of income submitted to the Commissioner; and

(e) person A pays or has paid tax (if appropriate) on such amounts.

portal.support@minterellison.co.nz
 
Posts: 8
Joined: Mon Feb 25, 2019 8:13 pm

Fri Mar 08, 2019 10:33 am

Hi,

Please refer to the following code to get the content you want.
Code:
using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;

static void Main(string[] args)
{
    // Load the source Word document
    Document doc = new Document();
    doc.LoadFromFile(@"Accident Compensation Act 2001.docx");
    // Create a new Word document to receive the extracted content
    Document newdoc = new Document();
    Section newsec = newdoc.AddSection();
    // Find every occurrence of the section title (case-insensitive, whole word)
    TextSelection[] selections = doc.FindAllString("Earnings as an employee: payments to spouse or partner", false, true);
    foreach (TextSelection selection in selections)
    {
        TextRange tr = selection.GetAsOneRange();
        Paragraph para = tr.OwnerParagraph;
        // Skip the match inside the TOC itself: a TOC entry paragraph starts with a field
        if (para.ChildObjects.FirstItem.DocumentObjectType == DocumentObjectType.Field)
        {
            continue;
        }
        // Assumes the heading paragraph sits directly in the section body (not in a table)
        Section section = para.Owner.Owner as Section;
        // Get the section index and the paragraph index of the heading
        int indexP = section.Body.ChildObjects.IndexOf(para);
        int indexS = doc.Sections.IndexOf(section);
        // Copy the heading paragraph into the new document
        newsec.Body.ChildObjects.Add(para.Clone());
        // Walk forward through the sections and clone the paragraphs that belong to the heading
        for (int s = indexS; s < doc.Sections.Count; s++)
        {
            // In the heading's own section start right after the heading; otherwise start at the top
            int num = (s == indexS) ? indexP + 1 : 0;
            if (FinishCloning(doc, s, num, newsec))
            {
                break;
            }
        }
    }
    // Save the extracted content
    newdoc.SaveToFile("result.docx", FileFormat.Docx2013);
}

// Clones list paragraphs from section s, starting at child index num.
// Returns true once the next non-list paragraph with text (the next heading) is reached.
private static bool FinishCloning(Document doc, int s, int num, Section newsec)
{
    for (int i = num; i < doc.Sections[s].Body.ChildObjects.Count; i++)
    {
        Paragraph p = doc.Sections[s].Body.ChildObjects[i] as Paragraph;
        if (p == null)
        {
            continue;
        }
        // The section content in this document is made of list paragraphs, so clone those
        if (p.ListFormat.ListType != ListType.NoList)
        {
            newsec.Body.ChildObjects.Add(p.Clone());
        }
        // "37" is the page number paragraph of this particular section; it is specific to this document
        else if (p.Text != "" && p.Text != "37")
        {
            return true;
        }
    }
    return false;
}
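
A note on the hard-coded "37" in the stop condition above: it only works for the section on page 37. A minimal sketch of a more general check follows, assuming that page numbers appear in the body as paragraphs containing nothing but digits (an assumption about this document, not something Spire.Doc guarantees). `SectionBoundary`, `LooksLikePageNumber` and `EndsSection` are hypothetical helper names, not part of the Spire.Doc API, and the sketch works on plain strings so it can be tried without the library.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of Spire.Doc.
public static class SectionBoundary
{
    // True when the text is a bare page number such as "37" (ASCII digits only).
    public static bool LooksLikePageNumber(string text)
    {
        string t = (text ?? string.Empty).Trim();
        return t.Length > 0 && t.All(c => c >= '0' && c <= '9');
    }

    // Mirrors the stop condition in FinishCloning: list items never end the
    // section; a non-list paragraph ends it unless it is blank or a page number.
    public static bool EndsSection(string text, bool isListItem)
    {
        if (isListItem)
        {
            return false;
        }
        if (string.IsNullOrWhiteSpace(text))
        {
            return false;
        }
        return !LooksLikePageNumber(text);
    }
}
```

In FinishCloning this could replace the `p.Text != "" && p.Text != "37"` test with something like `SectionBoundary.EndsSection(p.Text, p.ListFormat.ListType != ListType.NoList)`. Unlike the original test, the sketch also treats whitespace-only paragraphs as blank.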

Sincerely,
Nina
E-iceblue support team

Nina.Tang
 
Posts: 1187
Joined: Tue Sep 27, 2016 1:06 am

Fri Mar 08, 2019 10:16 pm

Hi Nina,
Thank you for your reply.
Is it possible to convert the cloned text into raw HTML (which preserves the styles, fonts, line breaks, etc.)?
For example:
1. Text in the Word document:

(1) Earnings as an employee, in relation to any person and any tax year, does not include any amount paid to the person (person A) for services he or she performs for his or her spouse or partner (person B), as person B’s employee or otherwise.

(2) However, subsection (1) does not apply if person B, in order to calculate his or her income for the purposes of the Income Tax Act 2007, has made a written application for, and obtained, the Commissioner’s consent to a deduction being made for any amounts paid by person B to person A for the services person A performs.

Expected output:
<ul>
<li style="text-align: justify; text-justify: inter-ideograph; text-indent: -29.8pt; line-height: 109%; tab-stops: 57.0pt; margin: 0cm 27.3pt .0001pt 57.0pt;"><strong><span style="font-size: 11.5pt; line-height: 109%;">Earnings as an employee</span></strong><span style="font-size: 11.5pt; line-height: 109%;">, in relation to any person <a href="http://www.google.co.nz">and any tax year</a>, does not include any amount paid to the person (<strong>person A</strong>) for services he or she per-forms for his or her spouse or partner (<strong>person B</strong>), as person B&rsquo;s employee or otherwise.</span></li>
</ul>
<p style="line-height: 1.65pt;"><span style="font-size: 11.5pt;">&nbsp;</span></p>
<ul>
<li style="text-align: justify; text-justify: inter-ideograph; text-indent: -29.8pt; line-height: 107%; tab-stops: 57.0pt; margin: 0cm 27.3pt .0001pt 57.0pt;"><span style="font-size: 11.5pt; line-height: 107%;">However, subsection (1) does not apply if person B, in order to calculate his or her income for the purposes of the <a href="http://prd-lgnz-nlb.prd.pco.net.nz/pdflink.aspx?id=DLM1512300"><span style="color: windowtext; text-decoration: none; text-underline: none;">Income Tax Act 2007, </span></a>has made a written application for, and obtained, the Commissioner&rsquo;s consent to a deduction being made for any amounts paid by person B to person A for the services person A performs.</span></li>
</ul>

portal.support@minterellison.co.nz
 
Posts: 8
Joined: Mon Feb 25, 2019 8:13 pm

Fri Mar 08, 2019 10:34 pm

To add some background: I need to store the section content, along with its styles, in a database table and display it in my web application with the same styles as in the Word document. Thank you.

portal.support@minterellison.co.nz
 
Posts: 8
Joined: Mon Feb 25, 2019 8:13 pm

Mon Mar 11, 2019 12:07 am

Hi,
I would like to get the list of TOC entries in a Word document and loop through them to get the body/content of each one.

Could you please advise on this?

For example, in the attached document the TOC runs up to 401. What I need is the body/content of each TOC item.

Thank you.

portal.support@minterellison.co.nz
 
Posts: 8
Joined: Mon Feb 25, 2019 8:13 pm

Mon Mar 11, 2019 10:09 am

Hello,

Thanks for your feedback, and sorry for the late reply over the weekend.
The content for each TOC entry is marked by a bookmark, so we can retrieve it through the bookmarks. However, your document also contains other, unrelated bookmarks, and there is no reliable general way to tell them apart. In your document, the bookmarks that mark the content you want are named "page19" through "page258". You could refer to the following code to get the content. Spire.Doc also supports converting Docx to HTML, but after conversion the list items are wrapped in <p> tags rather than <ul><li> tags. Sorry, at present there is no good solution for that. If you have any questions, feel free to write back.
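Code:
using System.Collections.Generic;
using Spire.Doc;
using Spire.Doc.Documents;

// Load the source Word document
Document doc = new Document();
doc.LoadFromFile(@"Accident Compensation Act 2001.docx");
// Create a new Word document to receive the extracted content
Document newdoc = new Document();
Section newsec = newdoc.AddSection();
// Collect the bookmarks named "page19" to "page258", in numeric order
List<Bookmark> lists = new List<Bookmark>();
for (int i = 19; i <= 258; i++)
{
    foreach (Bookmark bm in doc.Bookmarks)
    {
        if (bm.Name == "page" + i.ToString())
        {
            lists.Add(bm);
        }
    }
}
foreach (Bookmark bookmark in lists)
{
    // Locate the paragraph that holds the bookmark start within its section
    Section section = bookmark.BookmarkStart.OwnerParagraph.Owner.Owner as Section;
    int index = section.Body.ChildObjects.IndexOf(bookmark.BookmarkStart.OwnerParagraph);
    // Clone every paragraph from the bookmark to the end of its section.
    // Note: bookmarks that share a section will therefore produce overlapping output.
    for (int i = index; i < section.Body.ChildObjects.Count; i++)
    {
        if (section.Body.ChildObjects[i] is Paragraph)
        {
            newsec.Body.ChildObjects.Add(section.Body.ChildObjects[i].Clone());
        }
    }
}
// Save to docx
newdoc.SaveToFile("result.docx", FileFormat.Docx2013);
// Save to html
newdoc.SaveToFile("result.html", FileFormat.Html);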
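
The name-based selection of bookmarks above can also be written as a single test per bookmark instead of a nested loop. The following is only a minimal sketch, assuming the content bookmarks are named exactly "page" followed by a number from 19 to 258, as in this document. `BookmarkNameFilter` and `IsContentBookmark` are hypothetical names, not part of the Spire.Doc API; the helper works on the bookmark name string only.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Spire.Doc.
public static class BookmarkNameFilter
{
    // "page" followed by ASCII digits, nothing else.
    private static readonly Regex PagePattern = new Regex(@"^page([0-9]+)\z");

    // True when name is "page<N>" with first <= N <= last.
    // The defaults 19 and 258 are the range quoted for this particular document.
    public static bool IsContentBookmark(string name, int first = 19, int last = 258)
    {
        if (name == null)
        {
            return false;
        }
        Match m = PagePattern.Match(name);
        int n;
        if (!m.Success || !int.TryParse(m.Groups[1].Value, out n))
        {
            return false;
        }
        // Compare against the canonical form so that "page019" is rejected,
        // matching the exact string comparison used in the code above.
        return n >= first && n <= last
            && name == "page" + n.ToString(CultureInfo.InvariantCulture);
    }
}
```

With this, one pass such as `foreach (Bookmark bm in doc.Bookmarks) { if (BookmarkNameFilter.IsContentBookmark(bm.Name)) lists.Add(bm); }` would collect the same bookmarks. The difference is ordering: the single pass keeps the order of the document's bookmark collection, whereas the nested loop orders them by page number.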

Sincerely,
Nina
E-iceblue support team

Nina.Tang
 
Posts: 1187
Joined: Tue Sep 27, 2016 1:06 am
