News Category

C#/VB.NET: Split Word Documents

2022-10-27 08:57:00 Written by  support iceblue
Rate this item
(0 votes)

In MS Word, you can split a document by manually cutting the content from the original document and pasting it into a new document. Although the task is simple, it can also be quite tedious and time-consuming especially when dealing with a long document. This article will demonstrate how to programmatically split a Word document into multiple files using Spire.Doc for .NET .

Install Spire.Doc for .NET

To begin with, you need to add the DLL files included in the Spire.Doc for.NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.Doc

Split a Word Document by Page Break

A Word document can contain multiple pages separated by page breaks. To split a Word document by page break, you can refer to the below steps and code.

  • Create a Document instance.
  • Load a sample Word document using Document.LoadFromFile() method.
  • Create a new Word document and add a section to it.
  • Traverse through all body child objects of each section in the original document and determine whether the child object is a paragraph or a table.
  • If the child object of the section is a table, directly add it to the section of new document using Section.Body.ChildObjects.Add() method.
  • If the child object of the section is a paragraph, first add the paragraph object to the section of the new document. Then traverse through all child objects of the paragraph and determine whether the child object is a page break.
  • If the child object of the paragraph is a page break, get its index and then remove the page break from its paragraph by index.
  • Save the new Word document and then repeat the above processes.
  • C#
  • VB.NET
using System;
using Spire.Doc;
using Spire.Doc.Documents;

namespace SplitByPageBreak
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a Document instance
            Document original = new Document();

            //Load a sample Word document
            original.LoadFromFile(@"E:\Files\SplitByPageBreak.docx");

            //Create a new Word document and add a section to it
            Document newWord = new Document();
            Section section = newWord.AddSection();
            int index = 0;

            //Traverse through all sections of the original document
            foreach (Section sec in original.Sections)
            {
                //Traverse through all body child objects of each section
                foreach (DocumentObject obj in sec.Body.ChildObjects)
                {
                    if (obj is Paragraph)
                    {
                        Paragraph para = obj as Paragraph;
                        sec.CloneSectionPropertiesTo(section);

                        //Add paragraph object in the section of original document into section of new document
                        section.Body.ChildObjects.Add(para.Clone());

                        //Traverse through all child objects of each paragraph and determine whether the object is a page break
                        foreach (DocumentObject parobj in para.ChildObjects)
                        {
                            if (parobj is Break && (parobj as Break).BreakType == BreakType.PageBreak)
                            {
                                //Get the index of page break in paragraph
                                int i = para.ChildObjects.IndexOf(parobj);

                                //Remove the page break from its paragraph
                                section.Body.LastParagraph.ChildObjects.RemoveAt(i);

                                //Save the new Word document
                                newWord.SaveToFile(String.Format("result\out-{0}.docx", index), FileFormat.Docx);
                                index++;

                                //Create a new document and add a section
                                newWord = new Document();
                                section = newWord.AddSection();

                                //Add paragraph object in original section into section of new document
                                section.Body.ChildObjects.Add(para.Clone());
                                if (section.Paragraphs[0].ChildObjects.Count == 0)
                                {
                                    //Remove the first blank paragraph
                                    section.Body.ChildObjects.RemoveAt(0);
                                }
                                else
                                {
                                    //Remove the child objects before the page break
                                    while (i >= 0)
                                    {
                                        section.Paragraphs[0].ChildObjects.RemoveAt(i);
                                        i--;
                                    }
                                }
                            }
                        }
                    }
                    if (obj is Table)
                    {
                        //Add table object in original section into section of new document
                        section.Body.ChildObjects.Add(obj.Clone());
                    }
                }
            }

            //Save to file
            newWord.SaveToFile(String.Format("result/out-{0}.docx", index), FileFormat.Docx);

        }
    }
}

C#/VB.NET: Split Word Documents

Split a Word Document by Section Break

In Word, a section is a part of a document that contains its own page formatting. For documents that contain multiple sections, Spire.Doc for .NET also supports splitting documents by section breaks. The detailed steps are as follows.

  • Create a Document instance.
  • Load a sample Word document using Document.LoadFromFile() method.
  • Define a new Word document object.
  • Traverse through all sections of the original Word document.
  • Clone each section of the original document using Document.Sections.Clone() method.
  • Add the cloned section to the new document as a new section using Document.Sections.Add() method.
  • Save the result document using Document.SaveToFile() method.
  • C#
  • VB.NET
using System;
using Spire.Doc;

namespace SplitBySectionBreak
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a Document instance
            Document document = new Document();

            //Load a sample Word document
            document.LoadFromFile(@"E:\Files\SplitBySectionBreak.docx");

            //Define a new Word document object
            Document newWord;

            //Traverse through all sections of the original Word document
            for (int i = 0; i < document.Sections.Count; i++)
            {
                newWord = new Document();

                //Clone each section of the original document and add it to the new document as new section
                newWord.Sections.Add(document.Sections[i].Clone());

                //Save the result document 
                newWord.SaveToFile(String.Format(@"test\out_{0}.docx", i));
            }
        }
    }
}

C#/VB.NET: Split Word Documents

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Additional Info

  • tutorial_title:
Last modified on Wednesday, 31 May 2023 08:58