Spire.Doc is a professional Word .NET library specifically designed for developers to create, read, write, convert and print Word document files. Get free and professional technical support for Spire.Doc for .NET, Java, Android, C++, Python.

Mon Nov 15, 2021 2:53 pm

Hello,

One of our .NET applications has evolved and has a new requirement: we need to extract certain pages from a document and merge them into another. As far as I can tell, the Spire.Doc namespace does not offer page extraction and merging the way the Spire.PDF namespace does. We tried using the Spire.PDF namespace, as it is a dependency of Spire.Doc; however, we get the evaluation warning at the top of the page when we produce a document.

We are a bit concerned that we have paid a considerable amount of money for Spire.Doc and may now need to purchase another licence for Spire.PDF.
    Can you please confirm whether it is possible to extract certain pages from a document using Spire.Doc?
    Can you confirm that the extracted pages can then be merged into another document using Spire.Doc?
    Or is this functionality only available in Spire.PDF?

Just to clarify, we need to extract a page regardless of whether it ends with a page break or the text simply overflows onto the next page.

Details:
- Product purchased: Spire.Doc Pro Edition Developer OEM Subscription
- Application type: .NET
- Using Spire.Doc version 7.8.12.4040
- Using Spire.PDF version 7.11.1.0

Thank you and kind regards,
Michael Willis

matt.young
 
Posts: 4
Joined: Thu Sep 05, 2019 1:32 pm

Tue Nov 16, 2021 2:21 am

Hello Michael,

Thanks for your inquiry!

Kindly note that, according to the Microsoft Word standard, a Word document uses a flow layout, so there is no “page” element in the Word document structure. Spire.Doc follows this standard, so it cannot currently extract content by page. We hope you can understand.
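
To illustrate the point about flow layout, here is a toy sketch (not Spire.Doc code, and far simpler than real Word layout): the document stores only a sequence of paragraphs, and the page each paragraph lands on is computed at render time from the page size. The `Paginate` helper and the height values are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class FlowLayoutDemo
{
    // Assigns each paragraph to a page by stacking heights until the page is full.
    // Toy model only: real layout also depends on fonts, margins, line breaking, etc.
    public static List<int> Paginate(IReadOnlyList<double> paragraphHeights, double pageHeight)
    {
        var pageOfParagraph = new List<int>();
        int page = 1;
        double used = 0;
        foreach (double height in paragraphHeights)
        {
            // Start a new page when the paragraph does not fit on the current one.
            if (used > 0 && used + height > pageHeight)
            {
                page++;
                used = 0;
            }
            used += height;
            pageOfParagraph.Add(page);
        }
        return pageOfParagraph;
    }

    public static void Main()
    {
        double[] heights = { 300, 300, 300, 300 };
        // The same content paginates differently when the page height changes.
        Console.WriteLine(string.Join(",", Paginate(heights, 700)));  // 1,1,2,2
        Console.WriteLine(string.Join(",", Paginate(heights, 1000))); // 1,1,1,2
    }
}
```

Because page membership is an output of layout rather than stored data, "page 2" only exists once the document has been rendered with a specific page size and set of fonts.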

We will try to support this feature in a future version, but I am afraid we cannot deliver it in the short term. Once the feature is complete, we will inform you ASAP.

As a temporary workaround, if you do not need to edit the extracted page content and only want to display it in Word, you can first convert the page you want to an image and then insert the image into the target document, as in the following code:

Code: Select all
            // Save the first page (index 0) of the source document as an image
            Document document1 = new Document(filepath1);
            Image image = document1.SaveToImages(0, ImageType.Bitmap);

            // Insert the image into the last paragraph of the target document
            Document document2 = new Document(filepath2);
            Paragraph paragraph = document2.LastParagraph;
            DocPicture picture = new DocPicture(document2);
            picture.LoadImage(image);
            picture.TextWrappingStyle = TextWrappingStyle.TopAndBottom;
            paragraph.ChildObjects.Add(picture);

            document2.SaveToFile("afterInsert.docx", FileFormat.Docx);


Sincerely,
Marcia
E-iceblue support team

Marcia.Zhou
 
Posts: 858
Joined: Wed Nov 04, 2020 2:29 am

Wed Nov 17, 2021 2:55 pm

Good afternoon Marcia,

Thank you for your response; however, we have decided to purchase Spire.PDF.

We are having another issue where we get "A generic error occurred in GDI+" when trying to save a document to XPS:

Document doc = new Document(filePath);
doc.SaveToFile(outputPath, Spire.Doc.FileFormat.XPS);

Context
We are loading a Word (docx) document and an XPS document. We then convert the XPS to docx and merge the two documents. Finally, we try to save the merged result as an XPS document, but we get the "A generic error occurred in GDI+" exception. We use Novacode for the merge. Only saving the file as XPS causes the issue.

Code example

Code: Select all
const string folderLocation = @"C:\temp\";

// Load the XPS file with Spire.PDF and the Word file with DocX (Novacode).
PdfDocument xpsFile = new PdfDocument();
xpsFile.LoadFromXPS($"{folderLocation}xpsfile.xps");
DocX wordDocument = DocX.Load($"{folderLocation}wordfile.docx");

using (MemoryStream stream = new MemoryStream())
{
      // Convert the XPS file to DOCX in memory, then insert the Word document into it.
      xpsFile.SaveToStream(stream, Spire.Pdf.FileFormat.DOCX);
      DocX converted = DocX.Load(stream);
      converted.InsertDocument(wordDocument);

      using (MemoryStream docStream = new MemoryStream())
      {
            // Save the merged document to disk through Spire.Doc.
            converted.SaveAs(docStream);
            Document xps = new Document(docStream);
            xps.SaveToFile($"{folderLocation}Merged.docx", Spire.Doc.FileFormat.Docx);
      }
}

// Reload the merged file and save it as XPS; this is the call that throws.
Document doc = new Document($"{folderLocation}Merged.docx");
doc.SaveToFile($"{folderLocation}XPSMerged.xps", Spire.Doc.FileFormat.XPS);


This is a show-stopper for us and is preventing sign-off of the project.

Thank you and kind regards,
Michael Willis

matt.young
 
Posts: 4
Joined: Thu Sep 05, 2019 1:32 pm

Wed Nov 17, 2021 5:24 pm

Good evening,

Just to add to my previous post: the error is not consistent, and we do sometimes manage to convert and save an XPS file. However, it affects production considerably and fails more often than it succeeds.

The code I provided may need to run a few times before you get the error.

We are using Windows 10.

Thank you and kind regards,
Michael Willis

matt.young
 
Posts: 4
Joined: Thu Sep 05, 2019 1:32 pm

Thu Nov 18, 2021 3:16 am

Hello Michael,

Thanks for choosing Spire.PDF.

According to your code, you need to combine the XPS file and the Word file into one XPS file, right? However, your code uses another product to combine the files, and I am not sure whether generating "Merged.docx" caused your issue, because we do not know much about the DocX product.

Actually, you can use Spire.Office for .NET in your project and achieve this with Spire.Doc and Spire.PDF from Spire.Office, as the following code shows. I have run the code more than 40 times and still have not encountered the “GDI+ exception” issue.
Code: Select all
            // Verbatim strings (@"...") take single backslashes
            string xpsFilePath = @"E:\testdoc\sample.xps";
            string wordFilePath = @"E:\testdoc\doc2.docx";
            // Kindly note: to use two or more Spire products in one program, please use Spire.Office
            // Load the XPS file with Spire.PDF
            PdfDocument xpsFile = new PdfDocument();
            xpsFile.LoadFromXPS(xpsFilePath);
            using (MemoryStream stream = new MemoryStream())
            {
                // Save the XPS file to a Word stream, then load the stream with Spire.Doc
                xpsFile.SaveToStream(stream, Spire.Pdf.FileFormat.DOCX);
                Document xps2WordFile = new Document(stream, Spire.Doc.FileFormat.Docx);
                // Combine the converted XPS content and the Word file with the InsertTextFromFile method
                xps2WordFile.InsertTextFromFile(wordFilePath, Spire.Doc.FileFormat.Docx);
                // Save the result in XPS format
                xps2WordFile.SaveToFile("afterMerge.Xps", Spire.Doc.FileFormat.XPS);
            }

Please try this code on your side. And if the issue still exists, please provide us with your input files (XPS and Word) for further investigation. Thanks in advance.

Sincerely,
Marcia
E-iceblue support team

Marcia.Zhou
 
Posts: 858
Joined: Wed Nov 04, 2020 2:29 am

Thu Nov 18, 2021 6:25 pm

Good afternoon Marcia,

Thank you for your advice, it is really appreciated. However, we are still encountering an issue with the SaveToFile method. Today we got a mixture of generic GDI+ errors and null reference exceptions.

Generic GDI+
System.Runtime.InteropServices.ExternalException (0x80004005): A generic error occurred in GDI+.
at System.Drawing.Graphics.MeasureCharacterRanges(String text, Font font, RectangleF layoutRect, StringFormat stringFormat)
at sprᬊ.ᜀ(String A_0, Font A_1, StringFormat A_2)
at sprᬊ.ᜁ(String A_0, Font A_1, StringFormat A_2)
at sprᬊ.ᜀ(TextRange A_0, TextRange A_1, Paragraph A_2, String A_3)
at sprᬊ.ᜀ(TextRange A_0, String A_1)
at sprស.ᜅ(sprឞ A_0)
at sprស.ᜁ(RectangleF A_0)
at sprហ.ᜂ(sprឞ A_0)
at sprស.ᜁ(RectangleF A_0)
at sprស.ᜋ(sprឞ A_0)
at sprស.ᜁ(RectangleF A_0)
at sprស.ᜋ(sprឞ A_0)
at sprស.ᜁ(RectangleF A_0)
at sprន.ᜀ(sprឋ A_0, sprច A_1, sprᬊ A_2)
at sprᬂ.ᜒ()
at sprᬂ.ᜓ()
at sprᬂ.ᜀ(IDocument A_0)
at sprᬱ.ᜀ(Document A_0, Stream A_1)
at Spire.Doc.Document.ᜄ(String A_0)
at Spire.Doc.Document.SaveToFile(String fileName, FileFormat fileFormat)

Null reference
System.NullReferenceException: Object reference not set to an instance of an object.
at spr?.?(String A_0, String A_1, Single A_2, FontStyle A_3)
at spr?.?(String A_0, Single A_1, FontStyle A_2, CharacterFormat A_3)
at spr?.?(TextRange A_0, spr? A_1, RectangleF& A_2, Boolean A_3, spr? A_4)
at spr?.?(TextRange A_0, IDocumentObject A_1, Paragraph A_2, String A_3, Boolean A_4)
at spr?.?(spr? A_0)
at spr?.?(RectangleF A_0)
at spr?.?(spr? A_0)
at spr?.?(RectangleF A_0)
at spr?.?(spr? A_0)
at spr?.?(RectangleF A_0)
at spr?.?(spr? A_0)
at spr?.?(RectangleF A_0)
at spr?.?(spr? A_0, spr? A_1, spr? A_2)
at spr?.?()
at spr?.?()
at spr?.?(IDocument A_0)
at spr?.?(Document A_0, Stream A_1)
at spr?.?(Document A_0, Stream A_1, ToPdfParameterList A_2)
at Spire.Doc.Document.?(String A_0)
at Spire.Doc.Document.SaveToFile(String fileName, FileFormat fileFormat)

The errors are still intermittent, so to demonstrate our issue we have created a test example that simply loops, reading a Word file and converting it to XPS each time.

Code: Select all
const string path = @"C:\PrintTest\";
const int runs = 100;

/*
* The save code is in a loop as the error is intermittent (occurs sometimes)
*/
for (int runNumber = 1; runNumber <= runs; runNumber++)
{
      using (Document doc = new Document($"{path}Merged.docx"))
      {
            //Error is thrown here
            doc.SaveToFile($"{path}Converted.xps", FileFormat.XPS);

            //The following line works without issue (saving as Docx)
            //doc.SaveToFile($"{path}Converted.docx", FileFormat.Docx);
      }
}
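
When reporting an intermittent failure like this, it can help to state how often each exception type occurs rather than stopping at the first throw. The following is a minimal sketch under stated assumptions: `IntermittentFailureTally.Run` is a hypothetical helper, and the workload in `Main` is a stand-in that fails on every third run; in the scenario above, the delegate body would instead load Merged.docx and call SaveToFile.

```csharp
using System;
using System.Collections.Generic;

public static class IntermittentFailureTally
{
    // Runs `attempt` the given number of times and counts failures by exception type.
    public static Dictionary<string, int> Run(int runs, Action<int> attempt)
    {
        var failures = new Dictionary<string, int>();
        for (int runNumber = 1; runNumber <= runs; runNumber++)
        {
            try
            {
                attempt(runNumber);
            }
            catch (Exception ex)
            {
                // Keep going after a failure so every run is counted.
                string key = ex.GetType().FullName;
                failures.TryGetValue(key, out int count);
                failures[key] = count + 1;
            }
        }
        return failures;
    }

    public static void Main()
    {
        // Stand-in workload (hypothetical): throws on runs 3, 6 and 9.
        Dictionary<string, int> failures = Run(9, run =>
        {
            if (run % 3 == 0) throw new InvalidOperationException("simulated failure");
        });

        foreach (KeyValuePair<string, int> pair in failures)
            Console.WriteLine($"{pair.Key}: {pair.Value}");
        // Prints: System.InvalidOperationException: 3
    }
}
```

A tally of this kind (for example, how many GDI+ errors versus null reference exceptions in 100 runs) gives support staff a concrete failure rate to compare against their own runs.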


In our actual application we will be reading and dealing with multiple files.

I have attached the files, evidence, and projects to this post. However, I have been unable to upload the main Word file as it is too big. Is there another way you can retrieve it?

Thank you and kind regards,
Michael Willis

matt.young
 
Posts: 4
Joined: Thu Sep 05, 2019 1:32 pm

Fri Nov 19, 2021 5:43 am

Hello Michael,

Thanks for sharing more information!

I have tested your code but still cannot reproduce the issues.

To help us investigate your issues on our end, please share the following information with us. You can send it by e-mail (support@e-iceble) or Skype (iceblue-support). Thanks in advance.
1. The full code and input files you were using; ideally a sample project that we can run to reproduce your issue directly.
2. Your system information (e.g. Win7, 64-bit) and region setting (e.g. China, Chinese).
3. The target framework you were using (e.g. .NET Framework 4.6.0).

Sincerely,
Marcia
E-iceblue support team

Marcia.Zhou
 
Posts: 858
Joined: Wed Nov 04, 2020 2:29 am

Fri Nov 19, 2021 10:02 am

Hello Michael,

I have received your email, and thanks for sharing more information!

I have reproduced the “NullReferenceException” issue and logged it in our issue tracking system as ticket SPIREDOC-6990 for further investigation.

We will let you know if there is any update. Sorry for the inconvenience caused.

Sincerely,
Marcia
E-iceblue support team

Marcia.Zhou
 
Posts: 858
Joined: Wed Nov 04, 2020 2:29 am

Wed Dec 15, 2021 3:19 am

Hi Michael,

Thanks for your patience!

Glad to inform you that we have just released Spire.Doc Pack (Hot Fix) Version 9.12.3, which fixes your issue SPIREDOC-6990.

Please download the fixed version from one of the following links to test.
Website link:
https://www.e-iceblue.com/Download/download-word-for-net-now.html
Nuget link:
https://www.nuget.org/packages/Spire.Doc/9.12.3

Sincerely,
Marcia
E-iceblue support team

Marcia.Zhou
 
Posts: 858
Joined: Wed Nov 04, 2020 2:29 am

Wed Dec 22, 2021 9:01 am

Hello Michael,

Hope you are doing well!

Has the issue been solved now? Could you please give us some feedback at your convenience?

Thanks in advance.

Sincerely,
Marcia
E-iceblue support team

Marcia.Zhou
 
Posts: 858
Joined: Wed Nov 04, 2020 2:29 am

Tue May 16, 2023 6:06 am

Hi,

Thanks for your patience.
Glad to inform you that we have just released the Spire.Doc 11.5.6 hotfix, which supports manipulating pages, such as retrieving page content and its coordinates. Please see the following code for reference.
Code: Select all
Document doc = new Document();
doc.LoadFromFile(inputFile, FileFormat.Docx);
FixedLayoutDocument layoutDoc = new FixedLayoutDocument(doc);

// Access the first line of the first page and add its text to the output.
FixedLayoutLine line = layoutDoc.Pages[0].Columns[0].Lines[0];

StringBuilder stringBuilder = new StringBuilder();
stringBuilder.AppendLine("Line: " + line.Text);

// With a rendered line, the original paragraph in the document object model can be returned.
Paragraph para = line.Paragraph;
stringBuilder.AppendLine("Paragraph text: " + para.Text);

// Retrieve all the text that appears on the first page in plain text format (including headers and footers).
string pageText = layoutDoc.Pages[0].Text;
stringBuilder.AppendLine(pageText);

// Loop through each page in the document and record the number of lines that appear on each page.
foreach (FixedLayoutPage page in layoutDoc.Pages)
{
    LayoutCollection lines = page.GetChildEntities(LayoutElementType.Line, true);
    stringBuilder.AppendLine("Page " + page.PageIndex + " has " + lines.Count + " lines.");
}

// This method provides a reverse lookup of layout entities for any given node
// (except runs and nodes in the header and footer).
stringBuilder.AppendLine("The lines of the first paragraph:");
foreach (FixedLayoutLine paragraphLine in layoutDoc.GetLayoutEntitiesOfNode(
    ((Section)doc.FirstChild).Body.Paragraphs[0]))
{
    stringBuilder.AppendLine(paragraphLine.Text.Trim());
    stringBuilder.AppendLine(paragraphLine.Rectangle.ToString());
}
File.WriteAllText("page.txt", stringBuilder.ToString());

Please refer to our official website for more details on this update: https://www.e-iceblue.com/news/spire-doc/Spire.Doc-11.5.6-supports-adding-charts.html
Download links:
Website: https://www.e-iceblue.com/Download/download-word-for-net-now.html
Nuget: https://www.nuget.org/packages/Spire.Doc/11.5.6

Best regards,
Triste
E-iceblue support team

Triste.Dai
 
Posts: 999
Joined: Tue Nov 15, 2022 3:59 am
