Problem while finding text on pdf document file

Tue May 25, 2021 11:36 am

Hello,

We encountered a problem while finding text in a PDF document. When we call the FindAllText() method on a document page, the text is recognized incorrectly. As a result, we receive a strange MatchText and an incorrect Finds collection, e.g.

attach2.png

There are three fonts embedded in this file:

attach1.png

Code used for testing:

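Code:

    using System.IO;
    using Spire.Pdf;

    // Open the test file and load it into a PdfDocument.
    using (var pdfStream = new FileStream(@"test.pdf", FileMode.Open))
    {
        PdfDocument document = new PdfDocument();
        document.LoadFromStream(pdfStream);

        // Collect every text match found on the first page.
        var page = document.Pages[0];
        var finds = page.FindAllText().Finds;
    }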

We use the 7.5.0 commercial version of Spire.Pdf. The PDF file is attached.
Thanks in advance.

Best regards

Wed May 26, 2021 7:16 am

Hello,

Thank you for your inquiry.
I tested your case and reproduced the garbled text issue you mentioned. I have posted it to our dev team under ticket number SPIREPDF-4327 for investigation and fixing. I also found that the PDF file you provided is somewhat unusual: it has some text in an invisible area, as shown in the figure below. Our FindAllText() method returns all text, including the text in the invisible area. So I would like to confirm whether you want the invisible area's text included in the results or not. I am looking forward to your reply. Thanks in advance.

Sincerely,
Annika
E-iceblue support team
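
For readers weighing the question above (keep or drop matches from the invisible area), the idea can be sketched as a plain geometry filter. This is only a sketch under stated assumptions, not Spire.PDF's API: it assumes that each match's bounding rectangle and the page's visible area are already available as System.Drawing.RectangleF values. The names VisibleTextFilter and KeepVisible are hypothetical, and the coordinates are made-up sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Linq;

static class VisibleTextFilter
{
    // Hypothetical helper: keeps only the match rectangles that
    // overlap the visible page area at least partially.
    public static List<RectangleF> KeepVisible(
        IEnumerable<RectangleF> matchBounds, RectangleF visibleArea)
    {
        return matchBounds.Where(b => b.IntersectsWith(visibleArea)).ToList();
    }
}

static class Program
{
    static void Main()
    {
        // Assumed visible area: an A4-sized page measured in points.
        var visibleArea = new RectangleF(0, 0, 595, 842);

        // Sample match rectangles: the first lies inside the page,
        // the second lies entirely to the right of it.
        var matches = new[]
        {
            new RectangleF(50, 100, 120, 12),
            new RectangleF(700, 100, 120, 12),
        };

        var visible = VisibleTextFilter.KeepVisible(matches, visibleArea);
        Console.WriteLine($"Visible matches: {visible.Count}");
    }
}
```

A filter like this treats partially visible text as visible. Whether that is the right rule depends on how the file was cropped, so take it as a starting point only.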

Wed May 26, 2021 9:23 am

Hello,

Thank you for such a quick reply.
I used a tool to crop the PDF in order to hide personal data. Sorry for the misunderstanding; I would be grateful if you could remove or hide the attachment from your post.
Ultimately, this hidden text will also be included in the task of finding text in this PDF file.
Thanks in advance.

Best regards

Wed May 26, 2021 10:30 am

Hello,

Thank you for your feedback.
I've deleted the attachment from my previous post. Our dev team will investigate the garbled text issue. Based on your situation, we will also consider providing a parameter that lets users choose whether or not to get text from the invisible area, which would serve the needs of more customers. Once your issue has been fixed, I will inform you immediately.

Sincerely,
Annika
E-iceblue support team

Tue Jun 01, 2021 4:24 am

Hello,

Great to hear that:) Is there any update about the issue?
Thanks in advance.

Best regards

Tue Jun 01, 2021 6:21 am

Hello,

Thank you for your inquiry.
This issue has been resolved and is now entering the test phase. If the test goes well, we will provide a hotfix for you as soon as possible. Please give us a little more time. Thanks for your understanding.

Sincerely,
Annika
E-iceblue support team

Fri Jun 11, 2021 11:26 am

Hi,

Do you have any updates on the hotfix?

Best regards,
Adrian Miraszewski

Mon Jun 14, 2021 2:30 am

Dear Adrian,

Sorry for the late reply; it was the weekend here.
We ran into a problem when testing the hotfix for your issue, and our dev team is still working on it. I have urged them again, so please allow us some more time. I will let you know once the issue is fully resolved. Thanks for your understanding.

Sincerely,
Nina
E-iceblue support team

Thu Jul 08, 2021 11:23 am

Hi,

How long will it take to prepare the fix? We have been waiting over a month for it.

Best regards,
Adrian Miraszewski

Fri Jul 09, 2021 2:30 am

Hello,

Thank you for your follow-up.
Considering your situation, we have compiled a temporary version for you. I have confirmed that your issue is resolved in this version. You are welcome to download it. If you still have any questions, please feel free to write back.

Sincerely,
Annika
E-iceblue support team

Thu Jul 22, 2021 10:26 am

Hello,

Glad to inform you that we have just released the official version (Spire.PDF Pack (Hot Fix) Version: 7.7.10). Please download it from the following links.
Website link: https://www.e-iceblue.com/Download/download-pdf-for-net-now.html
Nuget link: https://www.nuget.org/packages/Spire.PDF.NETCore/7.7.10

Sincerely,
Annika
E-iceblue support team

Mon Aug 02, 2021 7:15 am

Hello,

Hope you are doing well!
Have you tried the new version of Spire.PDF? Is the issue resolved now? Any feedback will be greatly appreciated.

Sincerely,
Annika
E-iceblue support team
