Spire.PDF is a professional PDF library applied to creating, writing, editing, handling and reading PDF files without any external dependencies. Get free and professional technical support for Spire.PDF for .NET, Java, Android, C++, Python.

Wed Jan 05, 2022 8:16 pm

Hello,

Spire.PDF is probably the right solution for our needs; however, we have run into a problem.

Our goal is to find a text pattern in a PDF document and replace it with other values.
When the text pattern sits on a single line in the PDF, everything works. But if the same pattern is split across multiple lines, for instance because a table cell is too narrow to contain the full text and the PDF renders it with a line break, the search no longer works.

Attached is a screenshot of the PDF where the search fails.
notok.png


I'm using this code:

page.FindText("ThisIsTheTextThatWeWouldLikeToMatch", TextFindParameter.WholeWord).Finds


I also tried the Regex and CrossLine modes, but neither works.

It seems that FindText only considers a single line at a time...
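
To make the symptom concrete, here is a small library-independent sketch of what we suspect is happening. It does not call Spire.PDF at all: the names `CrossLineDemo`, `FoundPerLine` and `FoundAcrossLines`, as well as the three line fragments, are made up for illustration, and it assumes the line wrap adds no hyphen or space to the word.

```csharp
using System;
using System.Linq;

// Illustration only; none of these names are Spire.PDF APIs.
public static class CrossLineDemo
{
    // Line-by-line search: a match is only possible inside one extracted line.
    public static bool FoundPerLine(string[] lines, string pattern) =>
        lines.Any(line => line.Contains(pattern));

    // Cross-line search: join the wrapped fragments first, then search.
    // Assumes the wrap inserted no extra characters (no hyphen, no space).
    public static bool FoundAcrossLines(string[] lines, string pattern) =>
        string.Concat(lines.Select(line => line.Trim())).Contains(pattern);

    public static void Main()
    {
        // Hypothetical extraction of a narrow table cell wrapped over three lines.
        string[] cellLines = { "ThisIsTheText", "ThatWeWouldLike", "ToMatch" };
        const string pattern = "ThisIsTheTextThatWeWouldLikeToMatch";

        Console.WriteLine(FoundPerLine(cellLines, pattern));     // False
        Console.WriteLine(FoundAcrossLines(cellLines, pattern)); // True
    }
}
```

If the search worked on the joined fragments, as in the second method, we would expect the pattern to be found.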

Can you please tell us whether there is a way to handle this case?

Thanks a lot.

alvesmarc
 
Posts: 8
Joined: Wed Jan 05, 2022 8:00 pm

Thu Jan 06, 2022 2:46 am

Hello,

Thanks for your inquiry!

I created a PDF with a table as you described, and indeed I could not find the word "ThisIsTheTextThatWeWouldLikeToMatch" with Spire.PDF, as you mentioned.

After further investigation, I found that even Adobe and Google Chrome cannot find a word that is broken across lines, since once it is split nobody can tell whether it is a single word or a sentence. Sorry, but we cannot support this case. I hope you understand.
find.png


Sincerely,
Marcia
E-iceblue support team

Marcia.Zhou
 
Posts: 858
Joined: Wed Nov 04, 2020 2:29 am

Thu Jan 06, 2022 8:10 am

Hi Marcia,

First of all, thanks a lot for your quick reply and the complete test you made.

In fact, I also tried the search with Adobe Reader, and it works correctly on my side. Please see my screenshot.

searchokwithadobe.png


Please consider one important point: the text ThisIsTheTextThatWeWouldLikeToMatch is inserted into the PDF as one single word. It is simply the PDF generation that splits it with a carriage return because of the size of the table cell. The word itself is a single word, not separated by any space character.

Do you still think that Spire.PDF cannot handle it?

Thanks again!

alvesmarc
 
Posts: 8
Joined: Wed Jan 05, 2022 8:00 pm

Thu Jan 06, 2022 8:28 am

Hello,

Thanks for your feedback!

To help us investigate this issue further, could you please provide the PDF document you are testing with and the Adobe version you are using? In my simulated document, I cannot find the word you pointed out. I have also attached my PDF and the Adobe version I am using. Thanks in advance.
adobe version.png

Sincerely,
Marcia
E-iceblue support team

Marcia.Zhou
 
Posts: 858
Joined: Wed Nov 04, 2020 2:29 am

Thu Jan 06, 2022 8:56 am

Sure thing.

Attached are my test PDF document and the version of my Acrobat Reader:

Nom.zip


adobeversion.png


Also, I cannot find the text in your document either. I really think that is because your text uses an explicit CRLF...

thx

alvesmarc
 
Posts: 8
Joined: Wed Jan 05, 2022 8:00 pm

Thu Jan 06, 2022 9:53 am

Hello,

Thanks for sharing more information!

I reproduced your issue with the document you provided and have logged it in our issue tracking system as ticket SPIREPDF-4884 for further investigation.

We will let you know if there is any update. Sorry for the inconvenience caused.

Sincerely,
Marcia
E-iceblue support team

Marcia.Zhou
 
Posts: 858
Joined: Wed Nov 04, 2020 2:29 am

Thu Jan 06, 2022 10:03 am

Hi Marcia,

Thanks again for your help and your impressive responsiveness!

I'm looking forward to your feedback on it :)

Will I be notified directly in this forum thread, or do I need to check somewhere else?

Thx,
Best

alvesmarc
 
Posts: 8
Joined: Wed Jan 05, 2022 8:00 pm

Thu Jan 06, 2022 10:18 am

Hello,

You are welcome!

We will inform you in this post as soon as we have solved this issue. Thanks again for your patience!

Sincerely,
Marcia
E-iceblue support team

Marcia.Zhou
 
Posts: 858
Joined: Wed Nov 04, 2020 2:29 am

Tue Feb 15, 2022 2:38 pm

Hello,

I am also having this issue with the FindText function.

Is it possible to be informed when this is resolved please?

Thanks

pbennett
 
Posts: 1
Joined: Tue Feb 15, 2022 2:36 pm

Wed Feb 16, 2022 1:13 am

Hello pbennett,

Thanks for your inquiry!

Of course, I will also inform you here once there is any progress regarding the issue SPIREPDF-4884. Sorry for the inconvenience caused.

Sincerely,
Marcia
E-iceblue support team

Marcia.Zhou
 
Posts: 858
Joined: Wed Nov 04, 2020 2:29 am

Tue Jul 05, 2022 9:20 am

Hello,

Thanks for your patience!

We are glad to inform you that we have just released Spire.Pdf Pack (Hot Fix) version 8.7.2, which fixes your issue SPIREPDF-4884.

Please download the fixed version from one of the following links to test it.
Website link:
https://www.e-iceblue.com/Download/down ... t-now.html
NuGet link:
https://www.nuget.org/packages/Spire.PDF/8.7.2

When testing, please use the new method below.
doc.FindText(searchPatternText, true, TextFindParameter.CrossLine);
Sincerely,
Andy
E-iceblue support team

Andy.Zhou
 
Posts: 483
Joined: Mon Mar 29, 2021 3:03 am
