
Thu Jul 17, 2014 9:57 am

Hi,
I'm trying to find text in a PDF document with the FindText method, but I get no results if the PDF was converted from DOCX by Spire.Doc, whereas FindText works if the same document is exported to PDF from Word 2010. FindText also works with other PDF documents.

This is my code to convert from DOCX:

Spire.Doc.Document doc = new Spire.Doc.Document();
doc.LoadFromStream(msIn, Spire.Doc.FileFormat.Auto);

var pdfParams = new ToPdfParameterList()
{
  IsEmbeddedAllFonts = true,
  IsHidden = true,
  PdfConformanceLevel = PdfConformanceLevel.Pdf_A1B
};
doc.SaveToStream(msOut, pdfParams);

This is my code to find the text in converted PDF:

System.Threading.Thread.CurrentThread.CurrentCulture = CultureInfo.InvariantCulture;
System.Threading.Thread.CurrentThread.CurrentUICulture = CultureInfo.InvariantCulture;
byte[] result = null;
Spire.Pdf.PdfDocument pdf = new Spire.Pdf.PdfDocument();
pdf.LoadFromBytes(barr);
// Note: Pages is zero-based, so Pages[1] is the second page of the document.
PdfTextFindCollection words = pdf.Pages[1].FindText("Carota");

Thanks.

pld
 
Posts: 6
Joined: Tue Mar 04, 2014 5:23 pm

Fri Jul 18, 2014 7:33 am

Hello,

Thanks for your feedback.
We have noticed the issue you mentioned and have reported it to our Dev team. Once there is any update, we will let you know.
If you have any questions, please feel free to contact us.
Sincerely,
Gary
E-iceblue support team

Gary.zhang
 
Posts: 1380
Joined: Thu Apr 04, 2013 1:30 am

Thu Aug 14, 2014 8:19 am

Hello,

Sorry to have kept you waiting so long.
The issue has been resolved and a new hotfix has been released. You can download Spire.PDF Pack (Hot Fix) Version: 3.1.24 and try it.
If you have any questions, please feel free to contact us.
Sincerely,
Gary
E-iceblue support team

Gary.zhang
 
Posts: 1380
Joined: Thu Apr 04, 2013 1:30 am

Tue Aug 19, 2014 9:43 am

Hello,

Have you tested the hotfix? Has your issue been resolved? Could you please give us some feedback when convenient?

If you have any questions, please feel free to contact us.
Thanks,
Gary
E-iceblue support team

Gary.zhang
 
Posts: 1380
Joined: Thu Apr 04, 2013 1:30 am

Thu Feb 12, 2015 3:15 pm

Hi,

Over the last four days I have tested your library "spire.pdf", as my company is looking for exactly such a library. A main reason I am testing the library is the function "FindText". Unfortunately, it does not work after converting a .docx to a .pdf document via spire.doc. But it works fine with a .pdf that has been created directly from Word. So if the bug is fixed in the next few days, I would buy the spire.office licence for my company.

Thank you in advance.

Dirk

dirk
 
Posts: 1
Joined: Wed Feb 11, 2015 9:58 am

Fri Feb 13, 2015 8:24 am

Hello,

Thanks for your inquiry.
Please attach the document and the code you tried so that we can test them. If it is inconvenient to attach them here, you could send them to us (support@e-iceblue.com) via email.

Best Regards,
Betsy

Betsy
 
Posts: 802
Joined: Mon Jan 19, 2015 6:14 am

Wed Feb 03, 2016 5:23 am

I am also having trouble with FindText in the Free version, 3.2.52.56040, as obtained from NuGet. This is NOT a problem in the latest eval Pro version, however, version 3.6.135. I really need it to work in the Free version and be successful, to convince my company to purchase the full version when this project grows to need it. There are other issues with FindText in the current Pro version depending on how the PDF was created, but with the Free version, the found text items' Position values are highly incorrect, all seemingly returning the same area of the PDF and nowhere near the actual text found. When I use the Pro eval version, the Position values are correct (provided I export the PDF from Word and do not include fonts as bitmaps). So, I believe this is a bug in the earlier 3.2 version.

Here's my code that simply draws a rectangle where the found text supposedly exists. But again, the Position is way off. This routine is called in my code about 40-50 times, with different search text, but all the rectangles end up in the upper right area atop each other and nowhere near any of the actual text locations, except the first one I look for.

for (int i = 0; i < m_SourceDoc.Pages.Count; i++)
{
  PdfPageBase sourcePage = m_SourceDoc.Pages[i];

  PdfTextFind[] results = sourcePage.FindText(a_sKeyText).Finds;
  foreach (PdfTextFind find in results)
  {
    // Diagnosis rectangles
    PdfPen penCyan = new PdfPen(Color.Cyan, .5f);
    sourcePage.Canvas.DrawRectangle(penCyan, new RectangleF(find.Position, find.Size));
  }
}


Thanks.

PilotMelch
 
Posts: 1
Joined: Tue Jan 12, 2016 11:28 pm

Wed Feb 03, 2016 8:18 am

Hi,

Thanks for your posting.
I recommend using the trial version, because we only maintain the free version of Spire.PDF when we have enough time.
We have sent a 1-month free license to your e-mail (john.melchert@omnav.com). Please check it.
Please feel free to contact us if you have any issues.

Best Regards,
Amy
E-iceblue support team

amy.zhao
 
Posts: 2766
Joined: Wed Jun 27, 2012 8:50 am

Tue Apr 09, 2019 8:45 am

This problem still exists in 4.4.0.
It is causing me a lot of trouble.

ninkun
 
Posts: 6
Joined: Fri Mar 22, 2019 12:48 am

Tue Apr 09, 2019 9:08 am

Hello Ninkun,

Thanks for your post.
Do you mean that you are encountering problems when using our latest Spire.Office Version:4.4.0? If so, to help us investigate, please provide your input file, your full test code, and detailed information about your problems. You could send them to us via email (support@e-iceblue.com). Thanks in advance.

Sincerely,
Lisa
E-iceblue support team

Lisa.Li
 
Posts: 1261
Joined: Wed Apr 25, 2018 3:20 am

Wed Apr 10, 2019 1:19 am

I tested a lot of solutions, and I finally figured out how to make it work!
If the PDF document is made by Spire.Doc, "Use PS Conversion" in ToPdfParameterList has to be "true".
Only after that property is set to true does FindText in Spire.PDF work well.

I think you should fix this!

The original code has been sent to your email (support@e-iceblue.com). You can see it there.

ninkun
 
Posts: 6
Joined: Fri Mar 22, 2019 12:48 am

Wed Apr 10, 2019 6:05 am

Hello Ninkun,

Thanks for your email.
I tested your case with Spire.Office Platinum (DLL Only) Version:4.4.0, but I couldn’t reproduce your issue when using the common conversion method like the one you provided by email. I can find the text “SQL”. Please run the .exe application attached to my email reply and then tell us your test result.

Sincerely,
Lisa
E-iceblue support team

Lisa.Li
 
Posts: 1261
Joined: Wed Apr 25, 2018 3:20 am
