Spire.PDF is a professional PDF library applied to creating, writing, editing, handling and reading PDF files without any external dependencies. Get free and professional technical support for Spire.PDF for .NET, Java, Android, C++, Python.

Mon Nov 04, 2013 12:11 pm

Hi,

I want to check from C# code whether a PDF file contains a certain Arabic word, WITHOUT iterating over each page in the PDF, extracting the text from the page, and then checking whether the extracted text contains that word.

Is there any direct way to do this, like
bool result = PDF.search("ArabicWord"); or int index = PDF.search("ArabicWord");

Thanks

suhaib
 
Posts: 1
Joined: Mon Nov 04, 2013 11:17 am

Tue Nov 05, 2013 2:36 am

Hello,

Thanks for your inquiry.

Sorry, at present our product does not support searching content directly. You could use the following method instead, which is also simple.
Code:
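using System.Text;
using System.Text.RegularExpressions;
using Spire.Pdf;

// filename: path to the PDF file to search
PdfDocument pdfdoc = new PdfDocument();
pdfdoc.LoadFromFile(filename);
Regex patterns = new Regex("ArabicWord");
StringBuilder sb = new StringBuilder();
// Collect the text of every page, then test the pattern once
foreach (PdfPageBase page in pdfdoc.Pages)
{
    sb.AppendLine(page.ExtractText());
}
bool result = patterns.IsMatch(sb.ToString());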
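
The question also asks for an index as a possible result. The following is a minimal sketch of one way to get that from the extracted page texts. It is not part of Spire.PDF: the class PdfTextSearch and the method IndexOfPageContaining are hypothetical names used only for illustration. It assumes that extracted Arabic text may come back as Unicode presentation forms rather than base letters, so both sides are normalized to NFKC before comparing, and it stops at the first page that matches instead of concatenating the whole document.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class PdfTextSearch
{
    // Hypothetical helper, not a Spire.PDF API.
    // Returns the zero-based index of the first page text that contains
    // `word`, or -1 if no page text contains it.
    public static int IndexOfPageContaining(IEnumerable<string> pageTexts, string word)
    {
        // NFKC folds Arabic presentation forms (U+FB50..U+FEFF range)
        // back to their base letters so both sides compare consistently.
        string needle = word.Normalize(NormalizationForm.FormKC);

        int index = 0;
        foreach (string text in pageTexts)
        {
            string haystack = (text ?? string.Empty).Normalize(NormalizationForm.FormKC);

            // Ordinal comparison: exact code-point match, no culture rules.
            if (haystack.IndexOf(needle, StringComparison.Ordinal) >= 0)
            {
                return index; // stop at the first matching page
            }
            index++;
        }
        return -1;
    }
}
```

With Spire.PDF, the page texts could be supplied lazily, for example as pdfdoc.Pages.Cast<PdfPageBase>().Select(p => p.ExtractText()) with using System.Linq, assuming the page collection implements IEnumerable. Pages after the first match are then never extracted. A boolean answer is simply IndexOfPageContaining(...) >= 0.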


If you have any questions, please feel free to get back to us.

Sincerely,
Gary
E-iceblue support team

Gary.zhang
 
Posts: 1380
Joined: Thu Apr 04, 2013 1:30 am

Tue Nov 12, 2013 9:26 am

Hello,

Have you tried the method? Does it meet your needs? Could you please give us some feedback at your convenience?

If you have any questions, please feel free to get back to us.

Thanks,
Gary
E-iceblue support team
