Spire.PDF is a professional PDF library for creating, writing, editing, handling and reading PDF files without any external dependencies in .NET (C#, VB.NET, ASP.NET, .NET Core) and Java (J2SE and J2EE) applications.

Mon Oct 17, 2011 1:09 pm

Hello,
I am trying to export the plain text from an existing PDF file.
I used the code from the example on my own PDF file (which contains Czech characters):

// Load the PDF and collect the text of every page.
PdfDocument doc = new PdfDocument();
doc.LoadFromFile("PDFWithDiacriticChars.pdf");
StringBuilder buffer = new StringBuilder();

foreach (PdfPageBase page in doc.Pages)
{
    buffer.Append(page.ExtractText());
}
doc.Close();

// Write the extracted text to a file.
string fileName = "PlainTextFromPdf.txt";
File.WriteAllText(fileName, buffer.ToString());


The resulting text is missing the Czech characters; they are omitted.
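
To narrow down whether the characters are lost during extraction or only when the file is written, a small check along these lines might help. This is only a sketch: the class ExtractionCheck and the method CountNonAscii are hypothetical names, not part of Spire.PDF, and the sample string is a stand-in for buffer.ToString() from the snippet above.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class ExtractionCheck
{
    // Counts characters outside the ASCII range, such as Czech letters with diacritics.
    public static int CountNonAscii(string text)
    {
        return text.Count(c => c > 127);
    }

    public static void Main()
    {
        // Stand-in (assumption) for the string returned by buffer.ToString().
        string extracted = "Příliš žluťoučký kůň";

        // A count of zero for a Czech document would mean the characters
        // were already missing before the file was written.
        Console.WriteLine("Non-ASCII characters: " + CountNonAscii(extracted));

        // Write as UTF-8 (with a byte order mark) so the file step
        // cannot be the place where the characters are dropped.
        File.WriteAllText("PlainTextFromPdf.txt", extracted, Encoding.UTF8);
    }
}
```

If the count is already zero for the text coming out of ExtractText(), the file encoding is not the cause and the characters are being dropped during extraction.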


Thanks for your reply,
Jakub
Last edited by jmosna on Fri Oct 21, 2011 9:03 am, edited 1 time in total.

jmosna
 
Posts: 3
Joined: Wed Oct 05, 2011 1:04 pm

Tue Oct 18, 2011 4:06 am

Hello Jakub,
Sorry for any inconvenience, and thank you for your patience while waiting for our reply.

Unfortunately, Spire.PDF does not currently support exporting plain text that contains Czech characters. We will add this feature in the next phase. Thank you for bringing it to our attention.
Have a nice day.
Tina
Technical Support/Developer,
e-iceblue Support Team

Tina.Lin
 
Posts: 152
Joined: Tue Sep 13, 2011 5:37 am

Wed Dec 12, 2012 1:25 pm

Is there any change...?

avner
 
Posts: 5
Joined: Tue Dec 11, 2012 7:40 am

Thu Dec 13, 2012 9:06 am

Hello,

Thanks for your inquiry.
Spire.PDF supports extracting Czech text from PDF files. Here is some sample code. Please test it with Spire.PDF_2.6.32 (download link: http://www.e-iceblue.com/Download/download-pdf-for-net-now/spirepdf-packhot-fix2632.html?Itemid=0). If you encounter any problems, please let us know.
Code:
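private void btnExtractText_Click(object sender, EventArgs e)
{
    // filepath is assumed to hold the path of the PDF to read.
    PdfDocument doc = new PdfDocument();
    doc.LoadFromFile(filepath);

    // Collect the text of every page.
    StringBuilder buffer = new StringBuilder();
    foreach (PdfPageBase page in doc.Pages)
    {
        buffer.Append(page.ExtractText());
    }
    doc.Close();

    // Show the extracted text in the form's rich text box.
    richTextBox1.Text = buffer.ToString();
}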



Best regards,
Amy
E-iceblue support team

amy.zhao
 
Posts: 2328
Joined: Wed Jun 27, 2012 8:50 am

Mon Dec 17, 2012 10:10 am

Hello,

Could you please tell us your test results? If you encounter any problems, please feel free to contact us.

Best regards,
Amy
E-iceblue support team

amy.zhao
 
Posts: 2328
Joined: Wed Jun 27, 2012 8:50 am

Mon Dec 17, 2012 11:38 am

It works badly...

avner
 
Posts: 5
Joined: Tue Dec 11, 2012 7:40 am

Tue Dec 18, 2012 2:23 am

Hello avner,

Thanks for your feedback.
We are sorry for the inconvenience.
Could you please tell us what error you encountered, and send us your test PDF file so that we can reproduce the problem? If it is a bug in Spire.PDF, our dev team will do their best to fix it soon. Thank you!

Best regards,
Amy
E-iceblue support team
Last edited by amy.zhao on Fri Dec 21, 2012 4:03 am, edited 1 time in total.

amy.zhao
 
Posts: 2328
Joined: Wed Jun 27, 2012 8:50 am

Fri Dec 21, 2012 4:02 am

Hello avner,

Sorry to bother you again.
Could you please tell us what problems you encountered, and send us some files that reproduce them? If they are bugs in our product, we will do our best to fix them soon. Thank you!

Best regards,
Amy
E-iceblue support team

amy.zhao
 
Posts: 2328
Joined: Wed Jun 27, 2012 8:50 am
