Spire.PDF is a professional PDF library for creating, writing, editing, handling and reading PDF files without any external dependencies in .NET (C#, VB.NET, ASP.NET, .NET Core) and Java (J2SE and J2EE) applications.

Fri Oct 29, 2021 1:03 pm

Hello,

I'm currently trying to get lines of text from parts of a table in a PDF.

A line looks something like this:

Code:
texttexttext         121964          498734973       237239       7634634


Getting the text with "ExtractText(RectangleF rectangleF)" or with "ExtractText(RectangleF rectangle, SimpleTextExtractionStrategy sim)" usually results in

Code:
texttexttext1219644987349732372397634634


The regular ExtractText() doesn't work in my current scenario.

Is there any way to extract such a line without the columns being merged into this completely useless string?
I don't care if it's just a workaround by (mis)using other methods.
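
To show the kind of workaround I have in mind: if the x positions of the column borders were known, the row rectangle could be cut into one rectangle per column and each one extracted on its own. This is only a rough sketch under that assumption, written in Java purely for illustration. The Rect class is a hypothetical stand-in for RectangleF, the coordinates are made up, and the actual extraction call is left out.

```java
import java.util.ArrayList;
import java.util.List;

final class ColumnRectangles {
    // Hypothetical stand-in for RectangleF; not a Spire.PDF type.
    static final class Rect {
        final float x, y, width, height;
        Rect(float x, float y, float width, float height) {
            this.x = x; this.y = y; this.width = width; this.height = height;
        }
    }

    // Cuts one table-row rectangle into one rectangle per column.
    // columnEdges holds the x coordinates of the column borders from left
    // to right, so n columns need n + 1 edges (assumed to be known).
    static List<Rect> split(Rect row, float[] columnEdges) {
        List<Rect> cells = new ArrayList<>();
        for (int i = 0; i + 1 < columnEdges.length; i++) {
            float left = columnEdges[i];
            float width = columnEdges[i + 1] - left;
            // Every cell keeps the vertical position and height of the row.
            cells.add(new Rect(left, row.y, width, row.height));
        }
        return cells;
    }

    public static void main(String[] args) {
        Rect row = new Rect(50f, 300f, 500f, 12f);        // made-up row area
        float[] edges = {50f, 200f, 300f, 400f, 550f};    // made-up borders
        for (Rect cell : split(row, edges)) {
            // Each cell would be passed to the rectangle-based extraction
            // call separately, giving one string per column.
            System.out.println(cell.x + " " + cell.y + " " + cell.width + " " + cell.height);
        }
    }
}
```

Something along these lines would already be enough for me, but it only works if the column borders are fixed or can be found some other way.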

By the way, the result of PdfTextFind (in case I search for texttexttext) contains a property named LineText, which would work as well, but also returns the same useless stuff.

At the moment I can think of these "formats" that would work for me:
    - not just concatenated without space, but with a delimiter
    - not concatenated at all but getting an array of strings
    - and by far the best: the same structure as on any other text search: an array of PdfTextFind-objects
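
To make the first two formats concrete, here is a minimal sketch of what I mean, written in Java purely for illustration. It assumes the extractor could hand me each text fragment together with its horizontal position. The Fragment class, the minGap threshold and the coordinates are all hypothetical and not part of Spire.PDF.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class RowSplitter {
    // Hypothetical text fragment: its text plus where it starts and how wide it is.
    static final class Fragment {
        final String text;
        final double x;
        final double width;
        Fragment(String text, double x, double width) {
            this.text = text; this.x = x; this.width = width;
        }
    }

    // Format 2: a list of strings, one per cell. A new cell starts whenever
    // the horizontal gap to the previous fragment is larger than minGap.
    static List<String> splitIntoCells(List<Fragment> fragments, double minGap) {
        List<Fragment> sorted = new ArrayList<>(fragments);
        sorted.sort(Comparator.comparingDouble((Fragment f) -> f.x));
        List<String> cells = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean first = true;
        double previousEnd = 0.0;
        for (Fragment f : sorted) {
            if (!first && f.x - previousEnd > minGap) {
                cells.add(current.toString());   // close the finished cell
                current.setLength(0);
            }
            current.append(f.text);
            previousEnd = f.x + f.width;
            first = false;
        }
        if (!first) {
            cells.add(current.toString());       // close the last cell
        }
        return cells;
    }

    // Format 1: the same cells joined with a delimiter instead of nothing.
    static String joinWithDelimiter(List<Fragment> fragments, double minGap, String delimiter) {
        return String.join(delimiter, splitIntoCells(fragments, minGap));
    }

    public static void main(String[] args) {
        List<Fragment> row = List.of(            // made-up positions
            new Fragment("texttexttext", 10, 60),
            new Fragment("121964", 120, 30),
            new Fragment("498734973", 200, 45),
            new Fragment("237239", 290, 30),
            new Fragment("7634634", 350, 35));
        System.out.println(joinWithDelimiter(row, 5.0, ";"));
    }
}
```

Either result would let me tell the columns apart again, which the concatenated string does not.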

Is there any way that comes to mind?

Thank you very much

michael.schroeder2@kit.edu
 
Posts: 1
Joined: Wed Jan 02, 2019 9:04 am

Mon Nov 01, 2021 7:12 am

Hello,

Thanks for your inquiry, and sorry for the late reply due to the weekend.
I simulated a PDF file and ran an initial test with our latest version (Spire.PDF Pack (Hot Fix) Version: 7.10.4), but I could not reproduce the behavior you mentioned. If you are not using the latest version, I suggest that you first try again with this one. If the issue still exists after that, please provide the following information for further investigation. You can attach it here or send it to us via email (support@e-iceblue.com). Thanks in advance.
1) Your sample PDF file.
2) Your test environment, such as OS info (e.g. Windows 7, 64-bit) and region setting (e.g. China, Chinese).
3) Your application type, such as Console app (.NET Framework 4.5).

Sincerely,
Annika
E-iceblue support team
Annika.Zhou
 
Posts: 1141
Joined: Wed Apr 07, 2021 2:50 am
