I'm currently trying to get lines of text from parts of a table in a PDF.
A line looks something like this:
texttexttext 121964 498734973 237239 7634634
Getting the text with "ExtractText(RectangleF rectangleF)" or with "ExtractText(RectangleF rectangle, SimpleTextExtractionStrategy sim)" usually results in
texttexttext1219644987349732372397634634
The regular ExtractText() doesn't work in my current scenario.
Is there any way to extract the text of such a region without losing the separation between the values?
I don't care if it's just a workaround that (mis)uses other methods.
By the way, the result of PdfTextFind (when I search for texttexttext) has a property named LineText, which would work as well, but it returns the same concatenated text.
At the moment I can think of these "formats" that would work for me:
- not just concatenated without spaces, but with a delimiter
- not concatenated at all but getting an array of strings
- and by far the best: the same structure as for any other text search, i.e. an array of PdfTextFind objects
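To make the first two formats concrete, here is a rough sketch of the kind of workaround that would be acceptable: cut the row rectangle into one rectangle per column and extract each one separately. It is written in Python only to show the idea and does not use the library's API. The `extract` callable is a hypothetical stand-in for a call like ExtractText(RectangleF), and `column_edges`, `extract_cells` and `extract_line` are made-up names. It also assumes the x coordinates of the column borders are known in advance.

```python
from typing import Callable, List, Tuple

# (x, y, width, height), mirroring the fields of a RectangleF
Rect = Tuple[float, float, float, float]


def extract_cells(
    extract: Callable[[Rect], str],
    row: Rect,
    column_edges: List[float],
) -> List[str]:
    """Return one string per column instead of one string per row.

    `extract` is a hypothetical stand-in for a region-based text
    extraction call; `column_edges` holds the x coordinates of the
    column borders, left to right (assumed to be known).
    """
    _, y, _, height = row
    cells = []
    # Each pair of neighbouring edges describes one column of the row.
    for left, right in zip(column_edges, column_edges[1:]):
        cell_rect = (left, y, right - left, height)
        cells.append(extract(cell_rect).strip())
    return cells


def extract_line(
    extract: Callable[[Rect], str],
    row: Rect,
    column_edges: List[float],
    delimiter: str = " ",
) -> str:
    """Join the per-column strings with a delimiter."""
    return delimiter.join(extract_cells(extract, row, column_edges))
```

`extract_cells` corresponds to the "array of strings" format and `extract_line` to the "with a delimiter" format. The obvious drawback is that the column borders have to come from somewhere, which is why getting PdfTextFind objects back would still be the better option.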
Is there any way that comes to mind?
Thank you very much.