I am currently evaluating libraries to extract text from PDF files.
With Spire.PDF I have the following issue: when a file appears rotated by 90 degrees in a PDF viewer, the extracted text does not come out in the right order. It basically extracts the text as if the page were not rotated, i.e. it seems to ignore the rotation. The page content looks something like this (imagine all letters rotated by 90° to the right):
P S T
a o i
g m t
e e l
    e
1 T
  e
  x
  t
What I'd expect is the following extracted text:
Title
Some Text
Page 1
But what I get is:
Page Some Title
1 Text
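To make the problem more concrete, here is a minimal Python sketch of what I think is going on. It is not Spire.PDF code and uses no PDF library. The `Fragment` type, the coordinates, the page size of 600 x 800, and the rotation of 270° clockwise (i.e. 90° counterclockwise, which is my guess from how the letters are oriented) are all made up for illustration. It only shows that sorting the same words by their unrotated coordinates gives the wrong order, while mapping them into the displayed orientation first gives the order I expect.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """A word with its position in page space (origin bottom-left, y up)."""
    text: str
    x: float
    y: float


def to_display_space(frag, rotation, width, height):
    """Map a fragment from unrotated page space to the displayed orientation.

    `rotation` is a clockwise angle in multiples of 90 degrees;
    `width` and `height` are the dimensions of the unrotated page.
    """
    r = rotation % 360
    if r == 0:
        x, y = frag.x, frag.y
    elif r == 90:
        x, y = frag.y, width - frag.x
    elif r == 180:
        x, y = width - frag.x, height - frag.y
    elif r == 270:
        x, y = height - frag.y, frag.x
    else:
        raise ValueError("rotation must be a multiple of 90")
    return Fragment(frag.text, x, y)


def reading_order(fragments, tolerance=2.0):
    """Group fragments into lines (top to bottom), each read left to right."""
    ordered = sorted(fragments, key=lambda f: (-f.y, f.x))
    lines = []
    last_y = None
    for f in ordered:
        # Start a new line when the vertical position changes noticeably.
        if last_y is None or abs(f.y - last_y) > tolerance:
            lines.append([])
            last_y = f.y
        lines[-1].append(f)
    return [" ".join(f.text for f in sorted(line, key=lambda f: f.x))
            for line in lines]


# Hypothetical word positions in unrotated page space (600 wide, 800 high).
fragments = [
    Fragment("Page", 100, 700), Fragment("1", 100, 600),
    Fragment("Some", 200, 700), Fragment("Text", 200, 600),
    Fragment("Title", 300, 700),
]

# Ignoring the rotation reproduces the order I currently get.
naive = reading_order(fragments)

# Applying the assumed rotation first gives the order I expect.
rotated = reading_order([to_display_space(f, 270, 600, 800) for f in fragments])

print(naive)    # ['Page Some Title', '1 Text']
print(rotated)  # ['Title', 'Some Text', 'Page 1']
```

If Spire.PDF exposes the page rotation and the position of each extracted text fragment, something along these lines would work for me as a workaround, but I have not found out whether that is possible.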
Is there a way to solve this with Spire.PDF? Another library I'm evaluating extracts the text from this file just fine (but has serious issues with other files, which Spire.PDF in turn handles perfectly).
I'm sorry I can't provide the PDF in question: we received it from a customer of ours and it contains confidential information.