Arabic text not extracted correctly from PDF

Wed Jun 14, 2023 3:42 am

Hi,
When I extract Arabic text from a PDF, some of the Arabic text comes out incorrectly.
The original word is "تقريباً", but the extracted word is "ﺗﻘﺮﻳﺒﺎ‎ً".

API version: 9.5.6

Code:
PdfDocument pdf = new PdfDocument();
pdf.loadFromFile("out3.pdf");
PdfPageBase page = pdf.getPages().get(0);
// Create a PdfTextExtractor object
PdfTextExtractor textExtractor = new PdfTextExtractor(page);
// Create a PdfTextExtractOptions object
PdfTextExtractOptions extractOptions = new PdfTextExtractOptions();
// Extract text from the page
String text = textExtractor.extract(extractOptions);

Wed Jun 14, 2023 6:41 am

Hi,

Thanks for your feedback.
After testing, I reproduced the issue you described and logged it in our issue tracking system under ticket number SPIREPDF-6060. Our developers will investigate and fix it. Sorry for the inconvenience; I will let you know as soon as the issue is fixed.
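
A possible stopgap while the ticket is open: the extracted characters appear to be Arabic presentation forms (contextual glyph code points in the U+FE70 to U+FEFF range) rather than the base Arabic letters, with a left-to-right mark mixed in. Unicode NFKC normalization folds presentation forms back to base letters, so post-processing the extracted string with the JDK's java.text.Normalizer may recover searchable text. This is only a sketch under that assumption and has not been confirmed against Spire.PDF; the class name ArabicTextNormalizer is made up for illustration.

```java
import java.text.Normalizer;

final class ArabicTextNormalizer {

    private ArabicTextNormalizer() {
    }

    /**
     * Folds Arabic presentation forms back to base letters and removes
     * invisible bidirectional control characters.
     * Assumption: the only damage in the input is of these two kinds.
     */
    static String normalize(String extracted) {
        // NFKC applies compatibility decomposition, which maps the
        // initial/medial/final/isolated glyph forms to their base letters.
        String folded = Normalizer.normalize(extracted, Normalizer.Form.NFKC);

        // Strip bidi marks, embeddings, overrides and isolates
        // (U+200E, U+200F, U+202A..U+202E, U+2066..U+2069).
        return folded.replaceAll("[\\u200E\\u200F\\u202A-\\u202E\\u2066-\\u2069]", "");
    }
}
```

It would be applied to the result of the extraction call, for example: String fixed = ArabicTextNormalizer.normalize(text); Note that NFKC also folds other compatibility characters (ligatures, full-width forms), so it may change more than the Arabic letters.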

Best regards,
Triste
E-iceblue support team