Spire.PDF is a professional PDF library for creating, writing, editing, handling, and reading PDF files without any external dependencies. Get free and professional technical support for Spire.PDF for .NET, Java, Android, C++, and Python.

Fri Jun 18, 2021 6:48 am

I need to completely remove the text layer from a PDF (the PDF will later be OCR'd). Is it possible to do this without converting the PDF to images?

wraydc
 
Posts: 130
Joined: Wed Apr 11, 2018 5:14 am

Fri Jun 18, 2021 9:45 am

Hello,

Thanks for your inquiry.
Is the layer you mentioned similar to the one in the screenshot below (in Adobe)? If so, Spire.PDF supports removing layers by name; you can test with the following code.
screenshot.png

Code:
              using Spire.Pdf;

              // Load the source PDF, remove the layer with the given name, then save the result.
              PdfDocument doc = new PdfDocument();
              doc.LoadFromFile("Test.pdf");
              doc.Layers.RemoveLayer("Layer Name");
              doc.SaveToFile("Output.pdf");


If I have misunderstood, please provide your PDF document and the output document you expect, so that we can better understand your requirement. Thanks in advance.
Sincerely,
Andy
E-iceblue support team

Andy.Zhou
 
Posts: 483
Joined: Mon Mar 29, 2021 3:03 am

Fri Jun 18, 2021 10:27 am

Andy,

Thanks for the quick response. The PDF doesn't present the text "layer" in that way. I guess what I'm looking for is the equivalent of "Save as PDF Image File" in Adobe, so that the existing OCR'd text is removed.

Darren.

wraydc
 

Mon Jun 21, 2021 10:08 am

Hi Darren,

Thanks for your inquiry, and sorry for the late reply over the weekend.
Spire.PDF currently does not support the feature you mentioned. I'm afraid the only option is to convert the PDF to images. We apologize for the inconvenience.
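For reference, the image-based route could look roughly like the sketch below. It is untested and rests on assumptions: a Spire.PDF for .NET version in which SaveAsImage returns a System.Drawing.Image, placeholder file names, and an arbitrary 300 DPI rendering resolution. Each page is rendered to a bitmap and drawn onto a new page of the same size, so the output holds only pixels and no text until it is OCR'd again.

```csharp
using System.Drawing;
using Spire.Pdf;
using Spire.Pdf.Graphics;

// Assumption: SaveAsImage returns System.Drawing.Image in the installed version.
// "Test.pdf", "Output.pdf" and 300 DPI are placeholders, not recommended values.
PdfDocument source = new PdfDocument();
source.LoadFromFile("Test.pdf");

PdfDocument output = new PdfDocument();
for (int i = 0; i < source.Pages.Count; i++)
{
    // Render the page to a bitmap; only the pixels are carried over.
    Image rendered = source.SaveAsImage(i, PdfImageType.Bitmap, 300, 300);

    // Add a margin-free page of the same size and draw the bitmap across it.
    SizeF size = source.Pages[i].Size;
    PdfPageBase page = output.Pages.Add(size, new PdfMargins(0));
    page.Canvas.DrawImage(PdfImage.FromImage(rendered), 0, 0, size.Width, size.Height);
}

output.SaveToFile("Output.pdf");
```

A higher DPI should give the later OCR pass more detail to work with, at the cost of a larger output file. Page rotation is not handled in this sketch.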
Sincerely,
Andy
E-iceblue support team

Andy.Zhou
 
