PDFs created by scanning paper documents in duplex mode may contain blank pages. Blank pages may also have been inserted intentionally. In this article, you will learn how to detect and remove blank pages from a PDF file using Spire.PDF with C#.
A blank page is broadly defined as a page that contains nothing. Spire.PDF provides the IsBlank method to detect whether a PDF page is completely blank. However, some “blank pages” actually contain white images, and the IsBlank method does not treat them as blank. To detect these white but not blank pages, we use a custom method, IsImageBlank, defined in step 1. It treats a page image as blank when every pixel has red, green, and blue values of at least 240.
Note: This solution converts each PDF page into an image and checks whether that image is blank. You must apply a license to remove the evaluation message from the converted images; otherwise the message is rendered into each image and the check won't work properly. If you do not have a license, contact sales@e-iceblue.com for a temporary one for evaluation purposes.
Step 1: Create a custom method to judge if an image is blank.
public static bool IsImageBlank(Image image)
{
    //copy the image into a Bitmap so that individual pixels can be read
    using (Bitmap bitmap = new Bitmap(image))
    {
        for (int i = 0; i < bitmap.Width; i++)
        {
            for (int j = 0; j < bitmap.Height; j++)
            {
                Color pixel = bitmap.GetPixel(i, j);
                //any channel below 240 means the pixel is not white enough
                if (pixel.R < 240 || pixel.G < 240 || pixel.B < 240)
                {
                    return false;
                }
            }
        }
    }
    return true;
}
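The check above rejects an image as soon as a single pixel falls below the threshold. Scanned pages may carry a few specks of dust or scanner noise, so a variant that tolerates a small share of dark pixels can be useful. The following is a minimal sketch of that idea, not part of Spire.PDF: the class name BlankPixelCheck, the method name IsMostlyWhite, and the maxDarkFraction parameter are hypothetical, and the method assumes the pixel data has already been copied into a tightly packed buffer with three bytes per pixel.

```csharp
using System;

public static class BlankPixelCheck
{
    // Hypothetical helper (an assumption, not a Spire.PDF API).
    // 'rgb' holds tightly packed pixels, 3 bytes each; channel order does not matter here.
    // Returns true when at most 'maxDarkFraction' of the pixels have any channel below 'threshold'.
    // An empty buffer is treated as blank.
    public static bool IsMostlyWhite(byte[] rgb, byte threshold = 240, double maxDarkFraction = 0.001)
    {
        if (rgb == null) throw new ArgumentNullException(nameof(rgb));
        if (rgb.Length % 3 != 0) throw new ArgumentException("Expected 3 bytes per pixel.", nameof(rgb));

        long pixelCount = rgb.Length / 3;
        long allowedDark = (long)(pixelCount * maxDarkFraction);
        long dark = 0;

        for (int i = 0; i < rgb.Length; i += 3)
        {
            bool isDark = rgb[i] < threshold || rgb[i + 1] < threshold || rgb[i + 2] < threshold;

            // Stop early once the number of dark pixels exceeds the allowance
            if (isDark && ++dark > allowedDark)
            {
                return false;
            }
        }
        return true;
    }
}
```

With maxDarkFraction set to 0, this behaves like the strict per-pixel check in IsImageBlank. One way to obtain such a buffer from a System.Drawing.Bitmap is Bitmap.LockBits together with Marshal.Copy, which also avoids the per-pixel overhead of GetPixel; rows returned by LockBits may be padded, so that padding would need to be skipped when filling the buffer.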
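Step 2: As stated above, this solution requires a license to be applied first. After that, call the IsBlank method and the IsImageBlank method to detect whether a PDF page is blank or contains only a white image, and remove such pages using the Pages.RemoveAt(int index) method. The loop runs from the last page to the first so that removing a page does not shift the indexes of the pages still to be checked.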
static void Main(string[] args)
{
    //apply license
    Spire.License.LicenseProvider.SetLicenseFileName("license.elic.xml");

    //create a PdfDocument object
    PdfDocument document = new PdfDocument();

    //load a sample file
    document.LoadFromFile("sample.pdf");

    //traverse the pages from last to first
    for (int i = document.Pages.Count - 1; i >= 0; i--)
    {
        //detect if a page is blank
        if (document.Pages[i].IsBlank())
        {
            //remove the blank page
            document.Pages.RemoveAt(i);
        }
        else
        {
            //save PDF page as image and release it after the check
            using (Image image = document.SaveAsImage(i, PdfImageType.Bitmap))
            {
                //detect if a page contains blank image
                if (IsImageBlank(image))
                {
                    //remove the page that contains blank image
                    document.Pages.RemoveAt(i);
                }
            }
        }
    }

    //save to file
    document.SaveToFile("RemoveBlankPage.pdf", FileFormat.PDF);
}
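The loop in step 2 counts down from the last page index. The sketch below illustrates why that direction is safe, using a plain List<string> as a stand-in for the page collection. The class name ReverseRemovalDemo and the method name RemoveWhere are hypothetical and only serve the illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ReverseRemovalDemo
{
    // Removes every entry that matches 'isBlank', walking from the last index to the first.
    // Removing at index i only shifts entries after i, which have already been visited,
    // so no entry is skipped and no index goes out of range.
    public static void RemoveWhere(List<string> pages, Predicate<string> isBlank)
    {
        for (int i = pages.Count - 1; i >= 0; i--)
        {
            if (isBlank(pages[i]))
            {
                pages.RemoveAt(i);
            }
        }
    }
}
```

For example, calling ReverseRemovalDemo.RemoveWhere(pages, p => p.Length == 0) on a list holding "text", "", "", "image", "" leaves "text" and "image". A forward loop with the same body would skip the second of two consecutive empty entries, because the removal moves it into the index that was just checked.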
Full C# Code:
using Spire.Pdf;
using Spire.Pdf.Graphics;
using System.Drawing;

namespace DeleteBlankPage
{
    class Program
    {
        static void Main(string[] args)
        {
            //apply license
            Spire.License.LicenseProvider.SetLicenseFileName("license.elic.xml");

            //create a PdfDocument object
            PdfDocument document = new PdfDocument();

            //load a sample file
            document.LoadFromFile("sample.pdf");

            //traverse the pages from last to first
            for (int i = document.Pages.Count - 1; i >= 0; i--)
            {
                //detect if a page is blank
                if (document.Pages[i].IsBlank())
                {
                    //remove the blank page
                    document.Pages.RemoveAt(i);
                }
                else
                {
                    //save PDF page as image and release it after the check
                    using (Image image = document.SaveAsImage(i, PdfImageType.Bitmap))
                    {
                        //detect if a page contains blank image
                        if (IsImageBlank(image))
                        {
                            //remove the page that contains blank image
                            document.Pages.RemoveAt(i);
                        }
                    }
                }
            }

            //save to file
            document.SaveToFile("RemoveBlankPage.pdf", FileFormat.PDF);
        }

        //judge if an image is blank
        public static bool IsImageBlank(Image image)
        {
            using (Bitmap bitmap = new Bitmap(image))
            {
                for (int i = 0; i < bitmap.Width; i++)
                {
                    for (int j = 0; j < bitmap.Height; j++)
                    {
                        Color pixel = bitmap.GetPixel(i, j);
                        //any channel below 240 means the pixel is not white enough
                        if (pixel.R < 240 || pixel.G < 240 || pixel.B < 240)
                        {
                            return false;
                        }
                    }
                }
            }
            return true;
        }
    }
}