How to Detect and Remove Blank Pages in PDF in C#

PDFs created by scanning paper documents in duplex mode may contain blank pages, and blank pages may also be inserted intentionally. In this article, you will learn how to detect and remove blank pages from a PDF file using Spire.PDF with C#.

A blank page is broadly defined as a page that contains nothing. Spire.PDF provides the IsBlank method to detect whether a PDF page is completely blank. However, some "blank" pages actually contain white images, and the IsBlank method does not treat them as blank. To detect these white but not blank pages, we use a custom method, IsImageBlank, defined in step 1.

Note: This solution converts PDF pages into images and checks whether each image is blank. You must apply a license to remove the evaluation message from the converted images. Otherwise, the evaluation message appears in the images and this method won't work properly. If you do not have a license, contact sales@e-iceblue.com for a temporary one for evaluation purposes.

Step 1: Create a custom method to judge if an image is blank.

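//returns true only if every pixel of the image is white or nearly white
public static bool IsImageBlank(Image image)
{
    //copy the image into a Bitmap so its pixels can be read, and dispose of the copy when done
    using (Bitmap bitmap = new Bitmap(image))
    {
        for (int i = 0; i < bitmap.Width; i++)
        {
            for (int j = 0; j < bitmap.Height; j++)
            {
                Color pixel = bitmap.GetPixel(i, j);

                //any channel below 240 means the pixel is darker than near-white
                if (pixel.R < 240 || pixel.G < 240 || pixel.B < 240)
                {
                    return false;
                }
            }
        }
    }
    return true;
}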
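
Calling GetPixel once per pixel can be slow on full-page renderings. The following is a minimal sketch of the same near-white test that reads whole rows of raw pixel data through Bitmap.LockBits instead. The class name BlankImageCheck, the method name IsImageBlankFast, and the threshold parameter are hypothetical names introduced for illustration; they are not part of Spire.PDF, and the sketch assumes System.Drawing is available on your platform.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

public static class BlankImageCheck
{
    // Hypothetical helper (not part of Spire.PDF): same rule as IsImageBlank,
    // i.e. the image is blank only if every color channel is >= threshold.
    public static bool IsImageBlankFast(Image image, int threshold = 240)
    {
        using (Bitmap bitmap = new Bitmap(image))
        {
            Rectangle area = new Rectangle(0, 0, bitmap.Width, bitmap.Height);

            // Lock the pixels as 24-bit RGB: 3 bytes per pixel, no alpha channel
            BitmapData data = bitmap.LockBits(area, ImageLockMode.ReadOnly, PixelFormat.Format24bppRgb);
            try
            {
                // Only the first Width * 3 bytes of a row are pixel data; the rest is padding
                byte[] row = new byte[bitmap.Width * 3];
                for (int y = 0; y < bitmap.Height; y++)
                {
                    // Copy one row of pixel bytes into managed memory
                    Marshal.Copy(IntPtr.Add(data.Scan0, y * data.Stride), row, 0, row.Length);
                    foreach (byte channel in row)
                    {
                        if (channel < threshold)
                        {
                            return false;
                        }
                    }
                }
                return true;
            }
            finally
            {
                // Always release the lock, including on the early return above
                bitmap.UnlockBits(data);
            }
        }
    }
}
```

Because the rule treats all three channels the same way, the byte order within a pixel does not matter here. If this variant suits your project, it could stand in for IsImageBlank in step 2 without any other change.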

Step 2: As stated above, this solution requires a license to be applied. After that, you can invoke the IsBlank and IsImageBlank methods to detect whether a PDF page is blank or contains only blank images, and remove the blank page using the Pages.RemoveAt(int index) method.

static void Main(string[] args)
{
    //apply license
    Spire.License.LicenseProvider.SetLicenseFileName("license.elic.xml");

    //create a PdfDocument object
    PdfDocument document = new PdfDocument();

    //load a sample file
    document.LoadFromFile("sample.pdf");

    //traverse each page in the document 
    for (int i = document.Pages.Count - 1; i >= 0; i--)
    {                           
        //detect if a page is blank
        if (document.Pages[i].IsBlank())
        {
            //remove the blank page 
            document.Pages.RemoveAt(i);
        }
        else
        {
            //save PDF page as image and dispose of it after the check
            using (Image image = document.SaveAsImage(i, PdfImageType.Bitmap))
            {
                //detect if a page contains blank image
                if (IsImageBlank(image))
                {
                    //remove the page that contains blank image
                    document.Pages.RemoveAt(i);
                }
            }
        }

    }

    //save to file
    document.SaveToFile("RemoveBlankPage.pdf", FileFormat.PDF);
}
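
The loop above walks the pages from the last index down to the first. The sketch below illustrates why that order matters, using a plain List<string> in place of a PDF page collection: removing an item only shifts the items after it, so the indexes still to be visited stay valid. The class ReverseRemovalDemo and the method RemoveWhere are hypothetical names used for illustration only.

```csharp
using System;
using System.Collections.Generic;

public static class ReverseRemovalDemo
{
    // Hypothetical helper: removes every item that matches the predicate,
    // walking from the last index to the first so removals never shift
    // an item that has not been examined yet.
    public static void RemoveWhere(List<string> pages, Predicate<string> isBlank)
    {
        for (int i = pages.Count - 1; i >= 0; i--)
        {
            if (isBlank(pages[i]))
            {
                pages.RemoveAt(i);
            }
        }
    }
}
```

For example, calling RemoveWhere on a list holding "cover", "", "", "chapter 1", "" with the predicate p => p.Length == 0 leaves "cover" and "chapter 1". A forward loop that increments i after every removal would skip the second of the two adjacent empty entries.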

Output:

[Image: How to Detect and Remove Blank Pages in PDF in C#]

Full C# Code:

using System.Drawing;
using Spire.Pdf;
using Spire.Pdf.Graphics;

namespace RemoveBlankPages
{
    class Program
    {
        static void Main(string[] args)
        {
            //apply license
            Spire.License.LicenseProvider.SetLicenseFileName("license.elic.xml");

            //create a PdfDocument object
            PdfDocument document = new PdfDocument();

            //load a sample file
            document.LoadFromFile("sample.pdf");

            //traverse the pages from last to first so that removals do not shift unvisited pages
            for (int i = document.Pages.Count - 1; i >= 0; i--)
            {
                //detect if a page is blank
                if (document.Pages[i].IsBlank())
                {
                    //remove the blank page
                    document.Pages.RemoveAt(i);
                }
                else
                {
                    //save PDF page as image and dispose of it after the check
                    using (Image image = document.SaveAsImage(i, PdfImageType.Bitmap))
                    {
                        //detect if a page contains blank image
                        if (IsImageBlank(image))
                        {
                            //remove the page that contains blank image
                            document.Pages.RemoveAt(i);
                        }
                    }
                }
            }

            //save to file
            document.SaveToFile("RemoveBlankPage.pdf", FileFormat.PDF);
        }

        //judge if an image is blank: every pixel must be white or nearly white
        public static bool IsImageBlank(Image image)
        {
            using (Bitmap bitmap = new Bitmap(image))
            {
                for (int i = 0; i < bitmap.Width; i++)
                {
                    for (int j = 0; j < bitmap.Height; j++)
                    {
                        Color pixel = bitmap.GetPixel(i, j);

                        //any channel below 240 means the pixel is darker than near-white
                        if (pixel.R < 240 || pixel.G < 240 || pixel.B < 240)
                        {
                            return false;
                        }
                    }
                }
            }
            return true;
        }
    }
}