Spire.PDF is a professional PDF library for creating, writing, editing, handling, and reading PDF files without any external dependencies. Get free and professional technical support for Spire.PDF for .NET, Java, Android, C++, and Python.

Tue Mar 05, 2024 2:01 pm

Hi,
How can I delete a blank page from a PDF document using Spire.PDF in a PowerShell script?
Using Spire.PDF functions in a PowerShell script, I would like to test a specific text area of a page of a PDF document to see whether the page is blank. If the PDF document has a blank page, I would like to remove it from the PDF.
How could I do it?
Thanks a lot for your help.

Best regards,
Jean-Maurice

jmboisaubert
 
Posts: 6
Joined: Tue Mar 05, 2024 1:44 pm

Wed Mar 06, 2024 5:31 am

Hello,

Thank you for your inquiry.
If you are looking to use our Spire.PDF to remove blank pages from PDF files in a PowerShell script, please refer to the attached .ps1 script for guidance.
Should you have any further questions or need additional assistance, please feel free to reach out.
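
A minimal sketch of such a script, assuming the IsBlank() and Pages.Remove() calls that appear later in this thread. The paths and the Remove-BlankPage helper name are hypothetical, and the Spire.PDF part only runs if the DLL and the input file exist at those paths:

```powershell
# Hypothetical paths; adjust them to your environment.
$dllPath    = 'F:\demo\Spire.Pdf.dll'
$inputPath  = 'F:\demo\input.pdf'
$outputPath = 'F:\demo\output.pdf'

# Hypothetical helper: removes every page that reports itself as blank
# and returns the number of pages removed.
function Remove-BlankPage {
    param([Parameter(Mandatory)] $Document)

    $removed = 0
    # Walk backwards so that removing a page does not shift the indexes still to visit
    for ($i = $Document.Pages.Count - 1; $i -ge 0; $i--) {
        $page = $Document.Pages[$i]
        if ($page.IsBlank()) {
            $null = $Document.Pages.Remove($page)
            $removed++
        }
    }
    return $removed
}

if ((Test-Path $dllPath) -and (Test-Path $inputPath)) {
    Add-Type -Path $dllPath

    $pdf = New-Object Spire.Pdf.PdfDocument
    $pdf.LoadFromFile($inputPath)

    $count = Remove-BlankPage -Document $pdf
    Write-Output "Removed $count blank page(s)"

    $pdf.SaveToFile($outputPath)
    $pdf.Close()
}
```

Iterating from the last page to the first avoids having to adjust the loop counter after each removal.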

Sincerely,
Annika
E-iceblue support team

Annika.Zhou
 
Posts: 1657
Joined: Wed Apr 07, 2021 2:50 am

Wed Mar 06, 2024 6:49 am

Hi Annika,

Thanks a lot for your reply.
But before removing the blank page, I need to test a specific text area in the body of the page to see whether the page is really blank. This is because, for example, the header of the page can be filled in while the body is empty. If that is the case, I have to delete the page.
Awaiting your reply
Have a nice day.

Sincerely,
Jean-Maurice

jmboisaubert
 

Wed Mar 06, 2024 9:26 am

Hello,

Thank you for your feedback.
I apologize for overlooking your request to test a specific text extraction area. Please refer to the following code snippet to implement text extraction from a specific region:
Code:
# Set the folder that contains Spire.Pdf.dll
$converterpath = 'F:\demo'

# Load System.Drawing (needed for RectangleF) and the Spire.Pdf.dll assembly
Add-Type -AssemblyName System.Drawing
Add-Type -Path (Join-Path $converterpath 'Spire.Pdf.dll')

# Create a new PdfDocument object
$pdfDocument = New-Object Spire.Pdf.PdfDocument
# Load a PDF file
$pdfDocument.LoadFromFile("F:\demo\input.pdf")

# Loop through each page in the PDF document
for ($i = 0; $i -lt $pdfDocument.Pages.Count; $i++) {
    # Get the current page
    $page = $pdfDocument.Pages[$i]
   
    # Create text extraction options
    $options = New-Object Spire.Pdf.Texts.PdfTextExtractOptions
    $options.IsExtractAllText = $true
    $options.IsShowHiddenText = $true

    # Define the extraction area on the page
    $options.ExtractArea = New-Object System.Drawing.RectangleF(0, 0, 612, 40)
   
    # Create a PdfTextExtractor object for the current page
    $pdfTextExtractor = New-Object Spire.Pdf.Texts.PdfTextExtractor($page)

    # Extract text from the page
    $text = $pdfTextExtractor.ExtractText($options)
   
    # Output the extracted text
    Write-Output $text
   
    # Uncomment the following lines if you want to remove blank pages
    # if ($page.IsBlank()) {
    #     $pdfDocument.Pages.Remove($page)
    #     $i--
    # }
}
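
To delete a page whose header is filled in but whose body is empty, the region extraction can be combined with page removal. The following is only a sketch under stated assumptions: the body rectangle (0, 40, 612, 700) is a hypothetical region below a 40-point header and must be adapted to your layout, the paths and the Test-BlankText and Get-RegionText helper names are made up for illustration, and the Spire.PDF part only runs if the DLL and the input file exist at those paths:

```powershell
# Hypothetical paths; adjust them to your environment.
$dllPath    = 'F:\demo\Spire.Pdf.dll'
$inputPath  = 'F:\demo\input.pdf'
$outputPath = 'F:\demo\output.pdf'

# Hypothetical helper: $true when the text is null, empty, or only whitespace
function Test-BlankText {
    param([string] $Text)
    return [string]::IsNullOrWhiteSpace($Text)
}

# Hypothetical helper: extracts the text found inside $Area on $Page
function Get-RegionText {
    param($Page, $Area)
    $options = New-Object Spire.Pdf.Texts.PdfTextExtractOptions
    $options.ExtractArea = $Area
    $extractor = New-Object Spire.Pdf.Texts.PdfTextExtractor($Page)
    return $extractor.ExtractText($options)
}

if ((Test-Path $dllPath) -and (Test-Path $inputPath)) {
    Add-Type -AssemblyName System.Drawing
    Add-Type -Path $dllPath

    $pdf = New-Object Spire.Pdf.PdfDocument
    $pdf.LoadFromFile($inputPath)

    # Hypothetical body region in points: x, y, width, height
    $bodyArea = New-Object System.Drawing.RectangleF(0, 40, 612, 700)

    # Walk backwards so that removing a page does not shift the indexes still to visit
    for ($i = $pdf.Pages.Count - 1; $i -ge 0; $i--) {
        $page = $pdf.Pages[$i]
        if (Test-BlankText (Get-RegionText -Page $page -Area $bodyArea)) {
            Write-Output "Page $($i + 1) has an empty body and will be removed"
            $pdf.Pages.Remove($page)
        }
    }

    $pdf.SaveToFile($outputPath)
    $pdf.Close()
}
```

Only text is tested here, so a page whose body contains nothing but images or drawn lines would also be treated as empty.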

Please feel free to reach out if you encounter any issues or require further assistance. Your feedback is valuable to us as we strive to improve our services.

Sincerely,
Annika
E-iceblue support team

Annika.Zhou
 

Thu Mar 07, 2024 8:33 am

Hi Annika,

Thanks a lot for your response!
Could you please tell me how I can display the different blocks of text in a PDF document and their respective coordinates with Spire.PDF using a PowerShell script?
Thanks for your help.

Sincerely,
Jean-Maurice

jmboisaubert
 

Fri Mar 08, 2024 2:10 am

Hello,

Thank you for your feedback.

Please note that Spire.PDF currently provides the position of text in a PDF document through its text search functionality. You can refer to the code snippet below:
Code:
# Set the folder that contains Spire.Pdf.dll
$converterpath = 'F:\demo'

# Load the Spire.Pdf.dll assembly
Add-Type -Path (Join-Path $converterpath 'Spire.Pdf.dll')

# Create a new Spire.Pdf.PdfDocument object
$pdf = New-Object Spire.Pdf.PdfDocument

# Load a PDF file from the specified path
$pdf.LoadFromFile("Input.pdf")

# Create a new Spire.Pdf.Texts.PdfTextFindOptions object
$pdfTextFindOptions = New-Object Spire.Pdf.Texts.PdfTextFindOptions

# Set parameters for text search to ignore case and search for whole words
$pdfTextFindOptions.Parameter = [Spire.Pdf.Texts.TextFindParameter]::IgnoreCase -bor [Spire.Pdf.Texts.TextFindParameter]::WholeWord

# Iterate through each page in the PDF document
for ($i = 0; $i -lt $pdf.Pages.Count; $i++)
{
    # Create a new Spire.Pdf.Texts.PdfTextFinder object for the current page
    $pdfTextFinder = New-Object Spire.Pdf.Texts.PdfTextFinder($pdf.Pages[$i])
    $pdfTextFinder.Options = $pdfTextFindOptions

    # Find the specified text on the page
    $testlist = $pdfTextFinder.Find("The text you want to find")

    # Iterate through each text fragment found
    foreach ($find in $testlist)
    {
        # Get the bounding rectangles of the text fragment
        $rectangleFs = $find.Bounds

        # Iterate through each bounding rectangle
        foreach ($rectangleF in $rectangleFs)
        {
            $xP = $rectangleF.X
            $yP = $rectangleF.Y
            $width = $rectangleF.Width
            $height = $rectangleF.Height

            # Display the coordinates; replace this with your own processing as needed
            Write-Output "Page $($i + 1): X=$xP, Y=$yP, Width=$width, Height=$height"
        }
    }
}

If you have any further questions or need assistance with anything else, please feel free to reach out.

Sincerely,
Annika
E-iceblue support team

Annika.Zhou
 

Mon Mar 11, 2024 2:34 pm

Hi Annika,

Hope you are well!
I would like to know whether the rectangle under the column header is empty or not (cf. document_facture.png).
How can I do this with a PowerShell script and Spire.PDF?

Thanks for your help.

Sincerely,
Jean-Maurice

jmboisaubert
 

Tue Mar 12, 2024 2:05 am

Hello,

Thank you for your previous correspondence.

Based on the information you provided, I am uncertain about which specific rectangular box you are referring to in terms of determining whether it is blank or not. In light of the screenshot you shared, I have prepared the following code snippet for your reference. The example code is designed to extract text from tables within a PDF file.
Code:
# Set the folder that contains Spire.Pdf.dll
$converterpath = 'F:\demo'

# Load the Spire.Pdf.dll assembly
Add-Type -Path (Join-Path $converterpath 'Spire.Pdf.dll')

# Create a new Spire.Pdf.PdfDocument object
$pdf = New-Object Spire.Pdf.PdfDocument

# Load a PDF file from the specified path
$pdf.LoadFromFile("F:\demo\Input.pdf")

# Create a PdfTableExtractor object
$extractor = New-Object Spire.Pdf.Utilities.PdfTableExtractor($pdf)

# Iterate through each page of the PDF
for ($i = 0; $i -lt $pdf.Pages.Count; $i++) {
    $pdfTables = $extractor.ExtractTable($i)
   
    # Check if tables are extracted ($null goes on the left so the array itself is tested, not filtered)
    if ($null -ne $pdfTables -and $pdfTables.Length -gt 0) {
        # Iterate through each table
        for ($tableNum = 0; $tableNum -lt $pdfTables.Length; $tableNum++) {
            # Iterate through each row of the table
            for ($rowNum = 0; $rowNum -lt $pdfTables[$tableNum].GetRowCount(); $rowNum++) {
                # Iterate through each column of the table
                for ($colNum = 0; $colNum -lt $pdfTables[$tableNum].GetColumnCount(); $colNum++) {
                    $text = $pdfTables[$tableNum].GetText($rowNum, $colNum)
                   
                    # Check if the cell contains anything other than whitespace
                    if (-not [string]::IsNullOrWhiteSpace($text)) {
                        Write-Host $text
                    } else {
                        Write-Host "The cell is a blank cell"
                    }
                }
            }
        }
    }
}

Please review the code and let me know if it aligns with your requirements. If you have any further details or specific instructions regarding the analysis of the rectangular box, please do not hesitate to provide them so that I can assist you more effectively.

Sincerely,
Annika
E-iceblue support team

Annika.Zhou
 

Tue Mar 12, 2024 4:46 am

Hi Annika,
I'm referring to the part under "CODE DESIGNATION STOCK R.E. P.U.H.T COEFF HORS TAXE1 QUANTITE HORS TAXE2 TOTAL"
inside the pdf document.
I would like to test whether this part is blank or contains some data.

Thanks for your help.
Have a nice day.
Sincerely,
Jean-Maurice

jmboisaubert
 

Tue Mar 12, 2024 6:18 am

Hello,

Thank you for your feedback.
Based on the information you provided, to extract text from a specified rectangular box, you can either extract text from tables or retrieve text from a specific region. I have already provided sample code for both methods in my previous replies. Please feel free to choose the method that best suits your needs.
If you have any further questions or require additional assistance with implementing these solutions, please don't hesitate to let me know.
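
As a rough sketch of the region-based method applied to this case, the text search can locate the column header and the area directly below it can then be extracted and tested. This assumes that the bounds returned by the text search and the ExtractArea rectangle use the same page coordinate system. The paths, the Get-AreaBelow helper, and the left, width, and height values (30, 550, 19) are hypothetical, and the Spire.PDF part only runs if the DLL and the input file exist at those paths:

```powershell
# Hypothetical paths; adjust them to your environment.
$dllPath   = 'F:\demo\Spire.Pdf.dll'
$inputPath = 'F:\demo\Input.pdf'

# Hypothetical helper: describes a rectangle that starts at the bottom edge of a header
function Get-AreaBelow {
    param([single] $HeaderY, [single] $HeaderHeight,
          [single] $Left, [single] $Width, [single] $Height)
    [pscustomobject]@{
        X      = $Left
        Y      = $HeaderY + $HeaderHeight
        Width  = $Width
        Height = $Height
    }
}

if ((Test-Path $dllPath) -and (Test-Path $inputPath)) {
    Add-Type -AssemblyName System.Drawing
    Add-Type -Path $dllPath

    $pdf = New-Object Spire.Pdf.PdfDocument
    $pdf.LoadFromFile($inputPath)
    $page = $pdf.Pages[0]

    # Locate one word of the column header row quoted in this thread
    $findOptions = New-Object Spire.Pdf.Texts.PdfTextFindOptions
    $findOptions.Parameter = [Spire.Pdf.Texts.TextFindParameter]::WholeWord
    $finder = New-Object Spire.Pdf.Texts.PdfTextFinder($page)
    $finder.Options = $findOptions
    $hits = $finder.Find('DESIGNATION')

    if ($null -ne $hits -and $hits.Count -gt 0) {
        $header = $hits[0].Bounds[0]

        # Hypothetical left edge, width, and height of the area to test
        $a = Get-AreaBelow -HeaderY $header.Y -HeaderHeight $header.Height `
                           -Left 30 -Width 550 -Height 19

        $options = New-Object Spire.Pdf.Texts.PdfTextExtractOptions
        $options.ExtractArea = New-Object System.Drawing.RectangleF($a.X, $a.Y, $a.Width, $a.Height)
        $extractor = New-Object Spire.Pdf.Texts.PdfTextExtractor($page)
        $text = $extractor.ExtractText($options)

        if ([string]::IsNullOrWhiteSpace($text)) {
            Write-Output 'The area under the header is blank'
        } else {
            Write-Output "The area under the header contains: $text"
        }
    } else {
        Write-Output 'Column header not found on this page'
    }

    $pdf.Close()
}
```

Anchoring the test area to the header position, rather than to fixed coordinates, keeps the check valid if the header moves between documents.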

Sincerely,
Annika
E-iceblue support team

Annika.Zhou
 

Mon Mar 18, 2024 9:19 am

Hi Annika,

Hope you are doing well!
I would like to draw a rectangle on a page of a PDF document. How can I do this in a PowerShell script?
I used $options.ExtractArea = New-Object System.Drawing.RectangleF(30, 380, 110, 19) to define the rectangle, but I want to visualize it in the PDF document.
Thanks for your help.

Sincerely,
Jean-Maurice BOISAUBERT

jmboisaubert
 

Tue Mar 19, 2024 4:02 am

Hello,

Thanks for your inquiry.
Please refer to the code below to implement your needs. If you have any further questions, please feel free to write to us at any time.
Code:
# Add the System.Drawing assembly
Add-Type -AssemblyName System.Drawing

# Load the Spire.Pdf assembly
Add-Type -Path "Spire.Pdf.dll"

# Load an existing PDF document
$doc = New-Object Spire.Pdf.PdfDocument
$doc.LoadFromFile("test.pdf")

# Get the first page of the document
$page = $doc.Pages[0]

# Set the outline color to black
$color = [System.Drawing.Color]::Black

# Create a PdfRGBColor object using the specified color
$pdfColor = New-Object Spire.Pdf.Graphics.PdfRGBColor($color)

# Create a PdfSolidBrush object using the PdfRGBColor
$pdfSolidBrush = New-Object Spire.Pdf.Graphics.PdfSolidBrush($pdfColor)

# Create a PdfPen object using the PdfSolidBrush and line width 0.5
$pen = New-Object Spire.Pdf.Graphics.PdfPen $pdfSolidBrush, 0.5

# Create a rectangle with position (30, 380) and size (110, 19), matching the extraction area
$rectangle = New-Object System.Drawing.RectangleF(30, 380, 110, 19)

# Draw a rectangle on the page using the pen and rectangle
$page.Canvas.DrawRectangle($pen, $rectangle)

$result = "Drawrectangle-result.pdf"
# Save the modified document to a file
$doc.SaveToFile($result)

Sincerely,
William
E-iceblue support team

William.Zhang
 
Posts: 672
Joined: Mon Dec 27, 2021 2:23 am
