C# Convert PDF to HTML

Starting with version 3.9.285, Spire.PDF for .NET can convert PDF files to HTML in C# and VB.NET. During the conversion, images and text are saved in SVG format. This article demonstrates how to use Spire.PDF to save a PDF file as HTML.

The sample PDF file includes an image, text, and a hyperlink.

Converting a PDF to HTML takes only a few lines of code. Here are the steps:

Step 1: Create a PdfDocument object and load the sample PDF.

PdfDocument pdf = new PdfDocument();
pdf.LoadFromFile("Test.pdf");

Step 2: Call the SaveToFile method with FileFormat.HTML as the target format.

pdf.SaveToFile("Result.html", FileFormat.HTML);

[Screenshot of the resulting HTML file: C# Convert PDF to HTML]

Full code:

[C#]
using Spire.Pdf;

// Load the sample PDF.
PdfDocument pdf = new PdfDocument();
pdf.LoadFromFile("Test.pdf");

// Save the document as HTML.
pdf.SaveToFile("Result.html", FileFormat.HTML);
[VB.NET]
Imports Spire.Pdf

' Load the sample PDF.
Dim pdf As New PdfDocument()
pdf.LoadFromFile("Test.pdf")
' Save the document as HTML.
pdf.SaveToFile("Result.html", FileFormat.HTML)
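
For readers who want something they can drop into a console project, the two steps can be wrapped in a small program. This is only a minimal sketch, assuming the project already references Spire.PDF. The class name PdfToHtmlExample, the command-line handling, the default output name, and the file-existence check are illustrative additions and not part of the original sample.

```csharp
using System;
using System.IO;
using Spire.Pdf;

// Hypothetical wrapper class; the name is illustrative only.
class PdfToHtmlExample
{
    static int Main(string[] args)
    {
        // Assumed convention: optional input and output paths on the command line.
        // Falls back to the sample file name used in this article.
        string input = args.Length > 0 ? args[0] : "Test.pdf";
        string output = args.Length > 1 ? args[1] : Path.ChangeExtension(input, ".html");

        // Fail early with a readable message instead of an exception from the loader.
        if (!File.Exists(input))
        {
            Console.Error.WriteLine("Input file not found: " + input);
            return 1;
        }

        PdfDocument pdf = new PdfDocument();
        try
        {
            // Step 1: load the PDF.
            pdf.LoadFromFile(input);

            // Step 2: save it as HTML.
            pdf.SaveToFile(output, FileFormat.HTML);
        }
        finally
        {
            // Release the document's resources even if the conversion throws.
            pdf.Close();
        }

        Console.WriteLine("Saved " + output);
        return 0;
    }
}
```

Run without arguments, the sketch reads Test.pdf from the working directory and writes Test.html next to it. Pass two paths to choose the input and output explicitly.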