Starting with Spire.PDF for .NET 3.9.285, you can convert PDF files to HTML in C# and VB.NET. During the conversion, images and text are saved in SVG format. This article demonstrates how to use Spire.PDF to save a PDF file as HTML.
The sample PDF file includes an image, text, and a hyperlink.
The conversion takes only a few lines of code. Here are the steps:
Step 1: Create a PdfDocument instance and load the sample PDF.
PdfDocument pdf = new PdfDocument();
pdf.LoadFromFile("Test.pdf");
Step 2: Call the SaveToFile method and set the file format parameter to FileFormat.HTML.
pdf.SaveToFile("Result.html", FileFormat.HTML);
Here is the screenshot:
Full code:
[C#]
using Spire.Pdf;

namespace ConvertPDFtoHtml
{
    class Program
    {
        static void Main(string[] args)
        {
            // Load the sample PDF.
            PdfDocument pdf = new PdfDocument();
            pdf.LoadFromFile("Test.pdf");

            // Save the document as HTML.
            pdf.SaveToFile("Result.html", FileFormat.HTML);
        }
    }
}
[VB.NET]
Imports Spire.Pdf

Namespace ConvertPDFtoHtml
    Class Program
        Private Shared Sub Main(args As String())
            ' Load the sample PDF.
            Dim pdf As New PdfDocument()
            pdf.LoadFromFile("Test.pdf")

            ' Save the document as HTML.
            pdf.SaveToFile("Result.html", FileFormat.HTML)
        End Sub
    End Class
End Namespace
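The two steps above can also be wrapped with a little defensive handling. The following is a minimal sketch rather than part of the original sample: it checks that the input file exists before loading it, derives the output name from the input name (so it writes Test.html instead of Result.html), and releases the document when finished. The variable names inputPath and outputPath are illustrative, and the sketch assumes that PdfDocument exposes a Close() method in your version of Spire.PDF.

```csharp
using System;
using System.IO;
using Spire.Pdf;

namespace ConvertPDFtoHtml
{
    class Program
    {
        static void Main(string[] args)
        {
            // Illustrative paths (assumption): replace with your own files.
            string inputPath = "Test.pdf";
            string outputPath = Path.ChangeExtension(inputPath, ".html");

            // Stop early if the input PDF is missing.
            if (!File.Exists(inputPath))
            {
                Console.Error.WriteLine("Input file not found: " + inputPath);
                return;
            }

            PdfDocument pdf = new PdfDocument();
            try
            {
                // Same two calls as in Step 1 and Step 2.
                pdf.LoadFromFile(inputPath);
                pdf.SaveToFile(outputPath, FileFormat.HTML);
                Console.WriteLine("Saved " + outputPath);
            }
            finally
            {
                // Assumption: Close() is available on PdfDocument
                // and releases the resources held by the document.
                pdf.Close();
            }
        }
    }
}
```

Deriving the output path with Path.ChangeExtension keeps the HTML file next to the source PDF, which can be convenient if the same code is later reused for more than one input file.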