Spire.Doc is a professional Word .NET library specifically designed for developers to create, read, write, convert and print Word document files. Get free and professional technical support for Spire.Doc for .NET, Java, Android, C++, Python.

Wed Apr 17, 2024 1:58 pm

1. Some PNG and WebP images do not appear when HTML is converted to DOCX.

2. Code:
public static String convertHtmlToDocToPdf(String htmlString) throws IOException {

      Path tempHTMLFile = Files.createTempFile("output", ".html");
      Path tempWordFile = Files.createTempFile("HTMLtoWord", ".docx");
      File tempPdfFile = File.createTempFile(
            "WordToPDF1_" + new SimpleDateFormat("yyyyMMddHHmmssSSS").format(new Date()) + "_", ".pdf");
      logger.info(htmlString);
      try {

         // Write the HTML to the temp file; try-with-resources closes the writer even if write() fails
         try (FileWriterWithEncoding fw = new FileWriterWithEncoding(tempHTMLFile.toFile().getPath(),
               StandardCharsets.UTF_16)) {
            fw.write(htmlString);
         }

         logger.info("tempHTMLFile file created: " + tempHTMLFile.toFile().getPath());

         // Set license key
         com.spire.license.LicenseProvider.setLicenseKey(ConvertConstants.LICENSE_KEY);
         com.spire.license.LicenseProvider.loadLicense();

         com.spire.doc.license.LicenseProvider.setLicenseKey(ConvertConstants.LICENSE_KEY);
         com.spire.doc.license.LicenseProvider.loadLicense();

         Document.setGlobalCustomFontsFolders("Fonts");

         // Create a new Document object
         Document htmlDocument = new Document();

         // Load the temporary HTML file into the document
         htmlDocument.loadFromFile(tempHTMLFile.toString(), FileFormat.Html);

         // Save the document as a temporary Word (.docx) file
         htmlDocument.saveToFile(tempWordFile.toString(), FileFormat.Docx);


         Document.setGlobalCustomFontsFolders("Fonts");

         // Create a new Document object for reading the saved Word document
         Document wordDocument = new Document();

         // Load the temporary Word file back into the new document
         wordDocument.loadFromFile(tempWordFile.toString(), FileFormat.Docx);
         logger.info("tempWordFile file created: " + tempWordFile.toFile().getPath());

         ToPdfParameterList ppl = new ToPdfParameterList();
         ppl.isEmbeddedAllFonts(true);
         ppl.setDisableLink(false);
         wordDocument.setJPEGQuality(40);
         // Save the document as a PDF to the temporary PDF file
         wordDocument.saveToFile(tempPdfFile.getAbsolutePath(), ppl);

         logger.info("tempPdfFile file created: " + tempPdfFile.getAbsolutePath());

      } catch (IOException e) {
         throw e;
      }
      return addPageNumberToPDF(tempPdfFile.getAbsolutePath());
   }



3. Manifest-Version: 1.0
Extension-Name: spire.doc
Implementation-Title: spire.doc for java
Implementation-Version: 12.2.2
Implementation-Vendor: E-iceblue Co., Ltd.
Implementation-Vendor-Id: com.spire
Implementation-URL: https://www.e-iceblue.com

4. Application Type: Spring Boot, JDK 1.8

Dasgupta
 
Posts: 54
Joined: Fri Mar 11, 2022 9:14 am

Thu Apr 18, 2024 5:31 am

Hi,

Thanks for your inquiry.
I opened the HTML file (input.html) in a browser (Microsoft Edge), but I could not find the image between the second and third paragraphs, as shown in the following screenshot.
123.png

In addition, I converted the HTML file (input.html) to a Word file with Microsoft Word, and no image appeared there either, as shown in the following screenshot.
4321.png


Sincerely
Abel
E-iceblue support team

Abel.He
 
Posts: 997
Joined: Tue Mar 08, 2022 2:02 am

Mon Apr 22, 2024 2:27 pm

Hello,

Can you check whether the image URL in the HTML file is accessible to you via a browser on your network?

Thanks and regards,
Umesh Asodekar

Dasgupta
 
Posts: 54
Joined: Fri Mar 11, 2022 9:14 am

Tue Apr 23, 2024 2:10 am

Hi,

Thanks for your feedback.
I found two images in your input HTML file. One is embedded as base64 data, as shown in the following screenshot.
11.png

The other is referenced by a URL. I tried to access this URL in a browser, but it requires a user login, so I could not access the image. I think this is why the image is missing.
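
To see which images in an HTML string are embedded as base64 and which are fetched from a URL, a small helper along the following lines may help. It is only a rough sketch: the class and method names are made up for illustration, it does not use any Spire.Doc API, and the regular expression handles simple quoted `src` attributes only (it is not a full HTML parser).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spire.Doc: lists <img> sources in an HTML string.
final class ImgSourceLister {

    // Matches the quoted src value of an <img> tag (simple cases only).
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\bsrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns every src value found, in document order.
    static List<String> imageSources(String html) {
        List<String> sources = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            sources.add(m.group(2));
        }
        return sources;
    }

    // True when the image data is embedded in the HTML itself (data: URI),
    // false when it has to be downloaded from somewhere else.
    static boolean isEmbedded(String src) {
        return src.regionMatches(true, 0, "data:", 0, 5);
    }
}
```

Every source for which `isEmbedded` returns false has to be downloaded at conversion time, so those are the ones worth checking for login restrictions.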

Sincerely
Abel
E-iceblue support team

Abel.He
 
Posts: 997
Joined: Tue Mar 08, 2022 2:02 am

Tue Apr 23, 2024 5:57 pm

Hello,

Please find the sample input and output files attached.

Thanks and regards,
Umesh Asodekar

Dasgupta
 
Posts: 54
Joined: Fri Mar 11, 2022 9:14 am

Wed Apr 24, 2024 6:36 am

Hi,

Thanks for the additional information.
I noticed that you replaced the second image URL, but a warning message appears when I try to access this URL, as shown in the following screenshot. Therefore, when this HTML is converted to a Word file, the image still cannot be obtained. I believe you can access this URL in a browser because you are authorized, but the program is not authorized when it runs.
333.jpg
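
One way to reproduce what the program sees, as opposed to a logged-in browser, is to request the image URL from plain Java without cookies or credentials. The following is a minimal sketch under that assumption; `ImageUrlProbe` and its methods are hypothetical names, and it uses only the JDK's `HttpURLConnection` (available on JDK 1.8), not Spire.Doc.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Locale;

// Hypothetical helper: checks whether a URL returns an image to an anonymous client.
final class ImageUrlProbe {

    // A response counts as a usable image only if it is 2xx and declares an image/* type.
    // A login page usually fails this test (401/403, or 200 with text/html).
    static boolean looksLikeImage(int status, String contentType) {
        return status >= 200 && status < 300
                && contentType != null
                && contentType.toLowerCase(Locale.ROOT).startsWith("image/");
    }

    // Sends an unauthenticated GET and applies the check above.
    // Returns false on any I/O problem or if the URL is not HTTP(S).
    static boolean isImageReachable(String imageUrl) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) new URL(imageUrl).openConnection();
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            return looksLikeImage(conn.getResponseCode(), conn.getContentType());
        } catch (IOException | ClassCastException e) {
            return false;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }
}
```

If `isImageReachable` returns false on the server that runs the conversion while the same URL opens in your browser, the difference is most likely the browser session's authorization.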


In addition, I ran a test: I replaced the second URL with another image URL that is accessible, and the image appears in the resulting Word and PDF files. I have attached the files for your reference.
23.png
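
If the image cannot be moved to a publicly accessible URL, one possible workaround (an assumption on my side, not something tested in this thread) is to download the image in your own code with the required authorization and embed it in the HTML as a base64 data URI before conversion, since the base64 image in your input was not the one reported missing. A minimal sketch, with hypothetical class and method names and only JDK 1.8 classes:

```java
import java.util.Base64;

// Hypothetical helper: embeds already-downloaded image bytes into the HTML.
final class ImageInliner {

    // Builds a data URI such as "data:image/png;base64,....".
    static String toDataUri(String mimeType, byte[] imageBytes) {
        return "data:" + mimeType + ";base64,"
                + Base64.getEncoder().encodeToString(imageBytes);
    }

    // Replaces src="imageUrl" with src="dataUri".
    // Assumes the attribute is written with double quotes and the URL matches exactly.
    static String inlineImage(String html, String imageUrl, String dataUri) {
        return html.replace("src=\"" + imageUrl + "\"", "src=\"" + dataUri + "\"");
    }
}
```

How the bytes are fetched (session cookie, token, internal file path) depends on your application, so that part is left out. Whether a given format such as WebP is rendered after embedding would still need to be verified with your Spire.Doc version.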


Sincerely
Abel
E-iceblue support team

Abel.He
 
Posts: 997
Joined: Tue Mar 08, 2022 2:02 am
