Spire.Doc is a professional Word .NET library specifically designed for developers to create, read, write, convert and print Word document files. Get free and professional technical support for Spire.Doc for .NET, Java, Android, C++, Python.

Thu Jan 04, 2024 8:13 pm

Hello Team,

I am currently facing two issues with Word-to-HTML conversion in Spire.Doc:

1. When a Word document contains underlined blank spaces, those spaces are not converted and are missing from the HTML output.
For example:
Word text - "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.(Blank spaces with underlines here) " - It seems that when I try to underline blank spaces in this forum box, they also disappear. For testing please add 3 or 4 blank spaces with underlines.

HTML output -
Code:
<div data-wrapper="true" style="font-size:11pt"><div style="list-style-position:inside; list-style-type:none"><span style="font-size:11pt">Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.</span></div></div>


2. When converting from Word to HTML, we have a specific scenario where a span tag with a different font style is added in the middle of a paragraph. This occurs only with this specific sequence of words:
"Equipment at Tenant’s reasonable expense. Landlord will be liable to Tenant for any damage to Tenant’s Equipment"

In Word, this entire paragraph is set in Times New Roman at size 11. However, once converted to HTML, the remainder of the paragraph after the "." is converted to the Arial Unicode MS font:

Code:
Equipment at Tenant's reasonable expense. </span><span class="None" style="font-family:'Arial Unicode MS';"> Landlord will be liable to Tenant for any damage to Tenant's Equipment"


We are unable to reproduce this issue with any other paragraphs or text. The bug appears only with this one passage.
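To narrow down where a stray font appears, one option is to list the font-family values declared in the exported HTML. The following is a minimal sketch under stated assumptions: `FontAudit` and `CountFontFamilies` are hypothetical names, not part of Spire.Doc, and the helper only scans the HTML text with a regular expression.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical diagnostic helper; not part of Spire.Doc.
public static class FontAudit
{
    // Counts how often each font-family value is declared in the HTML text.
    // Assumes plain CSS declarations such as font-family:'Arial Unicode MS';
    // values written with HTML entities (&quot;) are not handled.
    public static Dictionary<string, int> CountFontFamilies(string html)
    {
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        var pattern = new Regex(@"font-family\s*:\s*([^;""}]+)", RegexOptions.IgnoreCase);

        foreach (Match match in pattern.Matches(html))
        {
            // Strip surrounding whitespace and single quotes from the declared value.
            string family = match.Groups[1].Value.Trim().Trim('\'').Trim();
            if (family.Length == 0) continue;

            counts.TryGetValue(family, out int seen);
            counts[family] = seen + 1;
        }
        return counts;
    }
}
```

Passing the exported HTML text to `FontAudit.CountFontFamilies` returns one entry per declared font, so a font that was never used in the Word document (here, Arial Unicode MS) should stand out.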

Please note that I am using the latest hotfix version of Spire.Doc on a Windows 11 64-bit operating system, and my region setting is United States with English as the language.

Thank you for the support!
Jin Kang

kyungjinkang
 
Posts: 2
Joined: Thu Jan 04, 2024 7:39 pm

Fri Jan 05, 2024 6:32 am

Hello,

Thank you for your inquiry.
Based on your description, I simulated a Word document and ran preliminary tests using the latest version of Spire.Doc (Spire.Doc Pack Version: 11.12). However, I was unable to reproduce the two issues you mentioned. I have attached my test document and the resulting file for your reference.
Code:
// Document, CssStyleSheetType and FileFormat come from the Spire.Doc namespace
using Spire.Doc;

// Create a new instance of Document
Document document = new Document();

// Load the Word document from "input.docx"
document.LoadFromFile(@"input.docx");

// Set the CSS style sheet type to Internal for HTML export
document.HtmlExportOptions.CssStyleSheetType = CssStyleSheetType.Internal;

// Embed images in the HTML output
document.HtmlExportOptions.ImageEmbedded = true;

// Convert text input form fields to plain text in the HTML output
document.HtmlExportOptions.IsTextInputFormFieldAsText = true;

// Save the document as HTML with the file name "Sample.html"
document.SaveToFile("Sample.html", FileFormat.Html);

To investigate your problems accurately, could you please provide us with your test document and your application type, such as a Console app (.NET Framework 4.5)? This will help us analyze the situation more effectively and provide you with a suitable solution. You could attach them here or send them to us via email ([email protected]). Thanks in advance.

Sincerely,
Annika
E-iceblue support team

Annika.Zhou
 
Posts: 1657
Joined: Wed Apr 07, 2021 2:50 am

Tue Jan 09, 2024 5:52 pm

Hello,

Thank you for your swift response and support! I forgot to mention that we are using content controls in Word. Could you please try to reproduce the issue again with the provided screenshots?

As for the underline issue, we were unable to reproduce it again, so please disregard it.

Thank you,
Jin Kang

kyungjinkang
 
Posts: 2
Joined: Thu Jan 04, 2024 7:39 pm

Wed Jan 10, 2024 1:58 am

Hello,

Thank you for your feedback.
Based on the information you provided, I simulated a Word document and conducted further testing. However, I was unable to reproduce the issue you mentioned. Attached are my test files, including the sample document and the resulting output file.
Please note that the structure and content of different documents may vary, which can lead to different results after conversion. So that we can investigate your issue more effectively, could you please provide the source document you used for testing? If any sensitive information is involved, please remove it while making sure the issue can still be reproduced. You could attach it here or send it to us via email ([email protected]). Thanks in advance.
Once we receive the source document, we will continue our investigation and provide you with further assistance.

Sincerely,
Annika
E-iceblue support team

Annika.Zhou
 
Posts: 1657
Joined: Wed Apr 07, 2021 2:50 am

Thu Jan 11, 2024 5:49 am

Hello,

Thank you for sharing the information via email.
I have tested the Word file you provided and was able to reproduce the issue you mentioned, in which the text font is changed. I have logged the issue in our bug tracking system under ticket number SPIREDOC-10211. Our development team will investigate and fix it. Once it is resolved, I will let you know promptly. Sorry for the inconvenience caused.

Sincerely,
Annika
E-iceblue support team

Annika.Zhou
 
Posts: 1657
Joined: Wed Apr 07, 2021 2:50 am

Thu Feb 29, 2024 11:54 pm

Hi Team, we would also like to report the issue below, in which the font size changes from 11 to 12 when converting Word to HTML, or vice versa, using the Spire.Doc API.

Please find the original html named "original.html" in the zip file attached.

After the first conversion from HTML to Word, the font size remains 11, though the font of the header is changed to 'Arial'.
Please refer to the file "Spire Doc issue repro - first html to word conversion.docx" in the zip file attached. Moreover, the spacing is inconsistent in different places.
Spacing issue_1.png
Spacing issue_2.png


Then we modify the content of the Word file and convert it to HTML using Spire.Doc. For the converted HTML, please refer to the file named "component after first word to html conversion with modified content.html" in the zip file attached.

Then we convert the HTML to Word again using Spire.Doc and notice that the font size has changed to 12 in many places. Please refer to the file "Spire Doc issue repro - second html to word conversion.docx" in the zip file attached.



Thanks for your support!

somanwita21
 
Posts: 8
Joined: Mon Feb 27, 2023 9:04 pm

Fri Mar 01, 2024 6:30 am

Hello somanwita21,

Thank you for reaching out with your inquiry.
Based on your description, I have conducted testing on the "original.html" file you provided. I observed that in the resulting Word document, the font for h1 is "Arial." Upon inspecting the HTML code, I noticed that there was no font specified for h1. After adding "font-family:'Times New Roman'" within the u tag and retesting, the h1 font in the Word file indeed changed to "Times New Roman." Therefore, I recommend specifying the font in the HTML code to ensure consistency.
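As a rough illustration of specifying the font in the HTML before conversion, the sketch below adds an inline font-family to every opening tag of a given name that does not already declare one. `HtmlFontFixer` and `EnsureFont` are hypothetical names and not part of Spire.Doc. The thread only confirms the effect of adding the font within the u tag of the h1, so applying it to other tags is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical pre-processing helper; not part of Spire.Doc.
public static class HtmlFontFixer
{
    // Adds an inline font-family to every opening <tagName> tag that does not
    // already declare one. Only double-quoted style attributes are merged;
    // self-closing tags and other forms of style attribute are left untouched.
    public static string EnsureFont(string html, string tagName, string fontFamily)
    {
        string declaration = "font-family:'" + fontFamily + "'";
        string tagPattern = "<(" + Regex.Escape(tagName) + @")\b([^>]*)>";

        return Regex.Replace(html, tagPattern, match =>
        {
            string name = match.Groups[1].Value;
            string attrs = match.Groups[2].Value;

            // Leave tags that already specify a font, or that are self-closing.
            if (attrs.IndexOf("font-family", StringComparison.OrdinalIgnoreCase) >= 0
                || attrs.TrimEnd().EndsWith("/"))
                return match.Value;

            Match style = Regex.Match(attrs, @"(?<![\w-])style\s*=\s*""([^""]*)""",
                                      RegexOptions.IgnoreCase);
            if (style.Success)
            {
                // Merge the font into the existing double-quoted style attribute.
                string merged = "style=\"" + declaration + "; " + style.Groups[1].Value + "\"";
                return "<" + name + attrs.Substring(0, style.Index) + merged
                     + attrs.Substring(style.Index + style.Length) + ">";
            }

            // A style attribute in some other form: do not risk duplicating it.
            if (Regex.IsMatch(attrs, @"(?<![\w-])style\s*=", RegexOptions.IgnoreCase))
                return match.Value;

            return "<" + name + attrs + " style=\"" + declaration + "\">";
        }, RegexOptions.IgnoreCase);
    }
}
```

For example, `HtmlFontFixer.EnsureFont(html, "u", "Times New Roman")` could be applied to the text of the HTML file before it is loaded for conversion. A regular expression is used here only to keep the sketch short; a real HTML parser would be more robust for arbitrary markup.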
Additionally, regarding your mention of converting HTML to Word, then Word back to HTML, and finally HTML back to Word resulting in a change in font size from 11 to 12, my testing revealed that only the font size for blank cells was 12, while the rest remained at 11. This differs significantly from the styling in the file you provided, "Spire Doc issue repro - second html to word conversion.docx." To further investigate and address your issue effectively, could you kindly provide the complete testing code and input files (if available)? Your cooperation in this matter is greatly appreciated.

Sincerely,
Annika
E-iceblue support team

Annika.Zhou
 
Posts: 1657
Joined: Wed Apr 07, 2021 2:50 am
