Spire.PDF is a professional PDF library applied to creating, writing, editing, handling and reading PDF files without any external dependencies. Get free and professional technical support for Spire.PDF for .NET, Java, Android, C++, Python.

Thu Jun 06, 2024 8:14 am

Dear Support,

I am working on a project that uses the Python editions of Spire.XLS and Spire.PDF (Spire.Office) inside a Windows-based Docker container. I have run into difficulties converting Excel and PDF files to HTML and would greatly appreciate your assistance.

The specifications of my environment are as follows:
Docker Container's Operating System : Windows
Python Version : 3.9.10
Spire.Office (Python) Version : 9.1.0
NodeJS Version : 18.16.0

I attempted to convert the Excel and PDF documents to HTML, but both conversions fail. The following Dockerfile is used to build the Windows-based image:


# Use Windows Server Core as the base image for the builder; Node.js is installed manually below
FROM mcr.microsoft.com/windows/servercore:ltsc2019 AS builder

ENV NODE_VERSION 18.16.0
ENV NODE_DOWNLOAD_URL https://nodejs.org/dist/v${NODE_VERSION}/node-v${NODE_VERSION}-win-x64.zip
 
RUN mkdir C:\nodejs
 
RUN powershell -Command \
    Invoke-WebRequest -Uri %NODE_DOWNLOAD_URL% -OutFile nodejs.zip; \
    Expand-Archive -Path nodejs.zip -DestinationPath C:\nodejs; \
    Remove-Item -Force nodejs.zip

RUN setx /M PATH "%PATH%;C:\nodejs\node-v%NODE_VERSION%-win-x64"
USER ContainerAdministrator

# Set the working directory
WORKDIR /usr/src/app

# Copy package files and install dependencies
COPY package*.json ./

# Copy the rest of the application source code
COPY . .
RUN npm install

# Build the application and prune dev dependencies
RUN npm run build && npm prune --production

# Use Windows Server Core as the base image for the final container
FROM mcr.microsoft.com/windows/servercore:ltsc2019

# Set environment variable for production
ENV NODE_ENV=production

# Set the working directory
WORKDIR /usr/src/app

# Create a logs directory with appropriate permissions
RUN mkdir C:\usr\src\app\logs

# Install required tools and libraries
RUN powershell -Command \
    Add-WindowsFeature Web-Server; \
    Invoke-WebRequest -Uri https://aka.ms/vs/17/release/vs_BuildTools.exe -OutFile vs_buildtools.exe; \
    Start-Process -FilePath vs_buildtools.exe -ArgumentList '--quiet --wait --norestart --add Microsoft.VisualStudio.Workload.VCTools' -NoNewWindow -Wait; \
    Remove-Item -Force vs_buildtools.exe

# Copy dependencies and built application from the builder stage
COPY --from=builder /usr/src/app/node_modules ./node_modules
COPY --from=builder /usr/src/app/dist ./dist
COPY /pythonScript/ ./pythonScript/

# Debug step to list contents of the dist directory
RUN dir C:\usr\src\app\dist

# Install Python and dependencies
RUN powershell -Command \
    Invoke-WebRequest -Uri https://www.python.org/ftp/python/3.9.10/python-3.9.10-amd64.exe -OutFile python-installer.exe; \
    Start-Process python-installer.exe -ArgumentList '/quiet InstallAllUsers=1 PrependPath=1' -NoNewWindow -Wait; \
    Remove-Item -Force python-installer.exe

RUN powershell -Command \
    Invoke-WebRequest -Uri https://www.e-iceblue.com/downloads/lib/libSkiaSharp.dylib -OutFile libSkiaSharp.dylib;

# Install pip requirements
COPY --from=builder /usr/src/app/pythonScript/requirements.txt ./pythonScript/requirements.txt

RUN pip install --no-cache-dir -r pythonScript/requirements.txt
ENV NODE_VERSION 18.16.0
ENV NODE_DOWNLOAD_URL https://nodejs.org/dist/v${NODE_VERSION}/node-v${NODE_VERSION}-win-x64.zip
 
RUN mkdir C:\nodejs
 
RUN powershell -Command \
    Invoke-WebRequest -Uri %NODE_DOWNLOAD_URL% -OutFile nodejs.zip; \
    Expand-Archive -Path nodejs.zip -DestinationPath C:\nodejs; \
    Remove-Item -Force nodejs.zip

RUN setx /M PATH "%PATH%;C:\nodejs\node-v%NODE_VERSION%-win-x64"

# Command to run the application
CMD ["node", "dist/main.js"]



The conversion code for each type follows:

Excel to Html



from spire.xls import *
from spire.xls.common import *
import os
import sys
import shutil

outputFilePath = sys.argv[2] + '/' + os.path.splitext(os.path.basename(sys.argv[1]))[0] + '.HTML'

#create a workbook
workbook = Workbook()
#load a excel document
workbook.LoadFromFile(sys.argv[1])
sheet_count = workbook.Worksheets.Count

for i in range(sheet_count):
    # try:
        sheet = workbook.Worksheets[i] 
        if sheet.Visibility == WorksheetVisibility.Visible:
            fileName = sys.argv[2] + '/' +sheet.Name + '.html'
            row_count = sheet.Rows.Length
            col_count = sheet.Columns.Length
            count = 1
            # Collect the indices of hidden rows so they can be deleted before export
            lst_hide_row = []
            while count <= row_count:
                if not sheet.IsRowVisible(count):
                    # Remember the hidden row
                    lst_hide_row.append(count)
                count = count + 1
            if(len(lst_hide_row)>0):
                lst_hide_row = sorted(lst_hide_row,reverse=True)
                #print(lst_hide_row)
                for x in lst_hide_row:
                    sheet.DeleteRow(x,1)                   

            row_count = sheet.Rows.Length;
            # print(row_count)
            # print(lst_hide_row)
            sheet.SaveToHtml(fileName)

    # except Exception as error:
    #     print("An error occurred:", type(error).__name__, "–", error)
    #     print("Sheet "+str(i+1)+" not converted")

workbook.ConverterSetting.SheetFitToPage = True
workbook.Dispose()
#os.remove(outputFilePath)
shutil.make_archive(sys.argv[2], 'zip', sys.argv[2])
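
The hidden-row step in the script above collects row indices and then deletes them from the highest index down, so that each deletion does not shift the rows still waiting to be removed. A minimal sketch of that ordering, using a plain Python list in place of a worksheet (the helper names `hidden_rows_descending` and `delete_rows` are hypothetical and not part of Spire.XLS):

```python
def hidden_rows_descending(visible_flags):
    """Return 1-based indices of hidden rows, highest first.

    visible_flags[i] is True when row i + 1 is visible; it stands in for
    calling sheet.IsRowVisible(i + 1) on a real worksheet.
    """
    hidden = [idx for idx, visible in enumerate(visible_flags, start=1) if not visible]
    return sorted(hidden, reverse=True)


def delete_rows(rows, indices_descending):
    """Delete 1-based indices from a plain list, mimicking sheet.DeleteRow(x, 1)."""
    for idx in indices_descending:
        del rows[idx - 1]
    return rows


# Made-up sample data: rows 3 and 5 are hidden.
rows = ["header", "a", "b (hidden)", "c", "d (hidden)"]
flags = [True, True, False, True, False]

order = hidden_rows_descending(flags)
remaining = delete_rows(list(rows), order)
print(order)      # [5, 3]
print(remaining)  # ['header', 'a', 'c']
```

Deleting in ascending order instead would remove row 3 first, after which the original row 5 would sit at index 4 and the wrong row would be deleted.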



Issues faced :

2024-06-06 13:33:19 Error: Command failed: python pythonScript/convertExcelToHTML.py "conversions/input/300528754100/temp 2.xlsm" "conversions/output/300528754100"
2024-06-06 13:33:19 Traceback (most recent call last):
2024-06-06 13:33:19 File "C:\usr\src\app\pythonScript\convertExcelToHTML.py", line 40, in <module>
2024-06-06 13:33:19 sheet.SaveToHtml(fileName)
2024-06-06 13:33:19 File "C:\Program Files\Python39\lib\site-packages\plum\function.py", line 642, in __call__
2024-06-06 13:33:19 return self.f(self.instance, *args, **kw_args)
2024-06-06 13:33:19 File "C:\Program Files\Python39\lib\site-packages\plum\function.py", line 592, in __call__
2024-06-06 13:33:19 return _convert(method(*args, **kw_args), return_type)
2024-06-06 13:33:19 File "C:\Program Files\Python39\lib\site-packages\spire\xls\XlsWorksheet.py", line 1114, in SaveToHtml
2024-06-06 13:33:19 CallCFunction(GetDllLibXls().XlsWorksheet_SaveToHtmlF, self.Ptr, filename)
2024-06-06 13:33:19 File "C:\Program Files\Python39\lib\site-packages\spire\xls\common\__init__.py", line 109, in CallCFunction
2024-06-06 13:33:19 raise SpireException(info)
2024-06-06 13:33:19 spire.xls.common.SpireException: TypeInitialization_Type_NoTypeAvailable: at System.Runtime.CompilerServices.ClassConstructorRunner.EnsureClassConstructorRun(StaticClassConstructionContext*) + 0x167
2024-06-06 13:33:19 at System.Runtime.CompilerServices.ClassConstructorRunner.CheckStaticClassConstructionReturnGCStaticBase(StaticClassConstructionContext*, Object) + 0xd
2024-06-06 13:33:19 at sprq9b..ctor(sprray, sprrca) + 0x1a
2024-06-06 13:33:19 at sprq9a.sprd(sprr8y) + 0xee
2024-06-06 13:33:19 at sprq9a.spra(Stream, ImageFormat, sprr8y) + 0x96
2024-06-06 13:33:19 at sprrrp.sprb(sprrt2) + 0x1dc
2024-06-06 13:33:19 at sprrrp.sprb(Stream, sprrt2, String, HTMLOptions) + 0xeb
2024-06-06 13:33:19 at Spire.Xls.Core.Spreadsheet.XlsWorksheet.SaveToHtml(String, HTMLOptions) + 0x2fb
2024-06-06 13:33:19 at Spire.Xls.AOT.NLXlsWorksheet.XlsWorksheet_SaveToHtmlF(IntPtr, IntPtr, IntPtr) + 0x73
2024-06-06 13:33:19
2024-06-06 13:33:19 at ChildProcess.exithandler (node:child_process:419:12)
2024-06-06 13:33:19 at ChildProcess.emit (node:events:513:28)
2024-06-06 13:33:19 at maybeClose (node:internal/child_process:1091:16)
2024-06-06 13:33:19 at ChildProcess._handle.onexit (node:internal/child_process:302:5)
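
The Excel script above contains a commented-out try/except around the per-sheet work. A minimal sketch of that pattern is shown here, assuming the goal is to record which sheets failed instead of stopping at the first error. The `convert` callable and `fake_convert` are hypothetical stand-ins for the real `sheet.SaveToHtml(...)` call, and a generic `Exception` is caught in place of `SpireException`:

```python
def convert_all(names, convert):
    """Call convert(name) for each sheet name and collect failures.

    Returns a list of (position, name, message) tuples for sheets whose
    conversion raised an exception; position is 1-based.
    """
    failures = []
    for position, name in enumerate(names, start=1):
        try:
            convert(name)
        except Exception as error:
            failures.append((position, name, f"{type(error).__name__}: {error}"))
    return failures


def fake_convert(name):
    """Stand-in converter that fails for one sheet, for illustration only."""
    if name == "Sheet2":
        raise RuntimeError("TypeInitialization_Type_NoTypeAvailable")


failures = convert_all(["Sheet1", "Sheet2", "Sheet3"], fake_convert)
for position, name, message in failures:
    print(f"Sheet {position} ({name}) not converted: {message}")
```

This only changes how errors are reported. It does not address the underlying exception shown in the log above.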

PDF to Html


from spire.pdf.common import *
from spire.pdf import *
import os
import sys
import shutil

outputFilePath = sys.argv[2] + '/' + os.path.splitext(os.path.basename(sys.argv[1]))[0] + '.html'

# Create an object of the PdfDocument class
document = PdfDocument()

# Load a PDF document and set the PDF-to-HTML conversion options
document.LoadFromFile(sys.argv[1])
document.ConvertOptions.SetPdfToHtmlOptions(False, True, 1, False)

# Save to HTML
document.SaveToFile(outputFilePath, FileFormat.HTML)
document.Close()

shutil.make_archive(sys.argv[2], 'zip', sys.argv[2])
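
Both scripts derive the output file name from the input path and then zip the output directory with `shutil.make_archive`. The sketch below isolates those two steps using only the standard library, so they can be checked without Spire installed. The helper names `html_output_path` and `zip_output_dir` are hypothetical, and the HTML file is a placeholder written by hand rather than converter output:

```python
import os
import shutil
import tempfile


def html_output_path(input_file, output_dir):
    """Build '<output_dir>/<input file stem>.html'."""
    stem = os.path.splitext(os.path.basename(input_file))[0]
    return os.path.join(output_dir, stem + ".html")


def zip_output_dir(output_dir):
    """Zip the contents of output_dir into '<output_dir>.zip'; return the archive path."""
    return shutil.make_archive(output_dir, "zip", output_dir)


# Demo in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = os.path.join(tmp, "451547683400")
    os.makedirs(out_dir)

    target = html_output_path("conversions/input/451547683400/temp 1.pdf", out_dir)
    with open(target, "w", encoding="utf-8") as fh:
        fh.write("<html></html>")  # placeholder for the converter's output

    archive = zip_output_dir(out_dir)
    target_name = os.path.basename(target)
    archive_name = os.path.basename(archive)

print(target_name)   # temp 1.html
print(archive_name)  # 451547683400.zip
```

Because the archive base name equals the directory being zipped, the `.zip` file is created next to the output directory rather than inside it.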



Issues faced :

2024-06-06 13:35:48 Error: Command failed: python pythonScript/convertPDFToHTML.py "conversions/input/451547683400/temp 1.pdf" "conversions/output/451547683400"
2024-06-06 13:35:48 Traceback (most recent call last):
2024-06-06 13:35:48 File "C:\usr\src\app\pythonScript\convertPDFToHTML.py", line 17, in <module>
2024-06-06 13:35:48 document.SaveToFile(outputFilePath, FileFormat.HTML)
2024-06-06 13:35:48 File "C:\Program Files\Python39\lib\site-packages\plum\function.py", line 642, in __call__
2024-06-06 13:35:48 return self.f(self.instance, *args, **kw_args)
2024-06-06 13:35:48 File "C:\Program Files\Python39\lib\site-packages\plum\function.py", line 592, in __call__
2024-06-06 13:35:48 return _convert(method(*args, **kw_args), return_type)
2024-06-06 13:35:48 File "C:\Program Files\Python39\lib\site-packages\spire\pdf\PdfDocument.py", line 287, in SaveToFile
2024-06-06 13:35:48 CallCFunction(GetDllLibPdf().PdfDocument_SaveToFileFF,self.Ptr, filename,enumfileFormat)
2024-06-06 13:35:48 File "C:\Program Files\Python39\lib\site-packages\spire\pdf\common\__init__.py", line 109, in CallCFunction
2024-06-06 13:35:48 raise SpireException(info)
2024-06-06 13:35:48 spire.pdf.common.SpireException: TypeInitialization_Type_NoTypeAvailable: at System.Runtime.CompilerServices.ClassConstructorRunner.EnsureClassConstructorRun(StaticClassConstructionContext*) + 0x167
2024-06-06 13:35:48 at System.Runtime.CompilerServices.ClassConstructorRunner.CheckStaticClassConstructionReturnNonGCStaticBase(StaticClassConstructionContext*, IntPtr) + 0xd
2024-06-06 13:35:48 at sprf3s.sprd() + 0xc1
2024-06-06 13:35:48 at sprf3s.spra(String, String, Boolean, Boolean) + 0x4d9
2024-06-06 13:35:48 at sprf03.spra(sprauh) + 0x46e
2024-06-06 13:35:48 at spreck.sprb() + 0x1fb
2024-06-06 13:35:48 at sprecj.spra(PdfDocumentBase, String, Boolean, Boolean) + 0xd9
2024-06-06 13:35:48 at Spire.Pdf.PdfDocumentBase.SaveToHtml(String) + 0xb3
2024-06-06 13:35:48 at Spire.Pdf.AOT.NLPdfDocument.PdfDocument_SaveToFileFF(IntPtr, IntPtr, Int32, IntPtr) + 0x77
2024-06-06 13:35:48
2024-06-06 13:35:48 at ChildProcess.exithandler (node:child_process:419:12)
2024-06-06 13:35:48 at ChildProcess.emit (node:events:513:28)
2024-06-06 13:35:48 at maybeClose (node:internal/child_process:1091:16)
2024-06-06 13:35:48 at ChildProcess._handle.onexit (node:internal/child_process:302:5)

I would greatly appreciate any guidance or instructions you can provide to help me successfully convert the documents to HTML.

Please also note that I have placed the SkiaSharp graphics library in the root directory of the application, C:\usr\src\app.
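
The Dockerfile above downloads a file named libSkiaSharp.dylib. The `.dylib` extension is the macOS shared-library format, whereas Windows loads `.dll` files, so it may be worth confirming that the native library matches the container's operating system. Whether this is related to the exception reported here is not confirmed anywhere in this thread. A minimal sketch of such a check (the helper names are hypothetical, and the check looks only at the file extension, so versioned Linux names such as `libfoo.so.1` are not handled):

```python
import os
import sys


def expected_native_suffix(platform=sys.platform):
    """Return the shared-library suffix conventionally used on the given platform."""
    if platform.startswith("win"):
        return ".dll"
    if platform == "darwin":
        return ".dylib"
    return ".so"


def matches_platform(library_file, platform=sys.platform):
    """True when the file's extension matches the platform's shared-library suffix."""
    extension = os.path.splitext(library_file)[1].lower()
    return extension == expected_native_suffix(platform)


# "win32" is the value of sys.platform on Windows.
print(matches_platform("libSkiaSharp.dylib", platform="win32"))  # False
print(matches_platform("libSkiaSharp.dll", platform="win32"))    # True
```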

Thank you very much for your attention and assistance. I look forward to your prompt response.

petchi_y
 
Posts: 7
Joined: Tue May 14, 2024 5:56 am

Thu Jun 06, 2024 9:53 am

Hello,

Thanks for your inquiry.
Sorry, we don't currently have the environment you described, so we can't verify your issue right away. I will set up the environment, test your case as soon as possible, and report the results back to you. Thank you for your understanding.

Sincerely,
William
E-iceblue support team

William.Zhang
 
Posts: 732
Joined: Mon Dec 27, 2021 2:23 am

Tue Jun 11, 2024 8:04 am

Hello,

Thanks for your patience.
We have conducted further investigation and testing but were unable to reproduce your issue. Since we are not very familiar with Node.js, we removed the Node.js-related content from your Dockerfile and kept only the key parts. I have uploaded my test demo, the test documents, and the whl file of our product. Please download them and follow the steps below to test in your environment.
1. docker build -t myapp .
2. docker run -itd --name myapp myapp

If everything goes well, two converted HTML files will be generated after the container runs. You can use "docker cp myapp:pdftohtml.html ." and "docker cp myapp:exceltohtml.html ." to copy them out of the container. Looking forward to your feedback.

Sincerely,
William
E-iceblue support team

William.Zhang
 
Posts: 732
Joined: Mon Dec 27, 2021 2:23 am

Thu Jun 13, 2024 4:43 pm

Hi,

The Docker configuration above works as expected, both for the demo and for our actual Node.js application.
We are now able to convert both Excel and PDF files to HTML.

Thank you for your response & support.

petchi_y
 
Posts: 7
Joined: Tue May 14, 2024 5:56 am

Fri Jun 14, 2024 2:09 am

Hello,

Thanks for your feedback.
Glad to hear it. If you have any other questions, please feel free to write back.

Sincerely,
William
E-iceblue support team

William.Zhang
 
Posts: 732
Joined: Mon Dec 27, 2021 2:23 am
