I am working on a project that uses the Python editions of Spire.XLS and Spire.PDF (from Spire.Office) inside a Windows-based Docker container. I have run into difficulties converting Excel and PDF documents to HTML and would greatly appreciate your assistance.
The specifications of my environment are as follows:
Docker Container's Operating System : Windows
Python Version : 3.9.10
Spire.Office (Python) Version : 9.1.0
NodeJS Version : 18.16.0
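For reference, the Python-side details above could be collected from inside the container with a minimal sketch like the one below. The function name `describe_environment` is hypothetical, and the distribution name "Spire.Office" is an assumption; the real name is whatever `requirements.txt` lists.

```python
import platform
from importlib import metadata


def describe_environment(packages):
    """Return the OS, Python version, and the versions of the given packages."""
    info = {
        "os": platform.system(),              # "Windows" inside a Windows container
        "python": platform.python_version(),  # e.g. "3.9.10"
        "architecture": platform.machine(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed under this name.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    # "Spire.Office" is an assumed distribution name, used for illustration only.
    for key, value in describe_environment(["Spire.Office"]).items():
        print(f"{key}: {value}")
```

Running it with the same interpreter that executes the conversion scripts shows which Python installation and package versions the container actually uses.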
Every attempt to convert the Excel and PDF documents to HTML has failed. The following Dockerfile is used to build the Windows-based Docker image.
Code:
# Use Windows Server Core as the base image for the builder; Node.js is installed manually below
FROM mcr.microsoft.com/windows/servercore:ltsc2019 AS builder
ENV NODE_VERSION 18.16.0
ENV NODE_DOWNLOAD_URL https://nodejs.org/dist/v${NODE_VERSION}/node-v${NODE_VERSION}-win-x64.zip
RUN mkdir C:\nodejs
RUN powershell -Command \
Invoke-WebRequest -Uri %NODE_DOWNLOAD_URL% -OutFile nodejs.zip; \
Expand-Archive -Path nodejs.zip -DestinationPath C:\nodejs; \
Remove-Item -Force nodejs.zip
RUN setx /M PATH "%PATH%;C:\nodejs\node-v%NODE_VERSION%-win-x64"
USER ContainerAdministrator
# Set the working directory
WORKDIR /usr/src/app
# Copy package files and install dependencies
COPY package*.json ./
# Copy the rest of the application source code
COPY . .
RUN npm install
# Build the application and prune dev dependencies
RUN npm run build && npm prune --production
# Use Windows Server Core as the base image for the final container
FROM mcr.microsoft.com/windows/servercore:ltsc2019
# Set environment variable for production
ENV NODE_ENV=production
# Set the working directory
WORKDIR /usr/src/app
# Create a logs directory with appropriate permissions
RUN mkdir C:\usr\src\app\logs
# Install required tools and libraries
RUN powershell -Command \
Add-WindowsFeature Web-Server; \
Invoke-WebRequest -Uri https://aka.ms/vs/17/release/vs_BuildTools.exe -OutFile vs_buildtools.exe; \
Start-Process -FilePath vs_buildtools.exe -ArgumentList '--quiet --wait --norestart --add Microsoft.VisualStudio.Workload.VCTools' -NoNewWindow -Wait; \
Remove-Item -Force vs_buildtools.exe
# Copy dependencies and built application from the builder stage
COPY --from=builder /usr/src/app/node_modules ./node_modules
COPY --from=builder /usr/src/app/dist ./dist
COPY /pythonScript/ ./pythonScript/
# Debug step to list contents of the dist directory
RUN dir C:\usr\src\app\dist
# Install Python and dependencies
RUN powershell -Command \
Invoke-WebRequest -Uri https://www.python.org/ftp/python/3.9.10/python-3.9.10-amd64.exe -OutFile python-installer.exe; \
Start-Process python-installer.exe -ArgumentList '/quiet InstallAllUsers=1 PrependPath=1' -NoNewWindow -Wait; \
Remove-Item -Force python-installer.exe
RUN powershell -Command \
Invoke-WebRequest -Uri https://www.e-iceblue.com/downloads/lib/libSkiaSharp.dylib -OutFile libSkiaSharp.dylib;
# Install pip requirements
COPY --from=builder /usr/src/app/pythonScript/requirements.txt ./pythonScript/requirements.txt
RUN pip install --no-cache-dir -r pythonScript/requirements.txt
ENV NODE_VERSION 18.16.0
ENV NODE_DOWNLOAD_URL https://nodejs.org/dist/v${NODE_VERSION}/node-v${NODE_VERSION}-win-x64.zip
RUN mkdir C:\nodejs
RUN powershell -Command \
Invoke-WebRequest -Uri %NODE_DOWNLOAD_URL% -OutFile nodejs.zip; \
Expand-Archive -Path nodejs.zip -DestinationPath C:\nodejs; \
Remove-Item -Force nodejs.zip
RUN setx /M PATH "%PATH%;C:\nodejs\node-v%NODE_VERSION%-win-x64"
# Command to run the application
CMD ["node", "dist/main.js"]
The conversion code for each type is as follows:
Excel to HTML
Code:
import os
import shutil
import sys

from spire.xls import *
from spire.xls.common import *
from spire.pdf import *

outputFilePath = sys.argv[2] + '/' + os.path.splitext(os.path.basename(sys.argv[1]))[0] + '.HTML'
# Create a workbook
workbook = Workbook()
# Load an Excel document
workbook.LoadFromFile(sys.argv[1])
sheet_count = workbook.Worksheets.Count
for i in range(sheet_count):
    # try:
    sheet = workbook.Worksheets[i]
    if sheet.Visibility == WorksheetVisibility.Visible:
        fileName = sys.argv[2] + '/' + sheet.Name + '.html'
        row_count = sheet.Rows.Length
        col_count = sheet.Columns.Length
        # Collect the 1-based indices of hidden rows
        count = 1
        lst_hide_row = []
        while count <= row_count:
            if not sheet.IsRowVisible(count):
                # print(count)
                lst_hide_row.append(count)
            count = count + 1
        if len(lst_hide_row) > 0:
            # Delete from the bottom up so the remaining indices stay valid
            lst_hide_row = sorted(lst_hide_row, reverse=True)
            # print(lst_hide_row)
            for x in lst_hide_row:
                sheet.DeleteRow(x, 1)
        row_count = sheet.Rows.Length
        # print(row_count)
        # print(lst_hide_row)
        # Save the visible sheet as HTML (this is the call that fails)
        sheet.SaveToHtml(fileName)
    # except Exception as error:
    #     print("An error occurred:", type(error).__name__, "–", error)
    #     print("Sheet " + str(i + 1) + " not converted")
# Note: this is set after the sheets are saved, so it cannot affect the HTML files written above
workbook.ConverterSetting.SheetFitToPage = True
workbook.Dispose()
# os.remove(outputFilePath)
# Zip the output directory
shutil.make_archive(sys.argv[2], 'zip', sys.argv[2])
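To explain the hidden-row handling in the script above: the hidden row indices are sorted in descending order before deletion, because deleting a row only shifts the rows below it. The sketch below illustrates this with a plain Python list standing in for the worksheet; `delete_rows` is a hypothetical helper and does not use the Spire.XLS API.

```python
def delete_rows(rows, hidden_indices):
    """Delete the 1-based indices in hidden_indices from a copy of rows."""
    remaining = list(rows)
    # Highest index first: removing a row only shifts the rows below it,
    # so the smaller indices still point at the intended rows.
    for index in sorted(set(hidden_indices), reverse=True):
        del remaining[index - 1]
    return remaining


rows = ["header", "hidden A", "data 1", "hidden B", "data 2"]
print(delete_rows(rows, [2, 4]))  # ['header', 'data 1', 'data 2']
```

Deleting in ascending order instead would remove "hidden A" first, after which index 4 would point at "data 2" rather than "hidden B".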
Issues faced:
2024-06-06 13:33:19 Error: Command failed: python pythonScript/convertExcelToHTML.py "conversions/input/300528754100/temp 2.xlsm" "conversions/output/300528754100"
2024-06-06 13:33:19 Traceback (most recent call last):
2024-06-06 13:33:19 File "C:\usr\src\app\pythonScript\convertExcelToHTML.py", line 40, in <module>
2024-06-06 13:33:19 sheet.SaveToHtml(fileName)
2024-06-06 13:33:19 File "C:\Program Files\Python39\lib\site-packages\plum\function.py", line 642, in __call__
2024-06-06 13:33:19 return self.f(self.instance, *args, **kw_args)
2024-06-06 13:33:19 File "C:\Program Files\Python39\lib\site-packages\plum\function.py", line 592, in __call__
2024-06-06 13:33:19 return _convert(method(*args, **kw_args), return_type)
2024-06-06 13:33:19 File "C:\Program Files\Python39\lib\site-packages\spire\xls\XlsWorksheet.py", line 1114, in SaveToHtml
2024-06-06 13:33:19 CallCFunction(GetDllLibXls().XlsWorksheet_SaveToHtmlF, self.Ptr, filename)
2024-06-06 13:33:19 File "C:\Program Files\Python39\lib\site-packages\spire\xls\common\__init__.py", line 109, in CallCFunction
2024-06-06 13:33:19 raise SpireException(info)
2024-06-06 13:33:19 spire.xls.common.SpireException: TypeInitialization_Type_NoTypeAvailable: at System.Runtime.CompilerServices.ClassConstructorRunner.EnsureClassConstructorRun(StaticClassConstructionContext*) + 0x167
2024-06-06 13:33:19 at System.Runtime.CompilerServices.ClassConstructorRunner.CheckStaticClassConstructionReturnGCStaticBase(StaticClassConstructionContext*, Object) + 0xd
2024-06-06 13:33:19 at sprq9b..ctor(sprray, sprrca) + 0x1a
2024-06-06 13:33:19 at sprq9a.sprd(sprr8y) + 0xee
2024-06-06 13:33:19 at sprq9a.spra(Stream, ImageFormat, sprr8y) + 0x96
2024-06-06 13:33:19 at sprrrp.sprb(sprrt2) + 0x1dc
2024-06-06 13:33:19 at sprrrp.sprb(Stream, sprrt2, String, HTMLOptions) + 0xeb
2024-06-06 13:33:19 at Spire.Xls.Core.Spreadsheet.XlsWorksheet.SaveToHtml(String, HTMLOptions) + 0x2fb
2024-06-06 13:33:19 at Spire.Xls.AOT.NLXlsWorksheet.XlsWorksheet_SaveToHtmlF(IntPtr, IntPtr, IntPtr) + 0x73
2024-06-06 13:33:19
2024-06-06 13:33:19 at ChildProcess.exithandler (node:child_process:419:12)
2024-06-06 13:33:19 at ChildProcess.emit (node:events:513:28)
2024-06-06 13:33:19 at maybeClose (node:internal/child_process:1091:16)
2024-06-06 13:33:19 at ChildProcess._handle.onexit (node:internal/child_process:302:5)
PDF to HTML
Code:
import os
import shutil
import sys

from spire.pdf.common import *
from spire.pdf import *
outputFilePath = sys.argv[2] + '/' + os.path.splitext(os.path.basename(sys.argv[1]))[0] + '.html'
# Create an object of the PdfDocument class
document = PdfDocument()
# Load a PDF document
document.LoadFromFile(sys.argv[1])
# # document.ConvertOptions.SetPdfToHtmlOptions(False)
document.ConvertOptions.SetPdfToHtmlOptions(False, True, 1, False)
# Save to HTML
document.SaveToFile(outputFilePath, FileFormat.HTML)
document.Close()
shutil.make_archive(sys.argv[2], 'zip', sys.argv[2])
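Both scripts build the output file name from the two command-line arguments with string concatenation. A minimal sketch of the same path logic using `pathlib` is shown below; `build_output_path` is a hypothetical helper name and is not part of either script.

```python
from pathlib import Path


def build_output_path(input_file, output_dir, extension=".html"):
    """Return output_dir/<input file name without its extension><extension>."""
    return Path(output_dir) / (Path(input_file).stem + extension)


# Same arguments as in the failing PDF command.
print(build_output_path("conversions/input/451547683400/temp 1.pdf",
                        "conversions/output/451547683400"))
```

The file name component of the result is `temp 1.html`; the separator printed depends on the operating system.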
Issues faced:
2024-06-06 13:35:48 Error: Command failed: python pythonScript/convertPDFToHTML.py "conversions/input/451547683400/temp 1.pdf" "conversions/output/451547683400"
2024-06-06 13:35:48 Traceback (most recent call last):
2024-06-06 13:35:48 File "C:\usr\src\app\pythonScript\convertPDFToHTML.py", line 17, in <module>
2024-06-06 13:35:48 document.SaveToFile(outputFilePath, FileFormat.HTML)
2024-06-06 13:35:48 File "C:\Program Files\Python39\lib\site-packages\plum\function.py", line 642, in __call__
2024-06-06 13:35:48 return self.f(self.instance, *args, **kw_args)
2024-06-06 13:35:48 File "C:\Program Files\Python39\lib\site-packages\plum\function.py", line 592, in __call__
2024-06-06 13:35:48 return _convert(method(*args, **kw_args), return_type)
2024-06-06 13:35:48 File "C:\Program Files\Python39\lib\site-packages\spire\pdf\PdfDocument.py", line 287, in SaveToFile
2024-06-06 13:35:48 CallCFunction(GetDllLibPdf().PdfDocument_SaveToFileFF,self.Ptr, filename,enumfileFormat)
2024-06-06 13:35:48 File "C:\Program Files\Python39\lib\site-packages\spire\pdf\common\__init__.py", line 109, in CallCFunction
2024-06-06 13:35:48 raise SpireException(info)
2024-06-06 13:35:48 spire.pdf.common.SpireException: TypeInitialization_Type_NoTypeAvailable: at System.Runtime.CompilerServices.ClassConstructorRunner.EnsureClassConstructorRun(StaticClassConstructionContext*) + 0x167
2024-06-06 13:35:48 at System.Runtime.CompilerServices.ClassConstructorRunner.CheckStaticClassConstructionReturnNonGCStaticBase(StaticClassConstructionContext*, IntPtr) + 0xd
2024-06-06 13:35:48 at sprf3s.sprd() + 0xc1
2024-06-06 13:35:48 at sprf3s.spra(String, String, Boolean, Boolean) + 0x4d9
2024-06-06 13:35:48 at sprf03.spra(sprauh) + 0x46e
2024-06-06 13:35:48 at spreck.sprb() + 0x1fb
2024-06-06 13:35:48 at sprecj.spra(PdfDocumentBase, String, Boolean, Boolean) + 0xd9
2024-06-06 13:35:48 at Spire.Pdf.PdfDocumentBase.SaveToHtml(String) + 0xb3
2024-06-06 13:35:48 at Spire.Pdf.AOT.NLPdfDocument.PdfDocument_SaveToFileFF(IntPtr, IntPtr, Int32, IntPtr) + 0x77
2024-06-06 13:35:48
2024-06-06 13:35:48 at ChildProcess.exithandler (node:child_process:419:12)
2024-06-06 13:35:48 at ChildProcess.emit (node:events:513:28)
2024-06-06 13:35:48 at maybeClose (node:internal/child_process:1091:16)
2024-06-06 13:35:48 at ChildProcess._handle.onexit (node:internal/child_process:302:5)
I would greatly appreciate any guidance or instructions that would help me convert these documents to HTML successfully.
Please also note that I have placed the SkiaSharp graphics library in the root directory of the application, C:\usr\src\app.
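One thing I can check on my side: the Dockerfile downloads `libSkiaSharp.dylib`, and `.dylib` is the macOS shared-library extension, whereas Windows loads `.dll` files. Whether this mismatch is related to the `TypeInitialization_Type_NoTypeAvailable` error is an assumption I have not confirmed. The sketch below lists library files in a directory whose extension belongs to a different operating system; `find_mismatched_libraries` is a hypothetical helper, and the temporary directory only stands in for C:\usr\src\app.

```python
import platform
import tempfile
from pathlib import Path

# Conventional shared-library extensions per operating system.
NATIVE_EXTENSIONS = {"Windows": ".dll", "Linux": ".so", "Darwin": ".dylib"}


def find_mismatched_libraries(directory, system=None):
    """Return names of library files whose extension belongs to another OS."""
    system = system or platform.system()
    expected = NATIVE_EXTENSIONS.get(system)
    foreign = {ext for ext in NATIVE_EXTENSIONS.values() if ext != expected}
    return sorted(
        path.name
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in foreign
    )


# Demonstration with a temporary directory standing in for the application root.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "libSkiaSharp.dylib").touch()
    Path(tmp, "example.dll").touch()
    print(find_mismatched_libraries(tmp, system="Windows"))  # ['libSkiaSharp.dylib']
```

Inside the container, the same function could be pointed at the application root to see which native library files are actually present there.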
Thank you very much for your attention and assistance. I look forward to your prompt response.