I'm using Spire.PDF 12.5.8 in my .NET 8 AWS Lambda function to extract text from a PDF, as shown below.
var pdf = new PdfDocument();
pdf.LoadFromStream(stream);
if (pdf != null)
{
    var builder = new StringBuilder();
    var totalPages = pdf.Pages.Count;
    var options = new PdfTextExtractOptions
    {
        IsExtractAllText = true,
        IsShowHiddenText = false
    };
    for (int pageNo = 0; pageNo < totalPages; pageNo++)
    {
        var page = pdf.Pages[pageNo];
        var pdfTextExtractor = new PdfTextExtractor(page);
        var rawText = pdfTextExtractor.ExtractText(options);
        var textLines = rawText
            .Replace("copy", " ", StringComparison.InvariantCultureIgnoreCase)
            .Split("\r\n", StringSplitOptions.RemoveEmptyEntries);
        builder.AppendLine(string.Join("\r\n", textLines));
    }
    var extractedText = builder.ToString();
}
This code works as expected when I run it locally on Windows, but when I deploy it to AWS Lambda I get this error:
Exception 'The type initializer for 'Gdip' threw an exception.', Stack trace ' at System.Drawing.SafeNativeMethods.Gdip.GdipCreatePath(FillMode brushMode, IntPtr& path)
at System.Drawing.Drawing2D.GraphicsPath..ctor()
at spr届..ctor()
at spr뗀.㓡(spr椵 A_0, FillMode A_1)
at spr뗀.㾪()
at spr뗀.㾪()
at spr랶.㓡(spr樰 A_0)
at spr랶.㓡()
at spr냙.㺯()
at Spire.Pdf.Texts.PdfTextExtractor.㓡(PdfTextExtractOptions A_0)
at Spire.Pdf.Texts.PdfTextExtractor.ExtractText(PdfTextExtractOptions options)
at Lambdas.Services.PDFProcessingService.ProcessPDF(String message) in C:\source\app\Lambdas\Services\PDFProcessingService.cs:line 101
at Lambdas.Services.PDFProcessingService.ProcessPDF(String message) in C:\source\app\Lambdas\Services\PDFProcessingService.cs:line 119
at Lambdas.Functions.ProcessPDFInvoice(SQSEvent ev, ILambdaContext lambdaContext) in C:\source\app\Lambdas\Functions.cs:line 74', Source 'System.Drawing.Common'
I don't need anything image-related from the PDF; I only need to extract all of its text. I'm not sure why Spire.PDF calls into System.Drawing (GDI+) during text extraction. Could you please advise whether there is a way to extract the text without triggering this error in an AWS Lambda function (Linux environment)?
Regards,
Swathi