Spire.Doc is a professional Word .NET library specifically designed for developers to create, read, write, convert and print Word document files. Get free and professional technical support for Spire.Doc for .NET, Java, Android, C++, Python.

Wed Jul 29, 2020 2:47 pm

Hello,

It looks like the library leaks memory. I am testing simultaneous document rendering on about 2000 Word documents, and memory keeps growing. I am using simple code like this:

using (Document document = new Document())
{
    document.LoadFromFile(filePath);

    using (System.Drawing.Image img = document.SaveToImages(0, ImageType.Metafile))
    {
        img.Save(targetFile, System.Drawing.Imaging.ImageFormat.Png);
    }
}

I also regularly call

GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
GC.WaitForPendingFinalizers();

However, this does not help, and memory grows steadily as more rendering threads are added to the app.

YuriGubanov
 
Posts: 6
Joined: Wed Jul 29, 2020 8:55 am

Thu Jul 30, 2020 10:30 am

Hello Yuri,

Thanks for your post.
We keep some static read-only objects, such as font data. When the first document is loaded, the retrieved font data is placed in memory so that other documents can use it directly without spending more time reading it from the system.

In your code, only the first page of the document is converted, and only the first image object is disposed. Please note that a Word document has a flow structure with no concept of pages. Even if you convert only the first page of a 10-page document to an image, our product parses and computes the entire document content to determine the content of each page, so 10 image objects are placed in memory. I suggest adjusting your code to convert the entire document and to dispose each image after it is saved to a file.
Sample code:
using (Document document = new Document())
{
    using (Stream stream = File.OpenRead(filePath))
    {
        document.LoadFromStream(stream, FileFormat.Auto);
    }

    // Base name for the output files; each page gets its own index suffix
    // so that later pages do not overwrite earlier ones.
    string targetBase = Path.GetFileNameWithoutExtension(filePath);

    // Render every page, then save and dispose each image in turn.
    System.Drawing.Image[] images = document.SaveToImages(ImageType.Metafile);
    for (int i = 0; i < images.Length; i++)
    {
        images[i].Save(targetBase + "_" + i + ".png", System.Drawing.Imaging.ImageFormat.Png);
        images[i].Dispose();
    }
    images = null;
}
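One gap in a save-then-dispose loop is that an exception thrown by Save leaves the remaining page images undisposed. A minimal sketch of a more defensive variant follows; the class name PageImageSaver and the method SaveAndDisposeAll are hypothetical helpers made up for illustration, not part of Spire.Doc, and the sketch relies only on System.Drawing.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;

// Hypothetical helper (not part of Spire.Doc): saves every page image and
// guarantees that all of them are disposed, even if one Save call throws.
public static class PageImageSaver
{
    // Writes "<baseName>_<index>.png" for each image and returns the number
    // of files written before any failure.
    public static int SaveAndDisposeAll(Image[] images, string baseName)
    {
        int saved = 0;
        try
        {
            for (int i = 0; i < images.Length; i++)
            {
                images[i].Save(baseName + "_" + i + ".png", ImageFormat.Png);
                saved++;
            }
        }
        finally
        {
            // Runs on success and on failure, so no image is left undisposed.
            foreach (Image image in images)
            {
                if (image != null)
                {
                    image.Dispose();
                }
            }
        }
        return saved;
    }
}
```

Assuming SaveToImages(ImageType.Metafile) returns the page array as in the sample above, the loop could then be replaced by a single call such as PageImageSaver.SaveAndDisposeAll(document.SaveToImages(ImageType.Metafile), baseName).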

I have attached my test result, Memory.png. Here is the list of DLLs I used:
Spire.Doc.dll 8.7.5.4046
Spire.Pdf.dll 6.7.6.2046
Spire.License.dll 1.3.8.46


Sincerely,
Amy
E-iceblue support team

amy.zhao
 
Posts: 2774
Joined: Wed Jun 27, 2012 8:50 am

Thu Jul 30, 2020 11:18 am

1. A library method should NOT have side effects such as failing to release memory. I am only getting the first page and I am releasing it, so why does the library hold references to objects that I do not reference?
2. Dispose must free all unreferenced resources rather than hold them, which is especially important in 32-bit applications like your library.
3. If you generate all images at once, that is also an unwanted side effect. Imagine a 1000-page document of which I need only the first page. First, your library will crash with OutOfMemory even if I ask for only the first page image. Second, even if it doesn't, I will have to wait until the whole document is converted! This is a very illogical side effect, given that your library explicitly asks for a page number.
4. Finally, memory is leaked even without converting the document to a picture.

YuriGubanov
 

Thu Jul 30, 2020 12:01 pm

Hi,

As I understand it, a memory leak means that memory keeps increasing as the program runs. In my test result (attached Memory.png), you can see that this is not the case: memory is released regularly (I don't call GC.Collect() explicitly).

I converted your document 1000 times and no problem occurred. In addition, our products support both 64-bit and 32-bit. Creating a 32-bit application is not recommended if you are working with a lot of documents.

Because of the structure of Word documents, we have to do this. Suppose you need to convert the second page or some other page: how would we determine the content of that page without computing the entire document and isolating each page's content?

When the document is loaded, the static read-only objects are generated. These objects are just necessary data that all documents can share. In general, they take up very little memory.

Sincerely,
Amy
E-iceblue support team

amy.zhao

Thu Jul 30, 2020 12:16 pm

You cannot convert the second page of a document that has been disposed, so that is a strange point. If a document is disposed, you have to delete whatever objects you generated for its processing.

In my experiments, just loading and immediately disposing random documents from my own drive consumes more than 2 GB for 1200 documents (all different), and this can in no way be called "very little memory".
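A load-and-dispose experiment like this can be made easier to compare between machines by measuring the managed heap before and after the loop. The following is only a sketch: HeapProbe and MeasureManagedGrowth are hypothetical names, and GC.GetTotalMemory reports managed memory only, so native or GDI+ allocations made by a library would not show up in the result.

```csharp
using System;

// Hypothetical probe: reports how much the managed heap grew after running
// a piece of work repeatedly. Native allocations are not counted.
public static class HeapProbe
{
    public static long MeasureManagedGrowth(Action work, int iterations)
    {
        // Passing true forces a full collection first, so the baseline
        // excludes garbage that is already unreachable.
        long before = GC.GetTotalMemory(true);

        for (int i = 0; i < iterations; i++)
        {
            work();
        }

        // Collect again so that only memory that is still reachable counts.
        long after = GC.GetTotalMemory(true);
        return after - before;
    }
}
```

As a usage example under the same assumptions, the work delegate could wrap the load-and-dispose step from this thread, for instance () => { using (Document d = new Document()) { d.LoadFromFile(path); } }, with path being any test file. A result that keeps rising as the iteration count rises would point to retained managed objects rather than pending garbage.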

YuriGubanov
 

Fri Jul 31, 2020 5:36 am

Hi Yuri,

Regarding this point:
"Because of the structure of Word documents, we have to do this. Suppose you need to convert the second page or some other page: how would we determine the content of that page without computing the entire document and isolating each page's content?"

It seems to have caused some misunderstanding, and I am sorry for that. What I mean is that, because a Word document has a flow structure without a page structure, when performing the conversion we have to compute the content of the entire document, isolate the content of each page, and then keep each page's content as an object in memory.
This is just like opening a Word document in MS Word: MS Word also computes the entire document's content and paginates it so that we can see the content of each page.
As for your statement "You cannot convert second page of a document which is disposed", this is absolutely true. We certainly can't use a disposed document for conversion or any other document operation.

I ran several tests with your sample.docx. Memory always stays at about 69 MB. I have attached my test result and test code.
In addition, for a 64-bit application, the target platform needs to be set to x64.
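Whether a test program really runs as a 64-bit process can be checked at run time instead of being inferred from project settings. The sketch below uses only standard .NET properties; the class name BitnessCheck is made up for illustration. Note that a project built as Any CPU with the "Prefer 32-bit" option may still run as a 32-bit process on a 64-bit OS.

```csharp
using System;

// Hypothetical helper: describes the bitness of the current process.
public static class BitnessCheck
{
    public static string Describe()
    {
        // IntPtr.Size is 8 in a 64-bit process and 4 in a 32-bit process.
        return string.Format(
            "64-bit process: {0}, 64-bit OS: {1}, pointer size: {2} bytes",
            Environment.Is64BitProcess,
            Environment.Is64BitOperatingSystem,
            IntPtr.Size);
    }
}
```

Printing BitnessCheck.Describe() at startup of the test application would show which address-space limit applies to the measurements discussed here.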

My test environment is Windows Server 2016 64-bit with 4 GB RAM and Visual Studio Enterprise 2019 v16.4.2. I used the Spire DLLs from Spire.Office 5.7.3, https://www.e-iceblue.com/downloads/hot ... _5.7.3.zip.
Could you please tell me your environment?

Sincerely,
Amy
E-iceblue support team

amy.zhao

Fri Jul 31, 2020 10:03 am

I have shared my test project with you, along with a video that clearly shows memory consumption of up to 3 GB. I can't post it here, but I hope you can test it on your side.

YuriGubanov
 

Fri Jul 31, 2020 10:41 am

Hi Yuri,

I have received your project. Thank you very much for providing it. I will download it, run some tests, and get back to you next Monday.

Sincerely,
Amy
E-iceblue support team

amy.zhao

Mon Aug 03, 2020 11:31 am

Hi Yuri,

Sorry for the delayed response. Due to our poor network, downloading your files was very slow. I have successfully downloaded project.rar and spire_memory_leaks.wmv.
Unfortunately, we cannot play spire_memory_leaks.wmv here (I tried Windows Media Player and other players, but none of them can play it). Could you please send it again?
I ran your project and could see the high memory consumption; my result is attached. I have forwarded the issue to our Dev team for further investigation and will let you know once there is any news.

Sincerely,
Amy
E-iceblue support team

amy.zhao

Mon Aug 03, 2020 11:48 am

Since you have reproduced the leak, the video is no longer needed. I hope the team can fix the leak.

YuriGubanov
 

Tue Aug 04, 2020 11:07 am

Hi Yuri,

Thanks for the detailed information you sent by email.
We have repeatedly verified how objects are disposed. Please rest assured that the Dispose method of Document closes and disposes all objects except those marked as static read-only.

Our developer has finished investigating the high memory consumption when running your project.rar. Memory is consumed primarily by voina-i-mir.docx. The document has 1918 pages with a lot of data. When we load a document, we need to expand all of its elements (structure objects, formatting, etc.) in order to parse them and store them in memory, so memory consumption increases sharply in a short time; this document consumes about 2 GB of memory. After the using block exits, the document data is released from memory. I have attached my test result; please have a look at included-voina-i-mir.docx.zip. Because the test runs in multiple threads, the reduction in memory consumption is not obvious, but memory consumption does decrease and does not grow indefinitely.

When I deleted voina-i-mir.docx and tested only the other five files, memory consumption was low; see the attached not included -voina-i-mir.docx.zip.

I also discussed improving the memory consumption for voina-i-mir.docx with our developer. With our product's current model structure, significant optimizations are hard to make. We would need to redesign the model structure to get a significant improvement, and because of its complexity and difficulty this cannot be done in a short time. I will inform you once it is finished.

Sincerely,
Amy
E-iceblue support team

amy.zhao

Tue Aug 04, 2020 11:22 am

Thanks, Amy. It is sad that the document causes trouble. In my tests I see similar issues when I just open multiple documents in multiple threads: it eventually leads to OutOfMemory. I hope you can improve your model structure to handle that.
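Until the library's memory footprint per document shrinks, one possible mitigation on the application side is to cap how many documents are processed at the same time, so that peak memory is bounded by a few documents rather than by the thread count. This is a sketch and not a fix for the underlying issue; BoundedRunner, ProcessAll, and the processFile callback are hypothetical names, and only the standard Parallel.ForEach API is used.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: processes files in parallel, but never more than
// maxParallel of them at once.
public static class BoundedRunner
{
    public static void ProcessAll(IEnumerable<string> paths, Action<string> processFile, int maxParallel)
    {
        if (maxParallel < 1)
        {
            throw new ArgumentOutOfRangeException("maxParallel");
        }

        // MaxDegreeOfParallelism limits how many callbacks run concurrently.
        ParallelOptions options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        // Blocks until every path has been processed.
        Parallel.ForEach(paths, options, processFile);
    }
}
```

The callback would contain the per-document work, for example loading a Document inside a using block as in the first post. A very large file such as the 1918-page document mentioned above could still exhaust a 32-bit address space on its own, so the cap only limits how many such loads overlap.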

YuriGubanov
 

Wed Aug 05, 2020 7:59 am

Hi Yuri,

You're welcome. I will get back to you immediately once there is any good news.

Sincerely,
Amy
E-iceblue support team

amy.zhao
