Newbie question - Spire.Doc documentation?

Mon Oct 14, 2024 4:56 pm

I'm a trial user of Spire modules.

In Spire.Doc, my code loops through the tables of a document section. How do I get each table's title?

Also, is there documentation for Spire.Doc that lists all the available objects (or items) and their usage (such as sample code showing how to set/get them)?

TIA

Tue Oct 15, 2024 6:29 am

Hello,

Thanks for your inquiry. You can refer to the following code to obtain the title of each table. Additionally, you can learn about Spire.Doc's API through this link (https://www.e-iceblue.com/misc/apireference.html). If you want to refer to examples, you can download 'Spire.Doc Pack' from this link (https://www.e-iceblue.com/Download/down ... t-now.html) or refer to this link (https://www.e-iceblue.com/Tutorials/Spi ... ntent.html). If you do not successfully obtain the title, please provide us with your input Word document for investigation. You can upload it here or send it to us via email ( [email protected] ). Thank you in advance.

Code:
    using System;
    using Spire.Doc;

    // Load the document and print the Title property of every table
    Document document = new Document(@"TableSample.docx");
    foreach (Section section in document.Sections)
    {
        foreach (DocumentObject documentObject in section.Body.ChildObjects)
        {
            if (documentObject is Table)
            {
                Table table = (documentObject as Table);
                Console.WriteLine(table.Title);
            }
        }
    }

Sincerely,
Amin
E-iceblue support team

Tue Oct 15, 2024 2:26 pm

Thank you for your response!

I forgot to mention that I'm using Python. When I tried the following test code, I got empty output -

Code:
    # Get a table
    table = section.Tables.get_Item(j)
    print(table.Title)

Did I do anything wrong?

Wed Oct 16, 2024 1:42 am

Hello,

Thanks for your feedback. I tested it using the following code and everything worked fine. If you are not using the latest version (Spire.Doc for Python Version:12.7.1), please update and try again. If the issue persists after the update, please provide us with your input Word document for investigation. You can upload it here or send it to us via email ( [email protected] ). Thank you in advance.

Code:
    from spire.doc import *
    from spire.doc.common import *

    # Create a Document object
    doc = Document()
    # Load input document
    doc.LoadFromFile("TableSample.docx")
    for s in range(doc.Sections.Count):
        # Get a section
        section = doc.Sections.get_Item(s)
        # Get tables from the section
        tables = section.Tables
        for i in range(0, tables.Count):
            print(tables.get_Item(i).Title)

Sincerely,
Amin
E-iceblue support team

Mon Oct 28, 2024 7:51 pm

Thank you for your previous response. I had an account (login) issue, so my reply was delayed.

I'm uploading a test document into which I threw some random content and tables.

Here are my challenges using your modules -

Thank you!

Tue Oct 29, 2024 7:29 am

Hello,

Thanks for your feedback. Based on the Word document you provided, I have adjusted the following code to show how to obtain the table titles. Additionally, because of the way Word documents are structured, we can only retrieve tables section by section and process them accordingly. Finally, do you mean that you want to merge the tables that were split into three parts in the original document into one Excel document? If I have misunderstood your requirements, please share more details. Thanks in advance.

Code:
    from spire.doc import *
    from spire.doc.common import *

    # Create a Document object
    doc = Document()
    # Load input document
    doc.LoadFromFile("tableName.docx")
    # Traverse every section in the document
    for i in range(doc.Sections.Count):
        section = doc.Sections.get_Item(i)
        # Collection of tables in the section
        tables = section.Tables
        # Traverse the collection of tables
        for index in range(tables.Count):
            # Position of the table among the child objects of the section body
            tableIndex = section.Body.ChildObjects.IndexOf(tables[index])
            # A table that is the first child object has no preceding paragraph
            if tableIndex < 1:
                continue
            # Retrieve the object immediately before the table
            documentObject = section.Body.ChildObjects.get_Item(tableIndex-1)
            if isinstance(documentObject, Paragraph):
                paragraph = Paragraph(documentObject)
                # Print the paragraph text, which serves as the table title
                print(paragraph.Text)
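
The idea in the code above (treat the paragraph immediately before a table as its title) can be illustrated without Spire.Doc at all. The following is a minimal sketch in plain Python that models a section body as a list of `(kind, text)` tuples. The tuple layout and the `title_for_table` helper are hypothetical and not part of the Spire API. It also shows why a guard is useful when a table is the very first object in the body, where there is no preceding paragraph.

```python
# Hypothetical model of a section body: each child is (kind, text).
body = [
    ("paragraph", "Table A (1/3)"),
    ("table", "<table A part 1>"),
    ("table", "<table with no caption>"),
]

def title_for_table(children, table_index):
    """Return the text of the paragraph directly before the table, or None."""
    if table_index <= 0:
        return None  # nothing precedes the first child
    kind, text = children[table_index - 1]
    return text if kind == "paragraph" else None

for i, (kind, _) in enumerate(body):
    if kind == "table":
        print(i, title_for_table(body, i))
```

A table that directly follows another table gets no title here, which matches what the thread's code would do when the preceding object is not a paragraph.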

Sincerely,
Amin
E-iceblue support team

Tue Oct 29, 2024 2:45 pm

Thank you Amin for your prompt response!

It's interesting to use the 'paragraph' object to get the table title; good to know.

Yes, I would like to combine those adjacent tables marked 1/3, 2/3, and 3/3.

I'm attaching a new sample doc (with a single table added), and a screenshot of the tables I would like to have in the output Excel workbook. In this new doc, I'm expecting to get 8 tables (marked in the screenshot).

Could you provide a code snippet to do that?

Thank you!

Wed Oct 30, 2024 9:42 am

Hello,

Thanks for your inquiry. You can refer to the following code, which uses Spire.Doc and Spire.XLS to write data from Word tables into an Excel file. If you have any further questions, please feel free to provide feedback.

Code:
    doc = Document()
    doc.LoadFromFile("tableName.docx")
    # Create a workbook object
    wb = Workbook()
    # Clear all worksheets in the workbook
    wb.Worksheets.Clear()
    # Create an empty worksheet in the workbook
    worksheet = wb.CreateEmptySheet()
    row = 1
    column = 1

    def ExportTableInExcel(worksheet, start_row, table):
        row = start_row
        for index in range(table.Rows.Count):
            column = 1
            for k in range(table.Rows[index].Cells.Count):
                tb_cell = table.Rows[index].Cells[k]
                CopyContentInTable(tb_cell, worksheet.Range[row, column])
                column += 1
            row += 1
        return row

    def CopyContentInTable(tbCell, cell):
        newPara = Paragraph(tbCell.Document)
        # Traverse sub objects of table cells
        for i in range(tbCell.Count):
            # Retrieve the current sub object
            documentObject = tbCell.ChildObjects[i]
            # If the sub object is of paragraph type
            if isinstance(documentObject, Paragraph):
                paragraph = Paragraph(documentObject)
                for cObj in range(paragraph.ChildObjects.Count):
                    # Clone and add sub objects to a new paragraph
                    newPara.ChildObjects.Add(paragraph.ChildObjects.get_Item(cObj).Clone())
                # If it's not the last child object, add a line break
                if i < tbCell.ChildObjects.Count - 1:
                    newPara.AppendText("\n")
        CopyText(cell, newPara)

    def CopyText(cell, paragraph):
        richText = cell.RichText
        richText.Text = paragraph.Text

    def getTitle(table):
        tableIndex = section.Body.ChildObjects.IndexOf(table)
        documentObject = section.Body.ChildObjects.get_Item(tableIndex-1)
        if isinstance(documentObject, Paragraph):
            paragraph = Paragraph(documentObject)
            return paragraph.Text

    # Determine whether to add a new sheet
    def groupTitle(tableName):
        return tableName.endswith("(1/3)")

    # Traverse every section in the document
    for i in range(doc.Sections.Count):
        section = doc.Sections.get_Item(i)
        tables = section.Tables
        for index in range(tables.Count):
            # Get the table
            table = tables[index] if isinstance(tables[index], Table) else None
            title = getTitle(table)
            # A title ending in "(1/3)" starts a new worksheet
            if groupTitle(title):
                worksheet = wb.CreateEmptySheet()
                row = 1
                column = 1
            if index > 0:
                currentRow = ExportTableInExcel(worksheet, row, table)
                # Update row counter
                row = currentRow
            # Automatically adjust the height of all rows in the worksheet
            worksheet.AllocatedRange.AutoFitRows()
            # Automatically adjust the width of all columns in the worksheet
            worksheet.AllocatedRange.AutoFitColumns()
            # Set automatic text wrapping for all cells in the worksheet
            worksheet.AllocatedRange.IsWrapText = True
    wb.SaveToFile("Output/WordToExcel.xlsx", ExcelVersion.Version2013)

Sincerely,
Amin
E-iceblue support team

Wed Oct 30, 2024 3:01 pm

Thanks again Amin for your response!

The code works fine except -
1) We want the 3 tables (1/3, 2/3, and 3/3) to be combined horizontally (instead of vertically), i.e. ADD the COLUMNS (not ROWS) of the 3 tables together. In other words, if the 3 tables each have 3 columns, the combined table should end up with 9 columns (and the rows remain the same as in each of the original tables).
2) In the output combined tables, there are duplicated header rows (they come from the 'continued' sections of the same table). It would be great if the duplicated ones could be removed. See the attached 'duplicated_header_rows'.
3) The first sheet ('Sheet4') actually contains 2 tables from the doc (the 1st table has just one row). I tried a little to separate them in the code, but couldn't figure it out. If you can fix it, that would be great. See the attached 'incorrect_tables'.
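
Point 1 above amounts to joining tables side by side rather than stacking them. Here is a minimal sketch in plain Python, assuming each table has already been read into a list of rows (each row a list of cell strings). `join_horizontally` is a hypothetical helper, not a Spire function, and shorter tables are padded with empty cells so the rows stay aligned.

```python
def join_horizontally(tables):
    """Place tables side by side: row i of the result is row i of every table."""
    height = max((len(t) for t in tables), default=0)
    combined = []
    for r in range(height):
        row = []
        for t in tables:
            width = max((len(x) for x in t), default=0)
            cells = t[r] if r < len(t) else []
            # Pad so every table always contributes `width` columns.
            row.extend(cells + [""] * (width - len(cells)))
        combined.append(row)
    return combined

part1 = [["ID", "Name"], ["1", "foo"]]
part2 = [["Version"], ["1.0"]]
part3 = [["Owner"], ["ann"]]
print(join_horizontally([part1, part2, part3]))
```

Each combined row could then be written out cell by cell, for instance with the `worksheet.Range[row, column]` indexing used in the export code earlier in this thread.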

Thank you!

Wed Oct 30, 2024 3:10 pm

To elaborate a little more -

To avoid duplicated header rows, the code should only pick the header row when the table title (or name) contains '1/3', '2/3', or '3/3' (the other way is to skip header row if the table title/name contains "continued from previous page").
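
The second option (skip the header row when the title says the table is continued) can be sketched as follows. This is plain Python on lists of rows. The marker string comes from the post above, while the `rows_to_export` helper and the assumption that the header is exactly the first row are illustrative only.

```python
CONTINUED_MARKER = "continued from previous page"

def rows_to_export(title, rows):
    """Drop the first (header) row of a table whose title marks it as continued."""
    if CONTINUED_MARKER in title.lower():
        return rows[1:]
    return rows

first = rows_to_export("Table A (1/3)", [["ID", "Name"], ["1", "foo"]])
cont = rows_to_export("Table A (1/3) - continued from previous page",
                      [["ID", "Name"], ["2", "bar"]])
print(first + cont)
```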

Wed Oct 30, 2024 3:28 pm

I just noticed that one column ('Displayed Version') in the doc lost 'Displayed' from its name after being exported to the spreadsheet. See the attached screenshot.

Thanks!

Fri Nov 01, 2024 1:59 am

Hello,

Thanks for your feedback. I have adjusted the code according to your requirements, please refer to the attachment. If you have any further questions, please feel free to provide feedback.

Sincerely,
Amin
E-iceblue support team

Fri Nov 01, 2024 2:35 pm

Hi Amin,

Thank you for the updated code!

It's my fault that I didn't explain clearly - the broken tables in 1/3 (i.e. table title with 1/3 AND any tables after it with '-continued from previous page', before the table of 2/3) should be added vertically - to the rows, I'm attaching the screenshots showing the adjustment.
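
The combination order described above (continued tables stacked under their part, then the parts placed side by side) can be sketched in plain Python. This is only an illustration under stated assumptions: titles carry a marker such as "(1/3)" as in the earlier `endswith("(1/3)")` check, continued tables repeat the header as their first row, and all parts of one table end up with the same number of rows. `group_tables` and `merge` are hypothetical helpers, not Spire functions.

```python
import re

PART = re.compile(r"\((\d+)/(\d+)\)")
CONTINUED = "continued from previous page"

def group_tables(titled_tables):
    """Group (title, rows) pairs into logical tables, each a list of parts."""
    groups = []
    for title, rows in titled_tables:
        if CONTINUED in title.lower() and groups:
            # Continuation: stack under the current part, minus its repeated header.
            groups[-1][-1].extend(rows[1:])
            continue
        m = PART.search(title)
        if m and int(m.group(1)) > 1 and groups:
            groups[-1].append(list(rows))   # part 2/3, 3/3: a new column block
        else:
            groups.append([list(rows)])     # part 1/N or a standalone table
    return groups

def merge(parts):
    """Join parts side by side; assumes every part has the same number of rows."""
    return [[c for part_row in rows for c in part_row] for rows in zip(*parts)]

tables = [
    ("Summary", [["Total", "3"]]),
    ("Items (1/2)", [["ID"], ["1"]]),
    ("Items (1/2) - continued from previous page", [["ID"], ["2"]]),
    ("Items (2/2)", [["Name"], ["foo"]]),
    ("Items (2/2) - continued from previous page", [["Name"], ["bar"]]),
]
result = [merge(g) for g in group_tables(tables)]
print(result)
```

Each entry of `result` corresponds to one output sheet. Because `zip` stops at the shortest part, parts of unequal length would silently lose rows, so a real implementation would want to check or pad them.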

If you can help update the code, that will be great!

Thank you!

Fri Nov 01, 2024 3:40 pm

Also, I tried to find more technical documentation for Spire products but had no luck.

For example, you created an empty sheet with the following code. I would like to assign the original table name as the sheet name, but don't know how.

Code:
    worksheet = wb.CreateEmptySheet()

I hope it could be as easy as -

Code:
    worksheet = wb.CreateEmptySheet('sheet_name')

or

Code:
    worksheet = wb.CreateEmptySheet().Name('sheet_name')

but I can't find any technical documentation to refer to.
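
Whichever call ends up setting the name, a table title may not be a valid Excel sheet name as it stands: Excel limits sheet names to 31 characters and disallows the characters `\ / ? * [ ] :`, so a title such as "Items (1/3)" would need adjusting. Below is a minimal sketch of a sanitizer in plain Python. `safe_sheet_name` is a hypothetical helper, and the commented-out last line assumes the worksheet object exposes a writable `Name` property, which should be checked against the API reference.

```python
import re

def safe_sheet_name(title, used=()):
    """Turn a table title into a valid, unique Excel sheet name."""
    # Replace characters Excel rejects, then trim spaces and edge apostrophes.
    name = re.sub(r"[\\/?*\[\]:]", "_", title).strip().strip("'")
    name = name[:31] or "Sheet"
    base, n = name, 2
    while name in used:
        # Append a numeric suffix while staying within 31 characters.
        suffix = f"_{n}"
        name = base[:31 - len(suffix)] + suffix
        n += 1
    return name

print(safe_sheet_name("Items (1/3)"))
# worksheet.Name = safe_sheet_name(title)  # assumption: writable Name property
```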

Thank you!

Tue Nov 05, 2024 2:52 am

Hello,

Thanks for your inquiry. According to your requirements, I have attached the adjusted code for your reference. Additionally, you can refer to the following technical documents and feel free to provide feedback if you have any further questions.

Sincerely,
Amin
E-iceblue support team