Knowledgebase (2143)
While Word is a powerful tool for creating and formatting documents, it is not optimized for advanced data management and analysis. In contrast, Excel excels at handling data in tabular form, allowing users to perform calculations, create charts, and conduct thorough data analysis.
Generally, converting complex Word documents into Excel spreadsheets is not advisable, as Excel may struggle to preserve the original layout. However, if your Word document primarily consists of tables, converting it to Excel can be highly beneficial. This transformation unlocks Excel's advanced functions, formulas, and visualization tools, enabling you to organize your data more effectively and improve your reporting and decision-making capabilities. In this article, we will focus specifically on how to convert this kind of Word documents to Excel in Python using Spire.Office for Python.
Install Spire.Office for Python
This scenario requires Spire.Office for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.
pip install Spire.Office
Convert Word to Excel in Python
This process uses two libraries in the Spire.Office for Python package. They're Spire.XLS for Python and Spire.Doc for Python. The former is used to read and extract content from a Word document, and the latter is used to create an Excel document and write data in specific cells. To make this code example easy to understand, we have defined the following three custom methods that handle specific tasks:
- ExportTableInExcel() - Export data from a Word table to specified Excel cells.
- CopyContentInTable() - Copy content from a table cell in Word to an Excel cell.
- CopyTextAndStyle() - Copy text with formatting from a Word paragraph to an Excel cell.
The following steps demonstrate how to export data from an entire Word document to an Excel worksheet using Spire.Office for Python.
- Create a Document object to load a Word file.
- Create a Worbbook object and add a worksheet named "WordToExcel" to it.
- Traverse through all the sections in the Word document and all the document objects under a certain section, and then determine if a document object is a paragraph or a table.
- If the document object is a paragraph, write the paragraph in a specified cell in Excel using CoypTextAndStyle() method.
- If the document object is a table, export the table data from Word to Excel cells using ExportTableInExcel() method.
- Auto fit the row height and column width in Excel so that the data within a cell will not exceed the bound of the cell.
- Save the workbook to an Excel file using Workbook.SaveToFile() method.
- Python
from spire.xls import * from spire.doc import * # Export data from Word table to Excel cells def ExportTableInExcel(worksheet, row, table): for rowIndex in range(len(table.Rows)): tbRow = table.Rows[rowIndex] column = 1 for cellIndex in range(len(tbRow.Cells)): tbCell = tbRow.Cells[cellIndex] cell = worksheet.Range[row, column] cell.BorderAround() CopyContentInTable(worksheet, tbCell, cell) column += 1 row += 1 return row # Copy content from a Word table cell to an Excel cell def CopyContentInTable(worksheet, tbCell, cell): newPara = Paragraph(tbCell.Document) for i in range(len(tbCell.ChildObjects)): documentObject = tbCell.ChildObjects[i] if isinstance(documentObject, Paragraph): paragraph = documentObject for cObj in range(len(paragraph.ChildObjects)): newPara.ChildObjects.Add(paragraph.ChildObjects[cObj].Clone()) if i < len(tbCell.ChildObjects) - 1: newPara.AppendText("\n") CopyTextAndStyle(worksheet, cell, newPara) # Copy text and style of a paragraph to a cell def CopyTextAndStyle(worksheet, cell, paragraph): richText = cell.RichText richText.Text = paragraph.Text startIndex = 0 for documentObject in range(len(paragraph.ChildObjects)): documentObject = paragraph.ChildObjects[documentObject] if isinstance(documentObject, TextRange): textRange = documentObject fontName = textRange.CharacterFormat.FontName isBold = textRange.CharacterFormat.Bold textColor = textRange.CharacterFormat.TextColor fontSize = textRange.CharacterFormat.FontSize textRangeText = textRange.Text strLength = len(textRangeText) font = worksheet.Workbook.CreateFont() font.Color = textColor font.IsBold = isBold font.Size = fontSize font.FontName = fontName endIndex = startIndex + strLength richText.SetFont(startIndex, endIndex, font) startIndex += strLength if isinstance(documentObject, DocPicture): picture = documentObject worksheet.Pictures.Add(cell.Row, cell.Column, picture.Image) worksheet.SetRowHeightInPixels(cell.Row, 1, picture.Image.Height) if paragraph.Format.HorizontalAlignment == HorizontalAlignment.Left: cell.Style.HorizontalAlignment = HorizontalAlignType.Left elif paragraph.Format.HorizontalAlignment == HorizontalAlignment.Center: cell.Style.HorizontalAlignment = HorizontalAlignType.Center elif paragraph.Format.HorizontalAlignment == HorizontalAlignment.Right: cell.Style.HorizontalAlignment = HorizontalAlignType.Right # Create a Document object doc = Document() # Load a Word file doc.LoadFromFile("C:/Users/Administrator/Desktop/Invoice.docx") # Create a Workbook object wb = Workbook() # Remove the default worksheets wb.Worksheets.Clear() # Create a worksheet named "WordToExcel" worksheet = wb.CreateEmptySheet("WordToExcel") row = 1 column = 1 # Loop through the sections in the Word document for sec_index in range(doc.Sections.Count): section = doc.Sections[sec_index] # Loop through the document object under a certain section for obj_index in range(section.Body.ChildObjects.Count): documentObject = section.Body.ChildObjects[obj_index] # Determine if the object is a paragraph if isinstance(documentObject, Paragraph): cell = worksheet.Range[row, column] paragraph = documentObject # Copy paragraph from Word to a specific cell CopyTextAndStyle(worksheet, cell, paragraph) row += 1 # Determine if the object is a table if isinstance(documentObject, Table): table = documentObject # Export table data from Word to Excel currentRow = ExportTableInExcel(worksheet, row, table) row = currentRow # Auto fit row height and column width worksheet.AllocatedRange.AutoFitRows() worksheet.AllocatedRange.AutoFitColumns() # Wrap text in cells worksheet.AllocatedRange.IsWrapText = True # Save the workbook to an Excel file wb.SaveToFile("WordToExcel.xlsx", ExcelVersion.Version2013) wb.Dispose() doc.Dispose()
Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
C#: Copy Rows, Columns and Cells in Excel with Formatting
2024-12-12 07:27:19 Written by support iceblueWhen working with Excel in C#, copying rows, columns, and cells can feel like a simple task, but it often comes with a catch—how to keep the formatting intact. Whether you’re organizing data for a report, creating a presentation, or just trying to keep your spreadsheet looking sharp, maintaining the original look is crucial. In this article, we will demonstrate the methods to copy rows, columns, and cells in Excel while preserving the original formatting in C# using Spire.XLS for .NET.
- Copy Rows in Excel with Formatting in C#
- Copy Columns in Excel with Formatting in C#
- Copy Cells in Excel with Formatting in C#
Install Spire.XLS for .NET
To begin with, you need to add the DLL files included in the Spire.XLS for.NET package as references in your .NET project. The DLLs files can be either downloaded from this link or installed via NuGet.
PM> Install-Package Spire.XLS
Copy Rows in Excel with Formatting in C#
Copying rows in Excel while preserving their formatting can be efficiently achieved using the Worksheet.CopyRow(CellRange sourceRow, Worksheet destSheet, int destRowIndex, CopyRangeOptions copyOptions) method. This method enables you to duplicate rows either within the same worksheet or across different worksheets with precision. Additionally, you can control the copying behavior, such as copying all formatting, conditional formatting, data validations, styles, or even just the formula values, through the CopyRangeOptions parameter.
The following steps explain how to copy rows across different worksheets with formatting using Spire.XLS for .NET.
- Create an object of the Workbook class.
- Load an Excel file using the Workbook.LoadFromFile() method.
- Get the source worksheet and the destination worksheet using the Workbook.Worksheets[index] property.
- Get the desired row that you want to copy using the Worksheet.Rows[index] property.
- Copy the row and its formatting from the source worksheet to the destination worksheet using the Worksheet.CopyRow(CellRange sourceRow, Worksheet destSheet, int destRowIndex, CopyRangeOptions copyOptions) method.
- Copy the column widths of cells in the source row to the corresponding cells in the destination row.
- Save the workbook to a file using the Workbook.SaveToFile() method.
- C#
using Spire.Xls; namespace CopyRows { internal class Program { static void Main(string[] args) { // Create a Workbook object Workbook workbook = new Workbook(); // Load an Excel file workbook.LoadFromFile("ContactList.xlsx"); // Get the source worksheet Worksheet sheet1 = workbook.Worksheets[0]; // Get the destination worksheet Worksheet sheet2 = workbook.Worksheets[1]; // Get the desired row that you want to copy CellRange row = sheet1.Rows[0]; // Copy the row from the source worksheet to the first row of the destination worksheet sheet1.CopyRow(row, sheet2, 1, CopyRangeOptions.All); int columns = sheet1.Columns.Length; // Copy the column widths of the cells in the source row to the corresponding cells in the destination row for (int i = 0; i < columns; i++) { double columnWidth = row.Columns[i].ColumnWidth; sheet2.Rows[0].Columns[i].ColumnWidth = columnWidth; } // Save the workbook to a file workbook.SaveToFile("CopyRow.xlsx", ExcelVersion.Version2016); workbook.Dispose(); } } }
Copy Columns in Excel with Formatting in C#
Similarly, copying columns in Excel with formatting can be accomplished using the Worksheet.CopyColumn(CellRange sourceColumn, Worksheet destSheet, int destColIndex, CopyRangeOptions copyOptions) method. The detailed steps are as follows.
- Create an object of the Workbook class.
- Load an Excel file using the Workbook.LoadFromFile() method.
- Get the source worksheet and the destination worksheet using the Workbook.Worksheets[index] property.
- Get the desired column that you want to copy using the Worksheet.Columns[index] property.
- Copy the column and its formatting from the source worksheet to the destination worksheet using the Worksheet.CopyColumn(CellRange sourceColumn, Worksheet destSheet, int destColIndex, CopyRangeOptions copyOptions) method.
- Copy the row heights of cells in the source column to the corresponding cells in the destination column.
- Save the workbook to a file using the Workbook.SaveToFile() method.
- C#
using Spire.Xls; namespace CopyColumns { internal class Program { static void Main(string[] args) { // Create a Workbook object Workbook workbook = new Workbook(); // Load an Excel file workbook.LoadFromFile("ContactList.xlsx"); // Get the source worksheet Worksheet sheet1 = workbook.Worksheets[0]; // Get the destination worksheet Worksheet sheet2 = workbook.Worksheets[1]; // Get the desired column that you want to copy CellRange column = sheet1.Columns[0]; // Copy the column from the source worksheet to the first column of the destination worksheet sheet1.CopyColumn(column, sheet2, 1, CopyRangeOptions.All); int rows = column.Rows.Length; // Copy the row heights of cells in the source column to the corresponding cells in the destination column for (int i = 0; i < rows; i++) { double rowHeight = column.Rows[i].RowHeight; sheet2.Columns[0].Rows[i].RowHeight = rowHeight; } // Save the workbook to a file workbook.SaveToFile("CopyColumn.xlsx", ExcelVersion.Version2016); workbook.Dispose(); } } }
Copy Cells in Excel with Formatting in C#
In addition to copying rows and columns in Excel with formatting, Spire.XLS for .NET also allows copying cell ranges with formatting using the CellRange.Copy(CellRange destRange, CopyRangeOptions copyOptions) method. The detailed steps are as follows.
- Create an object of the Workbook class.
- Load an Excel file using the Workbook.LoadFromFile() method.
- Get the source worksheet and the destination worksheet using the Workbook.Worksheets[index] property.
- Get the source cell range and the destination cell range using the Worksheet.Range[] property.
- Copy the source cell range and its formatting from the source worksheet to the destination cell range in the destination worksheet using the CellRange.Copy(CellRange destRange, CopyRangeOptions copyOptions) method.
- Copy the row heights and column widths of the source cell range to the destination cell range.
- Save the workbook to a file using the Workbook.SaveToFile() method.
- C#
using Spire.Xls; namespace CopyCells { internal class Program { static void Main(string[] args) { // Create a Workbook object Workbook workbook = new Workbook(); // Load an Excel file workbook.LoadFromFile("ContactList.xlsx"); // Get the source worksheet Worksheet sheet1 = workbook.Worksheets[0]; // Get the destination worksheet Worksheet sheet2 = workbook.Worksheets[1]; // Get the source cell range CellRange range1 = sheet1.Range["A1:E7"]; // Get the destination cell range CellRange range2 = sheet2.Range["A1:E7"]; // Copy the source cell range from the source worksheet to the destination cell range in the destination worksheet range1.Copy(range2, CopyRangeOptions.All); // Copy the row heights and column widths of the source cell range to the destination cell range for (int i = 0; i < range1.Rows.Length; i++) { CellRange row = range1.Rows[i]; for (int j = 0; j < row.Columns.Length; j++) { CellRange column = row.Columns[j]; range2.Rows[i].Columns[j].ColumnWidth = column.ColumnWidth; range2.Rows[i].RowHeight = row.RowHeight; } } // Save the workbook to a file workbook.SaveToFile("CopyCells.xlsx", ExcelVersion.Version2016); workbook.Dispose(); } } }
Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
XML is often used for data interchange between different systems, while Excel is a widely recognized format for data analysis and reporting. By converting XML data to Excel, you can leverage Excel's powerful features to analyze and visualize the data more effectively. This conversion process is essential in various industries, including finance, healthcare, and e-commerce.
In this article, you will learn how to convert XML to Excel and PDF in C# using Spire.XLS for .NET.
Install Spire.XLS for .NET
To begin with, you need to add the DLL files included in the Spire.XLS for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.
PM> Install-Package Spire.XLS
Understanding XML Structure: Elements, Attributes, and Data
Before converting XML to Excel, it's crucial to understand the structure of XML files. XML is a markup language that uses tags to define elements, attributes, and data. Here’s a breakdown of these components:
- Elements: These are the building blocks of XML. They are defined by start and end tags and can contain data or other elements.
<person> <name>John Doe</name> <age>30</age> </person>
- Attributes: These provide additional information about elements. They are specified within the start tag of an element.
<person id="1"> <name>John Doe</name> <age>30</age> </person>
- Data: This is the content enclosed within the start and end tags of an element.
Understanding these components will help you map XML data to Excel effectively.
Convert XML to Excel in C#
In .NET, you can use the System.Xml.Linq namespace, which provides classes for working with XML files. The primary class used is XDocument, which allows you to load, navigate, and manipulate XML documents effortlessly.
Here's an example:
- C#
using System; using System.Xml.Linq; class Program { static void Main() { // Load the XML file XDocument doc = XDocument.Load("data.xml"); XElement root = doc.Root; // Iterate through elements foreach (XElement person in root.Elements("person")) { string name = person.Element("name")?.Value; string age = person.Element("age")?.Value; // Output the name and age Console.WriteLine($"Name: {name}, Age: {age}"); } } }
After parsing the XML data, the next step is to map it to an Excel worksheet. You can use Spire.XLS for .NET to create a new workbook, input data into specific cells, and apply various styles and formatting options. These include auto-fitting column widths, adjusting text alignment, and making the header bold.
To convert XML to Excel in C#, follow these steps:
- Utilize the System.Xml.Linq library to extract data from the XML file.
- Create a Workbook object.
- Add a worksheet using the Workbook.Worksheets.Add() method.
- Write the extracted data into the worksheet cells using the Worksheet.SetValue() method.
- Apply styles and formatting to enhance the appearance of the worksheet.
- Save the workbook to an Excel file using the Workbook.SaveToFile() method.
The following code demonstrates an efficient and advanced method for reading data from XML and importing it into an Excel file.
- C#
using Spire.Xls; using System.Xml.Linq; namespace ConvertXmlToExcel { class Program { static void Main(string[] args) { // Create a Workbook object Workbook workbook = new Workbook(); // Remove default worksheets workbook.Worksheets.Clear(); // Add a worksheet and name it Worksheet worksheet = workbook.Worksheets.Add("Books"); // Load an XML file XDocument xmlDoc = XDocument.Load(@"C:\Users\Administrator\Desktop\Books.xml"); XElement xmlRoot = xmlDoc.Root; // Get the first "book" element XElement firstBook = xmlRoot.Element("book"); // Extract header information and convert it into a list var headers = firstBook.Elements().ToList(); // Write header to Excel for (int colIndex = 0; colIndex < headers.Count; colIndex++) { string headerText = headers[colIndex].Name.LocalName; worksheet.SetValue(1, colIndex + 1, headerText); } // Write other data to Excel by iterating over each book element and each data node within it int rowIndex = 2; foreach (XElement book in xmlRoot.Elements("book")) { var dataNodes = book.Elements().ToList(); for (int colIndex = 0; colIndex < dataNodes.Count; colIndex++) { string value = dataNodes[colIndex].Value; worksheet.SetValue(rowIndex, colIndex + 1, value); } rowIndex++; } // Set column width worksheet.AllocatedRange.AutoFitColumns(); // Set alignment worksheet.AllocatedRange.HorizontalAlignment = HorizontalAlignType.Left; // Set font style worksheet.Range["A1:F1"].Style.Font.IsBold = true; // Save the workbook to an Excel file workbook.SaveToFile("output/XmlToExcel.xlsx"); // Dispose resources workbook.Dispose(); } } }
Convert XML to PDF in C#
The previous example effectively imports data from an XML file into an Excel worksheet. This worksheet can subsequently be converted to a PDF file using the Worksheet.SaveToPdf() method. To ensure a well-structured PDF, you may want to adjust page layout settings, such as margins and the preservation of gridlines, during the conversion process.
Here are the steps to convert XML to PDF using C#:
- Use the System.Xml.Linq library to retrieve data from the XML file.
- Create a Workbook object.
- Add a worksheet with the Workbook.Worksheets.Add() method.
- Populate the worksheet cells with data extracted from the XML file using the Worksheet.SetValue() method.
- Apply styles and formatting to improve the worksheet's appearance.
- Configure page settings using properties from the PageSetup object, accessible via Worksheet.PageSetup.
- Save the worksheet as a PDF file using the Worksheet.SaveToPdf() method.
The following code snippet illustrates how to import data from XML into a worksheet and then save that worksheet as a PDF file.
- C#
using Spire.Xls; using Spire.Xls.Core; using System.Xml.Linq; namespace ConvertXmlToPdf { class Program { static void Main(string[] args) { // Create a Workbook object Workbook workbook = new Workbook(); // Remove default worksheets workbook.Worksheets.Clear(); // Add a worksheet and name it Worksheet worksheet = workbook.Worksheets.Add("Books"); // Load an XML file XDocument xmlDoc = XDocument.Load(@"C:\Users\Administrator\Desktop\Books.xml"); XElement xmlRoot = xmlDoc.Root; // Get the first "book" element XElement firstBook = xmlRoot.Element("book"); // Extract header information and convert it into a list var headers = firstBook.Elements().ToList(); // Write header to Excel for (int colIndex = 0; colIndex < headers.Count; colIndex++) { string headerText = headers[colIndex].Name.LocalName; worksheet.SetValue(1, colIndex + 1, headerText); } // Write other data to Excel by iterating over each book element and each data node within it int rowIndex = 2; foreach (XElement book in xmlRoot.Elements("book")) { var dataNodes = book.Elements().ToList(); for (int colIndex = 0; colIndex < dataNodes.Count; colIndex++) { string value = dataNodes[colIndex].Value; worksheet.SetValue(rowIndex, colIndex + 1, value); } rowIndex++; } // Set column width worksheet.AllocatedRange.AutoFitColumns(); // Set alignment worksheet.AllocatedRange.HorizontalAlignment = HorizontalAlignType.Left; // Set font style worksheet.Range["A1:F1"].Style.Font.IsBold = true; // Fit worksheet on one page workbook.ConverterSetting.SheetFitToPage = true; // Get the PageSetup object PageSetup pageSetup = worksheet.PageSetup; // Set page margins pageSetup.TopMargin = 0.3; pageSetup.BottomMargin = 0.3; pageSetup.LeftMargin = 0.3; pageSetup.RightMargin = 0.3; // Preserve gridlines pageSetup.IsPrintGridlines = true; // Save the worksheet to a PDF file worksheet.SaveToPdf("output/XmlToPdf.pdf"); // Dispose resources workbook.Dispose(); } } }
Apply for a Temporary License
If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.