C#/VB.NET: Extract Text from PowerPoint Presentations

2023-05-15 03:08:00 Written by support iceblue

When you send a PowerPoint document that contains many media files and images to others for text proofreading, the transfer can be slow because of the large file size. In that case, it is better to extract the text from the presentation into MS Word or Notepad first, and then send only the text content. The extracted text can also be archived or backed up for future reference. In this article, you will learn how to extract text from a PowerPoint presentation in C# and VB.NET using Spire.Presentation for .NET.

Install Spire.Presentation for .NET

To begin with, you need to add the DLL files included in the Spire.Presentation for .NET package as references in your .NET project. The DLL files can be either downloaded manually or installed via NuGet.

PM> Install-Package Spire.Presentation

Extract Text from PowerPoint Presentations in C# and VB.NET

Extracting the text makes it easier to share or deliver the textual content of a PowerPoint document. The following steps extract the text from all slides of a presentation and save it to a TXT file.

  • Initialize an instance of the Presentation class.
  • Load a sample PowerPoint document using the Presentation.LoadFromFile() method.
  • Create a StringBuilder instance.
  • Iterate through each slide in the document, and then iterate through all the shapes in each slide.
  • Determine whether each shape is of the IAutoShape type. If so, iterate through all the paragraphs in the shape and get the text of each paragraph using the TextParagraph.Text property.
  • Append the extracted text to the StringBuilder instance using the StringBuilder.AppendLine() method.
  • Create a new TXT file and write the extracted text to it using the File.WriteAllText() method.

C#

using Spire.Presentation;
using System.IO;
using System.Text;
namespace ExtractText
{
    class Program
    {
        static void Main(string[] args)
        {
            //Initialize an instance of the Presentation class
            Presentation presentation = new Presentation();

            //Load a sample PowerPoint document
            presentation.LoadFromFile("Island.pptx");
            //Create a StringBuilder instance
            StringBuilder sb = new StringBuilder();

            //Iterate through each slide in the document
            foreach (ISlide slide in presentation.Slides)
            {
                //Iterate through each shape in each slide
                foreach (IShape shape in slide.Shapes)
                {
                    //Check if the shape is of IAutoShape type and has a text frame
                    if (shape is IAutoShape autoShape && autoShape.TextFrame != null)
                    {
                        //Iterate through all paragraphs in the shape
                        foreach (TextParagraph tp in autoShape.TextFrame.Paragraphs)
                        {
                            //Extract text and save to StringBuilder instance
                            sb.AppendLine(tp.Text);
                        }
                    }
                }
            }
            //Create a new txt file to save the extracted text
            File.WriteAllText("ExtractText.txt", sb.ToString());
        }
    }
}
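
For proofreading, it can help to know which slide each piece of text came from. The following is a minimal sketch of one possible variation on the example above, not part of the original walkthrough: it walks the slides by index and writes a heading before the text of each slide. The output file name "ExtractTextBySlide.txt", the namespace, and the "Slide N" heading format are assumptions chosen for illustration. The sketch also assumes a compiler that supports C# 7 pattern matching, and, like the example above, it only reads text from shapes of IAutoShape type.

```csharp
using Spire.Presentation;
using System.IO;
using System.Text;

namespace ExtractTextBySlide
{
    class Program
    {
        static void Main(string[] args)
        {
            //Load the same sample document used in the example above
            Presentation presentation = new Presentation();
            presentation.LoadFromFile("Island.pptx");

            StringBuilder sb = new StringBuilder();

            //Walk the slides by index so that each one can be labeled
            for (int i = 0; i < presentation.Slides.Count; i++)
            {
                //Heading format is an arbitrary choice for this sketch
                sb.AppendLine("Slide " + (i + 1));

                foreach (IShape shape in presentation.Slides[i].Shapes)
                {
                    //Skip shapes that are not auto shapes or that have no text frame
                    if (shape is IAutoShape autoShape && autoShape.TextFrame != null)
                    {
                        foreach (TextParagraph tp in autoShape.TextFrame.Paragraphs)
                        {
                            sb.AppendLine(tp.Text);
                        }
                    }
                }

                //Blank line between slides
                sb.AppendLine();
            }

            //Hypothetical output file name
            File.WriteAllText("ExtractTextBySlide.txt", sb.ToString());
        }
    }
}
```

A slide that has no text in its auto shapes still gets a heading in the output, which makes it easy to see that nothing was extracted from it.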

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents or lift the functional limitations, you can request a 30-day trial license.

Last modified on Monday, 15 May 2023 01:20