Thursday, December 10, 2015

How to create, edit Word files from .NET, C#


Introduction

Word documents are close to a standard in today's business world. As some readers may confirm, whatever the industry and whatever a company does, most have one thing in common: office applications, and among these Microsoft Office, in its different versions, is probably the most popular. Developers often have to deal with this reality and adapt their software to the needs of the company or client they work for. For example, all the information of an office may be managed in Excel spreadsheets or Word documents, so it has to be processed as efficiently as possible. This article is a review of Spire.Doc, intended to help the reader make an informed decision when considering buying the product. The code shows how to use certain features of the product. I focused on the features that are different or particularly interesting.

Exploring Spire.Doc

Spire.Doc has the following features:
  • Generating, Writing, Editing and Saving
  • Converting
  • Inserting, Editing and Removing Objects
  • Formatting
  • Mail Merge
I will test some aspects of those features and show you the code I'm using. Apart from that, I'll comment on the ease of use, the available documentation and any other details worth mentioning.

Generating a new document with a custom phrase

This set of features contains the basic actions we can perform on Word documents. To test this, I’ll create a document with a custom phrase and I’ll display it from a simple .NET application. The steps to set this up were:
  1. Create a Windows Forms project in C#
  2. Add a button with the text “Display Doc” and with the name “displayButton”
  3. In the Design view, double click “Display Doc”. This opens the code for the click event of the button
  4. Now, I’ll insert the following code into the displayButton_Click method:
    //Requires: using Spire.Doc; using Spire.Doc.Documents;
    //Create a new Word document
    Document document = new Document();
    //Add a paragraph to the document
    Paragraph paragraph = document.AddSection().AddParagraph();
    //Add text to the paragraph
    paragraph.AppendText("This is a test");
    //Save the document. In this case, the document will be saved as Test.docx
    document.SaveToFile("Test.docx", FileFormat.Docx);
    try
    {
        //Open the saved file with its default application
        System.Diagnostics.Process.Start("Test.docx");
    }
    catch { } //Ignore launch failures; the file has already been saved
    
    Pretty straightforward, right? The method names in the library are very intuitive.
  5. Debug the application
  6. Once the application is running, click “Display Doc”. You should see something like this:

Converting a document from DOCX to PDF

I'll take the example above as the base to test this and the subsequent features. To convert a file, I first generated one and stored it in a known location (in this case, D:\spire\test.docx) instead of opening it right away. The code below then loads that file, saves it as PDF and displays the result:
    private void convertToPDF_Click(object sender, EventArgs e)
    {
        //Load the existing Word document
        Document document = new Document();
        document.LoadFromFile(@"D:\spire\test.docx");
        //Save it as PDF
        document.SaveToFile(@"D:\testPDF.pdf", FileFormat.PDF);
        //Open the converted file
        System.Diagnostics.Process.Start(@"D:\testPDF.pdf");
    }



This is just an example, and as you can see, it works very well with just a few changes to the code. Note that the conversion logic is almost the same as the logic for creating documents.
Spire.Doc supports conversion from and to many file types. For a comprehensive explanation of all the conversion options, visit the official Spire.Doc site.
Another point to take into account is that the conversion feature is not available in the Standard Edition, only in the Pro Edition. If file conversion is critical for your application, you may want to purchase the Pro Edition.
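
As a rough sketch of what converting to other formats could look like, the snippet below saves the same test file as HTML, RTF and plain text. It assumes the FileFormat.Html, FileFormat.Rtf and FileFormat.Txt values of the library's FileFormat enumeration. The class name, the method name and the output paths are mine, chosen only for illustration. Check the official documentation for the formats your edition supports.

```csharp
using Spire.Doc;

static class ConversionSketch
{
    // Hypothetical helper: saves the same Word file in several formats.
    public static void ConvertToSeveralFormats()
    {
        // Load the document generated in the earlier example
        Document document = new Document();
        document.LoadFromFile(@"D:\spire\test.docx");

        // The target format is selected by the second argument (assumed enum values)
        document.SaveToFile(@"D:\spire\test.html", FileFormat.Html);
        document.SaveToFile(@"D:\spire\test.rtf", FileFormat.Rtf);
        document.SaveToFile(@"D:\spire\test.txt", FileFormat.Txt);
    }
}
```

Only the FileFormat value and the file extension change from one call to the next, which matches the observation that conversion looks almost the same as saving a new document.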

Extracting images from a document

Spire.Doc lets you manipulate Word documents completely from code: insert text, find and replace text, remove elements from the document, extract images, and so on. The features that got my attention are those to:
  • Extract text, images and other elements from the document: With this feature, it is easy to extract important parts from documents. This can be used when you want to verify some values, or to build a collection of images from important documents, etc.
  • Protect documents: A lot of documents may be created once the application is running. If the documents are sensitive, they must be protected automatically, since there are too many to handle manually. Spire.Doc can help us do just that.

I'm going to extract an image from a Word file (This is the second test.docx) that contains a picture of the Earth.
The following code extracts every image from a document and stores it as a PNG file. It needs the Spire.Doc, Spire.Doc.Documents and Spire.Doc.Fields namespaces.
    private void extractImage_Click(object sender, EventArgs e)
            {
                Document document = new Document(@"E:\spire\This is the second test.docx");
    
                int index = 0;
                //Get all the sections of the document
                foreach (Section section in document.Sections)
                {
                    //Get all the paragraphs
                    foreach (Paragraph paragraph in section.Paragraphs)
                    {
                        //Get the Document Objects of each paragraph
                        foreach (DocumentObject docObject in paragraph.ChildObjects)
                        {
                            //Extract the image if the Document Object is an image
                            if (docObject.DocumentObjectType == DocumentObjectType.Picture)
                            {
                                DocPicture picture = docObject as DocPicture;
                                String imageName = String.Format(@"E:\spire\Image-{0}.png", index);
                                picture.Image.Save(imageName, System.Drawing.Imaging.ImageFormat.Png);
                                index++;
                            }
                        }
                    }
                }
            }
    
After executing the code, the extracted image is saved at the specified location (E:\spire\Image-0.png).
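
The same traversal can collect text instead of pictures. The sketch below is a minimal illustration, assuming that each Paragraph exposes its content through a Text property. The class and method names are hypothetical. It only visits the paragraphs reachable through section.Paragraphs, so text held in other containers, such as tables, may need extra handling.

```csharp
using System.Text;
using Spire.Doc;
using Spire.Doc.Documents;

static class TextExtractionSketch
{
    // Hypothetical helper: returns the text of every paragraph, one per line.
    public static string ExtractText(string path)
    {
        Document document = new Document(path);
        StringBuilder text = new StringBuilder();

        // Walk the sections and paragraphs, as in the image example
        foreach (Section section in document.Sections)
        {
            foreach (Paragraph paragraph in section.Paragraphs)
            {
                // Paragraph.Text is assumed to hold the paragraph's plain text
                text.AppendLine(paragraph.Text);
            }
        }
        return text.ToString();
    }
}
```

A call such as TextExtractionSketch.ExtractText(@"E:\spire\This is the second test.docx") would return the collected text, which could then be checked against expected values.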



The other feature I want to explore is the ability to protect a document. To do that, I'll use the document with the Earth picture from the last example. Note that this takes just a few lines of code:
    private void protectDoc_Click(object sender, EventArgs e)
            {
                Document document = new Document(@"E:\spire\This is the second test.docx");
                document.Encrypt("password");
                document.SaveToFile(@"E:\spire\This is the second test.docx");
                System.Diagnostics.Process.Start(@"E:\spire\This is the second test.docx");
            }
    
When you run this, the document should ask for a password when it opens.

Formatting a document

This is a set of features that allows us to manipulate all the formatting aspects of a document: changing fonts and their sizes, changing backgrounds, managing tables, etc. I'll use a custom document (test2.docx) and edit some of its elements. The code I will use for that is:
    private void applyStyle_Click(object sender, EventArgs e)
            {
                Paragraph p;
                Document document = new Document(@"E:\spire\test2.docx");
                Section docSection = document.Sections[0];
                ParagraphStyle style = new ParagraphStyle(document);
                style.Name = "TestStyle";
                style.CharacterFormat.TextColor = Color.Blue;
                style.CharacterFormat.FontName = "Arial";
                style.CharacterFormat.FontSize = 21;
                document.Styles.Add(style);
                p = docSection.Paragraphs[0];
                p.ApplyStyle(style.Name);
                p.ListFormat.ApplyBulletStyle();
    
                //Save and Launch
                document.SaveToFile("test2.docx", FileFormat.Docx);
                System.Diagnostics.Process.Start("test2.docx");
            }
    
What this code does to the first paragraph of the document:
  • Change the font to Arial
  • Set the font size to 21
  • Change the font color to blue
  • Add bullet style to the paragraph

The way the library handles this is interesting: it is based on styles. In the code above, I first define the style with all the details and then apply it to a paragraph. Of course, you could create different styles and apply them to different documents and parts of documents.
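
To illustrate applying one style to several parts of a document, here is a minimal sketch that reuses only the calls from the example above (ParagraphStyle, Styles.Add, ApplyStyle). The helper name, the style name and the font values are assumptions made for the example.

```csharp
using System.Drawing;
using Spire.Doc;
using Spire.Doc.Documents;

static class StyleSketch
{
    // Hypothetical helper: applies one named style to every paragraph.
    public static void ApplyStyleEverywhere(string inputPath, string outputPath)
    {
        Document document = new Document(inputPath);

        // Define the style once and register it with the document
        ParagraphStyle style = new ParagraphStyle(document);
        style.Name = "BodyStyle"; // illustrative name
        style.CharacterFormat.FontName = "Arial";
        style.CharacterFormat.FontSize = 12;
        style.CharacterFormat.TextColor = Color.Black;
        document.Styles.Add(style);

        // Apply it by name to each paragraph of each section
        foreach (Section section in document.Sections)
        {
            foreach (Paragraph paragraph in section.Paragraphs)
            {
                paragraph.ApplyStyle(style.Name);
            }
        }

        document.SaveToFile(outputPath, FileFormat.Docx);
    }
}
```

Because paragraphs refer to the style by name, the formatting is defined in one place and applied wherever it is needed.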

Conclusion

Spire.Doc is a powerful library with great features, and it's very easy to use. I only performed some actions on simple documents, but I think this library has most of its strength in bulk operations. This kind of library can be more or less useful depending on the particular project, but it is one you'll definitely use a lot if you handle Word documents in any way.
What I like about this product:
  • It's very easy to use
  • It's comprehensively documented
  • There are a lot of code samples to get to know the product
  • The logic of use of the library is straightforward
  • It supports C#, VB.NET, ASP.NET, Web Services and WinForms for .NET Framework versions from 2.0 to 4.5
What I didn't like:
  • The Standard version does not have the conversion feature
In general, I found this library extremely useful and easy to use. However, since all projects are different, I'd encourage users who want to buy the library for enterprise applications to try it first and see if it meets the project's requirements. If it does, this is a good investment.

Check my reviews of other E-iceblue products:
Spire.Presentation
Spire.XLS
Spire.DataExport
Spire.PDF