Posts

OpenXML Word templates processing

Office Open XML it’s the Microsoft’s zipped-xml based standard for processing Office documents. It allows to create, amend and process MS office files using e.g. .Net platform. In this article I will show you how to replace Word template with text and images using C# based application.

After creating new project we need to add two main references: DocumentFormat.OpenXml from DocumentFormat.OpenXml.dll and WindowsBase (.Net reference). Next, we need to prepare template with placeholders that our application will replace. For texts we will insert placeholder values in the the unique format that we can find parsing the document e.g. [#Project-Name#].

For the images we need to insert image placeholders (other images) and just remember their original names (e.g. myPicture2.jpg). This names needs to be provided in the parameter objects so we can find the placeholders by image name when parsing document.

word-template-300x184

The next step is to create parameters structure that we will use when processing the document. You may want to create your own parameters depending or your requirements.

  public class WordParameter
    {
        public string Name { get; set; }
        public string Text { get; set; }
        public FileInfo Image { get; set; }
    }

We will use them as follows when initiating the objects

    var templ = new WordTemplate();
    //add parameters
    templ.WordParameters.Add(new WordParameter() { Name = "[#Project-Name#]", Text = "Test project 123" });
    templ.WordParameters.Add(new WordParameter() { Name = "[#Features (string[])#]", Text = "Lorem Ipsum is simply dummy text of the printing \n Lorem Ipsum is simply dummy text of the printing \n Lorem Ipsum is simply dummy text of the printing..." });

    //original image names to be replaced with the new ones
    templ.WordParameters.Add(new WordParameter() { Name = "1.jpg", Image = new FileInfo(WordTemplate.GetRootPath() + @"\Images\1.jpg") });
    templ.WordParameters.Add(new WordParameter() { Name = "2.jpg", Image = new FileInfo(WordTemplate.GetRootPath() + @"\Images\2.jpg") });
    templ.WordParameters.Add(new WordParameter() { Name = "3.jpg", Image = new FileInfo(WordTemplate.GetRootPath() + @"\Images\3.jpg") });
    templ.WordParameters.Add(new WordParameter() { Name = "4.jpg", Image = new FileInfo(WordTemplate.GetRootPath() + @"\Images\4.jpg") });
    templ.WordParameters.Add(new WordParameter() { Name = "5.jpg", Image = new FileInfo(WordTemplate.GetRootPath() + @"\Images\5.jpg") });

    templ.ParseTemplate(); //create document from template

Having done that we can create our main function that parses the template and fills our placeholders with texts and images. Please see the inline comments within the function below.

 public void ParseTemplate()
 {
    using (var templateFile = File.Open(templatePath, FileMode.Open, FileAccess.Read)) //read our template
    {
        using (var stream = new MemoryStream())
        {
            templateFile.CopyTo(stream); //copy template

            using (var wordDoc = WordprocessingDocument.Open(stream, true)) //open word document
            {
                foreach (var paragraph in wordDoc.MainDocumentPart.Document.Descendants<Paragraph>().ToList()) //loop through all paragraphs 
                {
                    ReplaceImages(wordDoc, paragraph); //replace images

                    ReplaceText(paragraph); //replace text
                }

                wordDoc.MainDocumentPart.Document.Save(); //save document changes we've made
            }
            stream.Seek(0, SeekOrigin.Begin);//scroll to stream start point

            //save file or overwrite it
            var outPath = WordTemplate.GetRootPath() + @"\Output\DocumentOutput.docx";

            using (var fileStream = File.Create(outPath))
            {
                stream.CopyTo(fileStream);
            }
        }
    }
 }

Function that replaces the images. It gets all Blip objects from the paragraph and changes it’s embed ID that points to the image.

The cool thing about it is fact that you can give your own styles and transformation to the image template and it will be preserved and applied to the new image 🙂

Please see the inline comments.

 void ReplaceImages(WordprocessingDocument wordDoc, Paragraph paragraph)
 {
    // get all images in paragraph
    var imagesToReplace = paragraph.Descendants<A.Blip>().ToList();

    if (imagesToReplace.Any())
    {
        var index = 0;//image index within paragraph

        //find all original image names in paragraph
        var paragraphImageNames = paragraph.Descendants<DocumentFormat.OpenXml.Drawing.Pictures.NonVisualDrawingProperties>().ToList();

        //check all images in the paragraph and replace them if it matches our parameter
        foreach (var imagePlaceHolder in paragraphImageNames)
        {
            //check if we have image parameter that matches paragraph image
            foreach (var param in WordParameters)
            {
                //replace it if found by original image name
                if (param.Image != null && param.Image.Name.ToLower() == imagePlaceHolder.Name.Value.ToLower())
                {
                    var imagePart = wordDoc.MainDocumentPart.AddImagePart(ImagePartType.Jpeg); //add image to document
                    using (FileStream imgStream = new FileStream(param.Image.FullName, FileMode.Open))
                    {
                        imagePart.FeedData(imgStream); //feed it with data
                    }

                    var relID = wordDoc.MainDocumentPart.GetIdOfPart(imagePart); // get relationship ID

                    imagesToReplace.Skip(index).First().Embed = relID; //assign new relID, skip if this is another image in one paragraph
                }
            }
            index += 1;
        }
    }
}

When replacing the texts we check if paragraph contains the text that matches our parameter. If yes, then we check if to include one or multiple lines of text.
Next we create new parameter by copying the old parameter’s OuterXML (this preserves the styles). We also need to replace text that is stored in our parameter.

 void ReplaceText(Paragraph paragraph)
 {
    var parent = paragraph.Parent; //get parent element - to be used when removing placeholder
    var dataParam = new WordParameter();

    if (ContainsParam(paragraph, ref dataParam)) //check if paragraph is on our parameter list
    {
        //insert text list
        if (dataParam.Name.Contains("string[]")) //check if param is a list
        {
            var arrayText = dataParam.Text.Split(Environment.NewLine.ToCharArray()); //in our case we split it into lines

            if (arrayText is IEnumerable) //enumerate if we can
            {
                foreach (var itemData in arrayText)
                {
                    Paragraph bullet = CloneParaGraphWithStyles(paragraph, dataParam.Name, itemData);// create new param - preserve styles
                    parent.InsertBefore(bullet, paragraph); //insert new element
                }
            }
            paragraph.Remove();//delete placeholder
        }
        else
        {
            //insert text line
            var param = CloneParaGraphWithStyles(paragraph, dataParam.Name, dataParam.Text); // create new param - preserve styles
            parent.InsertBefore(param, paragraph);//insert new element

            paragraph.Remove();//delete placeholder
        }
    }
}

Creating the new paragraph object preserving the styles

 public static Paragraph CloneParaGraphWithStyles(Paragraph sourceParagraph, string paramKey, string text)
 {
    var xmlSource = sourceParagraph.OuterXml;

    xmlSource = xmlSource.Replace(paramKey.Trim(), text.Trim());

    return new Paragraph(xmlSource);
 }

Please note that when replacing images, the image placeholder names must be found in the document. Images that don’t match the parameter name, wont be replaced.

Having done that we can finally test our application. I have included complete working application for your tests.
WordTemplates

1 Star2 Stars3 Stars4 Stars5 Stars (No Ratings Yet)
Loading...Loading...