Processing Word templates with OpenXML
Office Open XML is Microsoft's zipped, XML-based standard for Office documents. It lets you create, amend and process MS Office files from, for example, the .NET platform. In this article I will show you how to fill a Word template with text and images using a C# application.
After creating a new project, we need to add two main references: DocumentFormat.OpenXml (from DocumentFormat.OpenXml.dll) and WindowsBase (a .NET reference). Next, we need to prepare a template with placeholders that our application will replace. For text, we insert placeholder values in a unique format that we can find when parsing the document, e.g. [#Project-Name#].
For images, we insert placeholder images (any other images) and note their original names (e.g. myPicture2.jpg). These names need to be provided in the parameter objects so that we can find the placeholders by image name when parsing the document.
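Before wiring up parameters, it can help to list which text placeholders a template actually contains. The sketch below is only an illustration and not part of the original project: PlaceholderScanner and FindPlaceholders are hypothetical names, and the input is assumed to be the template's plain text, however you choose to extract it.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not in the original project): finds [#...#] tokens in plain text.
public static class PlaceholderScanner
{
    // Matches tokens such as [#Project-Name#]; the lazy ".+?" stops at the first "#]".
    private static readonly Regex Token = new Regex(@"\[#.+?#\]", RegexOptions.Compiled);

    // Returns the distinct placeholder tokens in order of first appearance.
    public static List<string> FindPlaceholders(string text)
    {
        var found = new List<string>();
        foreach (Match m in Token.Matches(text ?? string.Empty))
        {
            if (!found.Contains(m.Value))
            {
                found.Add(m.Value);
            }
        }
        return found;
    }
}
```

For the text "Name: [#Project-Name#], list: [#Features (string[])#]" this returns the two tokens [#Project-Name#] and [#Features (string[])#]; a token that appears more than once is reported only once.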
The next step is to create the parameter structure that we will use when processing the document. You may want to create your own parameters depending on your requirements.
    public class WordParameter
    {
        public string Name { get; set; }    // placeholder text or original image name
        public string Text { get; set; }    // replacement text
        public FileInfo Image { get; set; } // replacement image file
    }
We will use them as follows when initialising the objects:
    var templ = new WordTemplate();

    // add text parameters
    templ.WordParameters.Add(new WordParameter() { Name = "[#Project-Name#]", Text = "Test project 123" });
    templ.WordParameters.Add(new WordParameter() { Name = "[#Features (string[])#]", Text = "Lorem Ipsum is simply dummy text of the printing \n Lorem Ipsum is simply dummy text of the printing \n Lorem Ipsum is simply dummy text of the printing..." });

    // original image names to be replaced with the new ones
    templ.WordParameters.Add(new WordParameter() { Name = "1.jpg", Image = new FileInfo(WordTemplate.GetRootPath() + @"\Images\1.jpg") });
    templ.WordParameters.Add(new WordParameter() { Name = "2.jpg", Image = new FileInfo(WordTemplate.GetRootPath() + @"\Images\2.jpg") });
    templ.WordParameters.Add(new WordParameter() { Name = "3.jpg", Image = new FileInfo(WordTemplate.GetRootPath() + @"\Images\3.jpg") });
    templ.WordParameters.Add(new WordParameter() { Name = "4.jpg", Image = new FileInfo(WordTemplate.GetRootPath() + @"\Images\4.jpg") });
    templ.WordParameters.Add(new WordParameter() { Name = "5.jpg", Image = new FileInfo(WordTemplate.GetRootPath() + @"\Images\5.jpg") });

    templ.ParseTemplate(); // create the document from the template
Having done that, we can create our main function, which parses the template and fills our placeholders with text and images. Please see the inline comments in the function below.
    public void ParseTemplate()
    {
        using (var templateFile = File.Open(templatePath, FileMode.Open, FileAccess.Read)) // read our template
        using (var stream = new MemoryStream())
        {
            templateFile.CopyTo(stream); // copy the template into memory

            using (var wordDoc = WordprocessingDocument.Open(stream, true)) // open the Word document for editing
            {
                // loop through all paragraphs (ToList() lets us modify the tree while iterating)
                foreach (var paragraph in wordDoc.MainDocumentPart.Document.Descendants<Paragraph>().ToList())
                {
                    ReplaceImages(wordDoc, paragraph); // replace images
                    ReplaceText(paragraph);            // replace text
                }

                wordDoc.MainDocumentPart.Document.Save(); // save the changes we've made
            }

            stream.Seek(0, SeekOrigin.Begin); // rewind to the start of the stream

            // save the file, overwriting it if it already exists
            var outPath = WordTemplate.GetRootPath() + @"\Output\DocumentOutput.docx";
            using (var fileStream = File.Create(outPath))
            {
                stream.CopyTo(fileStream);
            }
        }
    }
The function below replaces the images. It gets all Blip objects from the paragraph and changes their embed IDs, which point to the image data.
The nice thing about this approach is that any styles and transformations you apply to the placeholder image are preserved and applied to the new image 🙂
Please see the inline comments.
    // Assumes the alias: using A = DocumentFormat.OpenXml.Drawing;
    void ReplaceImages(WordprocessingDocument wordDoc, Paragraph paragraph)
    {
        // get all images in the paragraph
        var imagesToReplace = paragraph.Descendants<A.Blip>().ToList();
        if (!imagesToReplace.Any()) return;

        var index = 0; // image index within the paragraph

        // find all original image names in the paragraph
        var paragraphImageNames = paragraph
            .Descendants<DocumentFormat.OpenXml.Drawing.Pictures.NonVisualDrawingProperties>()
            .ToList();

        // check every image in the paragraph and replace it if it matches one of our parameters
        foreach (var imagePlaceHolder in paragraphImageNames)
        {
            foreach (var param in WordParameters)
            {
                // match on the parameter name, which holds the original image name
                if (param.Image != null &&
                    string.Equals(param.Name, imagePlaceHolder.Name?.Value, StringComparison.OrdinalIgnoreCase))
                {
                    var imagePart = wordDoc.MainDocumentPart.AddImagePart(ImagePartType.Jpeg); // add image part to the document
                    using (var imgStream = new FileStream(param.Image.FullName, FileMode.Open, FileAccess.Read))
                    {
                        imagePart.FeedData(imgStream); // feed it with data
                    }

                    var relID = wordDoc.MainDocumentPart.GetIdOfPart(imagePart); // get relationship ID
                    imagesToReplace.Skip(index).First().Embed = relID; // point this paragraph's index-th image at the new part
                }
            }
            index += 1;
        }
    }
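ReplaceImages always adds a JPEG image part, which suits the sample images. If your replacement images may be PNG or GIF files, the part type should probably follow the file. The sketch below only maps a file name to a MIME type; ImageContentTypes and FromFileName are hypothetical names, and passing the result to an AddImagePart overload that takes a content type string is an assumption you should check against your SDK version.

```csharp
using System;
using System.IO;

// Hypothetical helper (not in the original project): maps an image file name to a MIME type.
public static class ImageContentTypes
{
    public static string FromFileName(string fileName)
    {
        // Compare the extension case-insensitively; a missing extension falls through to the default.
        var extension = (Path.GetExtension(fileName) ?? string.Empty).ToLowerInvariant();
        switch (extension)
        {
            case ".jpg":
            case ".jpeg":
                return "image/jpeg";
            case ".png":
                return "image/png";
            case ".gif":
                return "image/gif";
            case ".bmp":
                return "image/bmp";
            default:
                throw new NotSupportedException("Unsupported image type: " + fileName);
        }
    }
}
```

Throwing on an unknown extension is a deliberate choice here: a wrongly typed image part tends to surface much later as a document that Word refuses to open.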
When replacing text, we check whether the paragraph contains text that matches one of our parameters. If it does, we check whether to insert one line or multiple lines of text.
Next we create a new paragraph by copying the placeholder paragraph's OuterXml (this preserves the styles) and replacing the placeholder with the text stored in our parameter.
    void ReplaceText(Paragraph paragraph)
    {
        var parent = paragraph.Parent; // parent element, needed to insert new paragraphs and remove the placeholder
        var dataParam = new WordParameter();

        if (!ContainsParam(paragraph, ref dataParam)) return; // paragraph matches none of our parameters

        if (dataParam.Name.Contains("string[]")) // the parameter is a list
        {
            // split into lines; handles both \r\n and \n and skips empty entries
            var arrayText = dataParam.Text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
            foreach (var itemData in arrayText)
            {
                var bullet = CloneParaGraphWithStyles(paragraph, dataParam.Name, itemData); // new paragraph, styles preserved
                parent.InsertBefore(bullet, paragraph); // insert new element
            }
        }
        else
        {
            // single line of text
            var param = CloneParaGraphWithStyles(paragraph, dataParam.Name, dataParam.Text); // new paragraph, styles preserved
            parent.InsertBefore(param, paragraph); // insert new element
        }

        paragraph.Remove(); // delete the placeholder
    }
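List parameters arrive as one string, and its line endings may be \r\n or \n depending on where the text came from. Here is a minimal sketch of a splitter that accepts both, trims each item and drops blank lines; ListText and ToItems are hypothetical names, not part of the original project.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not in the original project): turns multi-line text into list items.
public static class ListText
{
    public static string[] ToItems(string text)
    {
        return (text ?? string.Empty)
            // "\r\n" is listed first so a Windows line ending counts as one separator
            .Split(new[] { "\r\n", "\n", "\r" }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.Trim())
            // drop lines that contained only whitespace
            .Where(line => line.Length > 0)
            .ToArray();
    }
}
```

With the input "first \r\n second\n\n third" this yields the three items "first", "second" and "third", so no empty bullet ends up in the document.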
Creating the new paragraph object while preserving the styles:
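    public static Paragraph CloneParaGraphWithStyles(Paragraph sourceParagraph, string paramKey, string text)
    {
        // copy the paragraph's XML so that its styles are preserved
        var xmlSource = sourceParagraph.OuterXml;

        // swap the placeholder for the new text; the text goes into raw XML,
        // so it must not contain unescaped characters such as & or <
        xmlSource = xmlSource.Replace(paramKey.Trim(), text.Trim());

        return new Paragraph(xmlSource);
    }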
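Because the replacement happens on the paragraph's XML string, text containing characters such as & or < would produce invalid XML. One possible way around this, sketched below under the assumption that escaping the text first fits your case, is to run it through SecurityElement.Escape from the .NET base library. SafeXmlReplace and ReplaceEscaped are hypothetical names, not part of the original project.

```csharp
using System;
using System.Security;

// Hypothetical helper (not in the original project): XML-escapes text before string replacement.
public static class SafeXmlReplace
{
    public static string ReplaceEscaped(string xml, string key, string text)
    {
        // Escape &, <, >, " and ' so that the result stays well-formed XML.
        var escaped = SecurityElement.Escape((text ?? string.Empty).Trim());
        return xml.Replace(key.Trim(), escaped);
    }
}
```

For example, replacing [#Name#] in "<w:t>[#Name#]</w:t>" with the text "R&D <beta>" gives "<w:t>R&amp;D &lt;beta&gt;</w:t>", which Word displays as the original text.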
Please note that when replacing images, the placeholder image names must be found in the document. Images that don't match a parameter name won't be replaced.
Having done that, we can finally test our application. I have included a complete working application for your tests.
WordTemplates