Populating Microsoft Word documents

In a recent project, a client asked for a feature that would let him download Word document templates pre-populated with data from the user's record in the database. Our approach was to add unique text placeholders to the document templates and later replace them with the required data. The project ran on Linux (Ubuntu), so we could not rely on Windows-only modules such as pywin32 (http://sourceforge.net/projects/pywin32/) for document generation.

Looking into Word documents, we found that .docx files are in fact zip archives of XML files. If you change a .docx file's extension to .zip and open the archive, you will find the following structure:
[Content_Types].xml

word/

_rels/

docProps/
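
The same listing can be produced without renaming anything, because Python's built-in zipfile module opens a .docx directly. Here is a minimal sketch. The helper name list_docx_parts and the tiny stand-in archive are assumptions made for illustration, and a real template contains many more parts:

```python
import io
import zipfile

def list_docx_parts(docx_file):
    """Return the names of all parts stored in a .docx archive."""
    with zipfile.ZipFile(docx_file) as archive:
        return archive.namelist()

# Build a tiny stand-in archive in memory so the sketch runs without a real template.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
    archive.writestr("word/document.xml", "<w:document/>")
    archive.writestr("_rels/.rels", "<Relationships/>")
    archive.writestr("docProps/core.xml", "<cp:coreProperties/>")

buffer.seek(0)
for name in list_docx_parts(buffer):
    print(name)
```

With a real template you would pass its path instead of the in-memory buffer.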

The format behind .docx documents is Office Open XML (often shortened to Open XML), which Microsoft introduced with Office 2007. You can read more about it at http://msdn.microsoft.com/en-us/library/aa338205%28v=office.12%29.aspx

Inside the word/ folder is document.xml, the key XML file describing the document's content. The placeholders we put in the templates live in this file, so we replace them there with the data for each user and produce the file for that user to download. The code below renders a template using the values supplied in the replacements dictionary.

import zipfile
from xml.sax.saxutils import escape

# render a .docx template by replacing placeholders in word/document.xml
def render_template(template_path, output_path, replacements):
  # a .docx file is a zip archive, so the built-in zipfile library can open it;
  # "w" creates a fresh output archive instead of appending to an existing one
  with zipfile.ZipFile(template_path) as template_document, \
       zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as rendered_document:

    # read document.xml straight from the archive, without extracting it to disk
    xml_str = template_document.read("word/document.xml").decode("utf-8")

    # replace placeholders with the values from the replacements dictionary,
    # escaping &, < and > so the result is still valid XML
    for key, value in replacements.items():
      xml_str = xml_str.replace(str(key), escape(str(value)))

    # copy every other file from the template archive unchanged
    for item in template_document.infolist():
      if item.filename != "word/document.xml":
        rendered_document.writestr(item, template_document.read(item.filename))

    # write the modified document.xml to the new document
    rendered_document.writestr("word/document.xml", xml_str.encode("utf-8"))

With this you have your Microsoft Word templates, and all you need to do is provide the replacements dictionary mapping each placeholder to its value. Placeholders should be unique within a document and should not be text that would otherwise occur in it. I generally follow the convention [[place holder 1]], [[place holder 2]], etc., or more generally [[unique word/place holder]].
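
One way to see which placeholders a template actually contains is to scan the text of document.xml for the [[...]] convention. This is only a sketch: the regular expression, the helper name find_placeholders and the sample XML are assumptions for illustration, not part of the original project.

```python
import re

# Assumed pattern for the [[...]] convention: double brackets around
# any text that does not itself contain a bracket.
PLACEHOLDER_PATTERN = re.compile(r"\[\[[^\[\]]+\]\]")

def find_placeholders(xml_text):
    """Return the distinct [[placeholders]] found in the XML text, sorted."""
    return sorted(set(PLACEHOLDER_PATTERN.findall(xml_text)))

# Hypothetical fragment standing in for the contents of word/document.xml.
sample_xml = (
    "<w:p><w:r><w:t>Dear [[first name]],</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Your account [[account id]] is ready.</w:t></w:r></w:p>"
)

print(find_placeholders(sample_xml))
# prints ['[[account id]]', '[[first name]]']
```

The resulting list can serve as a checklist of the keys the replacements dictionary needs to provide.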

This approach works well but needs to be handled with care. If someone edits one of the placeholders in a template, calling the function with the correct dictionary will not replace it, because the text in the document no longer matches the key.
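
A simple guard against this is to check, before rendering, that every key in the dictionary really occurs in the document text. Here is a minimal sketch. The helper name find_missing_placeholders and the sample data are hypothetical, and the sample deliberately contains a misspelled placeholder:

```python
def find_missing_placeholders(xml_text, replacements):
    """Return the keys in `replacements` that do not occur in `xml_text`."""
    return [key for key in replacements if str(key) not in xml_text]

# Hypothetical document text in which "[[account id]]" was edited by mistake.
sample_xml = (
    "<w:p><w:r><w:t>Dear [[first name]], "
    "your ID is [[acount id]].</w:t></w:r></w:p>"
)
replacements = {"[[first name]]": "Ada", "[[account id]]": "42"}

missing = find_missing_placeholders(sample_xml, replacements)
print(missing)
# prints ['[[account id]]']
```

A non-empty result means the template and the dictionary have drifted apart. The same symptom can appear when Word stores an edited placeholder as several separate text runs inside document.xml, so retyping the placeholder in one go is worth trying when a key is reported as missing.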
