Merging DOCX files natively in Apex

When it comes to document generation in native Salesforce, I always assumed that your only options were something really simple, like using Blob.toPdf or rendering a Visualforce page as a PDF and saving it as a file. Neither of these was very practical - the Blob method left much to be desired in terms of formatting, and the Visualforce approach was too hard to maintain. As such, when someone asked for document generation I would just point them to an ISV like Conga or Drawloop. Surely working with a Word document wouldn’t be possible in Apex, right?

The DOCX file format

I always assumed that a Word document was a complex file format, inscrutable to the likes of me. For a long time that was true - the original .doc file extension was a proprietary, binary-based format. In the early 2000s, in response to open source solutions like OpenOffice, Microsoft created the DOCX file format, which uses XML.

In fact, a DOCX file is actually just a zip archive of various XML documents. You can try it right now - go to Google Docs, type some stuff in a document, and click File > Download > Microsoft Word (.docx). Unzip that file and in the resulting folder you can see the XML files yourself. The file we’ll focus on is named document.xml.

Working with document.xml

I created a Word document that looked something like this, but formatted:


%%title%%

%%header1%%

%%header2%%

%%content%%


I unzipped that file, took a look at the document.xml file, and searched for my merge fields; they were relatively easy to find. For example, the “content” merge field was displayed like this:

<w:t xml:space="preserve">%%contentText%%</w:t>

Seeing that, I realized I could just do a find and replace on these merge fields, re-zip the files as a new archive, and tada: basic document generation. And indeed, it does work like that. Could I do this programmatically?

Document generation with Apex

Reading an XML file in Apex is simple enough - just get the blob from a ContentVersion or an Attachment record and turn it into a string. But how do I get that document.xml file out of the docx file?

Unfortunately, there does not appear to be a native way to extract files from a zip file in Apex. Luckily someone has solved this problem for us. Enter Zippex, a native Apex Zip utility for the salesforce.com platform. I installed this into a scratch org, uploaded my .docx file, and started to break down the problem:

  • Unzip the docx file
  • Get the document.xml file
  • Do a find and replace on the merge fields
  • Rezip the file and save the changes.

I first started by just hard coding IDs to make this a little easier so I didn’t have to worry about querying for ContentDocuments:

// Hard-coded ID for the proof of concept; IsLatest picks the current version.
ContentVersion cv = [
    SELECT VersionData
    FROM ContentVersion
    WHERE ContentDocumentId = 'SOME_CONTENT_DOCUMENT_ID'
    AND IsLatest = true
];

// Unzip the docx and read the main document body as a string.
Zippex sampleZip = new Zippex(cv.VersionData);
Blob fileData = sampleZip.getFile('word/document.xml');
String docxml = fileData.toString();

// String.replace takes one target and one replacement, so chain the calls.
docxml = docxml
    .replace('%%title%%', 'Merged Title!')
    .replace('%%header1%%', 'Merged Header 1!')
    .replace('%%header2%%', 'Merged Header 2!')
    .replace('%%content%%', 'Merged Content!');

// Put the merged XML back into the archive.
Blob mergedData = Blob.valueOf(docxml);
sampleZip.addFile('word/document.xml', mergedData, null);

// Save the re-zipped archive as a new file.
String newTitle = 'merged' + Datetime.now().getTime();
ContentVersion mergedFile = new ContentVersion();
mergedFile.VersionData = sampleZip.getZipArchive();
mergedFile.Title = newTitle;
mergedFile.ContentLocation = 'S'; // 'S' = stored in Salesforce
mergedFile.PathOnClient = newTitle + '.docx';
insert mergedFile;

// Link the new file to the parent record.
Id conDoc = [SELECT ContentDocumentId FROM ContentVersion WHERE Id = :mergedFile.Id].ContentDocumentId;
ContentDocumentLink cdl = new ContentDocumentLink();
cdl.ContentDocumentId = conDoc;
cdl.LinkedEntityId = 'SOME_PARENT_RECORD_ID';
cdl.ShareType = 'I';
cdl.Visibility = 'AllUsers';
insert cdl;

After I ran that, the parent Account record where I had uploaded my docx file had a new docx file with the merge fields populated!
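One caveat with a plain find and replace: the merged values end up inside XML, so a value containing characters like & or < would produce a document.xml that Word cannot open. Here is a minimal sketch of one way to guard against that, using Apex's built-in String.escapeXml(). The class name DocxMergeSketch and the method applyMergeFields are hypothetical names for illustration and are not part of Zippex or the repo.

```apex
// Sketch only: DocxMergeSketch and applyMergeFields are hypothetical names.
public class DocxMergeSketch {
    // Replaces each placeholder with its XML-escaped value so characters
    // such as & and < cannot break the markup in document.xml.
    public static String applyMergeFields(String docXml, Map<String, String> values) {
        String merged = docXml;
        for (String placeholder : values.keySet()) {
            String value = values.get(placeholder);
            // Null values are merged as empty strings.
            String safeValue = (value == null) ? '' : value.escapeXml();
            // String.replace is a literal (non-regex) replacement.
            merged = merged.replace(placeholder, safeValue);
        }
        return merged;
    }
}
```

For example, merging the value 'Fish & Chips' into '%%title%%' with this helper writes 'Fish &amp;amp; Chips' into the XML, which Word then displays as Fish & Chips.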


With a working proof of concept, I reworked the code so that it’s a little more reusable. Click here for the GitHub repo.

With that code you can do something like this to generate documents with merge fields:

//This File must be a docx file
Id contentDocumentId = '<REPLACE_WITH_DOC_ID>';
Id parentId = '<REPLACE_WITH_PARENT_ID>';
Map<String,String> mergeFields = new Map<String,String>{
        '%%mergefield%%' =>  'I HAVE BEEN MERGED',
        '%%header1%%' =>  'NEW HEADER 1',
        '%%header2%%' =>  'NEW HEADER 2'
    };
new DocxMerger()
    .mergeDoc(contentDocumentId, parentId, mergeFields);
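One thing to watch for: Word can split a piece of text across several runs in document.xml, in which case a placeholder such as %%header1%% is no longer one contiguous string and a plain replace silently skips it. The following is a small sketch of a pre-flight check that reports which placeholders would not be found. MergeFieldCheckSketch and findMissing are hypothetical names, not part of the repo.

```apex
// Sketch only: MergeFieldCheckSketch and findMissing are hypothetical names.
public class MergeFieldCheckSketch {
    // Returns the placeholders that never appear as one contiguous string
    // in the XML, and so would be left unmerged by a plain replace.
    public static Set<String> findMissing(String docXml, Set<String> placeholders) {
        Set<String> missing = new Set<String>();
        for (String placeholder : placeholders) {
            if (!docXml.contains(placeholder)) {
                missing.add(placeholder);
            }
        }
        return missing;
    }
}
```

Calling findMissing(docxml, mergeFields.keySet()) before merging gives you a chance to fix the template (retyping the placeholder in one go often helps) instead of shipping a document with raw merge fields in it.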

Too good to be true?

The document that I merged was tiny, but the code took a while to run. The resulting debug log was 18 megabytes, so this seems to be an expensive operation, and I have no idea what would happen if I tried to merge a larger, more complex document. My hope is that the bottleneck is in the unzipping itself; if that’s the case, I could potentially do that part client side using LWC. That would mean, however, that you couldn’t bulkify this process via something like a batch job.
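To check where the time actually goes, the unzip and re-zip steps can be measured separately with the standard Limits methods. This is a rough sketch for anonymous Apex, not a benchmark; the ContentDocument ID is a placeholder, and the numbers will depend on the document and the org.

```apex
// Sketch only: measures CPU time around the unzip and re-zip steps.
// 'SOME_CONTENT_DOCUMENT_ID' is a placeholder, not a real ID.
ContentVersion cv = [
    SELECT VersionData
    FROM ContentVersion
    WHERE ContentDocumentId = 'SOME_CONTENT_DOCUMENT_ID'
    AND IsLatest = true
];

Integer cpuStart = Limits.getCpuTime();

// Step 1: open the archive and extract document.xml.
Zippex docx = new Zippex(cv.VersionData);
Blob documentXml = docx.getFile('word/document.xml');
Integer cpuAfterUnzip = Limits.getCpuTime();

// Step 2: build the archive again, without changing anything.
Blob rezipped = docx.getZipArchive();
Integer cpuAfterZip = Limits.getCpuTime();

System.debug('Unzip CPU ms: ' + (cpuAfterUnzip - cpuStart));
System.debug('Rezip CPU ms: ' + (cpuAfterZip - cpuAfterUnzip));
System.debug('CPU limit ms: ' + Limits.getLimitCpuTime());
System.debug('Heap bytes: ' + Limits.getHeapSize() + ' of ' + Limits.getLimitHeapSize());
```

Setting the Apex Code log level lower than FINEST while running this should also keep the debug log itself from skewing the result.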

I’m actually quite surprised I got this far - usually I give up on these experiments, but this wasn’t actually too hard. I’ll need to figure out where this project goes from here. Do I add more features, or does this end up in the pile of forgotten repos on GitHub? Either way, it was a cool experiment!

Secure By Default

Far too often I see code where security is treated more like a checkbox instead of as part of the architecture of the product. As long as my Apex classes are running with sharing and I enforce FLS/CRUD access in my SOQL queries and DML statements, my code is secure, right? Technically that’s true, but as is always the case, it really depends on the context.

Tell me if you’ve heard this one before

Imagine a manufacturing company called “Can It Already!” that produces the cans for canned food (can you?) and uses Salesforce to manage its customer orders. The account team wants to make sure that customers are called by the same account executives in order to build a relationship. To accommodate tracking this, they have requested a few fields on the Account object.

  • The last executive to successfully call the customer.
  • When that last call took place.
  • The outcome of that call.

For the sake of this post, let’s pretend that adding these fields is the best solution (it’s not). The account team logs all of their calls as Tasks, so you figure a Task trigger is the way to go here. Below is the service class you build that gets called by the trigger:

public with sharing class AccountCallLoggingService {
  public void logCallOnAccount(Task call) {
    if (call.Status != 'Success' || !hasAccessToAccount(call)) {
      return;
    }

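    // Assumes the Task is related to an Account through WhatId.
    Account accountToUpdate = new Account(Id = call.WhatId);
    accountToUpdate.LastCaller__c = call.OwnerId;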
    accountToUpdate.LastCallStatus__c = call.Status;
    accountToUpdate.LastCalledOn__c = DateTime.now();

    update accountToUpdate;
  }

  private Boolean hasAccessToAccount(Task call){
    UserRecordAccess accountAccess = [
      SELECT RecordId, HasEditAccess
      FROM UserRecordAccess
      WHERE
        UserId = :UserInfo.getUserId()
        AND RecordId = :call.WhatId
    ];

    return
      Schema.sObjectType.Account.fields.LastCaller__c.isUpdateable()
      && Schema.sObjectType.Account.fields.LastCallStatus__c.isUpdateable()
      && Schema.sObjectType.Account.fields.LastCalledOn__c.isUpdateable()
      && accountAccess.HasEditAccess;
  }
}

A 10 foot wall with a wide open gate

This service will only update an Account if the running user has update access on the appropriate fields and on the Account itself. This looks pretty secure, right? The code is enforcing security checks, but in practice this might have some unintended consequences.

By enforcing FLS/CRUD and sharing access on the Account record, the account executive (i.e. the running user) must now have that corresponding access. Granting that access, however, gives the account executive those permissions not only in the context of this trigger, but also everywhere in the CRM. The business may not want an account executive to be able to go to an Account record and update those fields manually, or to even be able to update the Account record at all. So while the code checks security access, have you really made the application more secure?

To be clear, this is not necessarily wrong - the problem comes with blindly enforcing security without regard to your users’ experience and inadvertently creating a security vulnerability. Enforcing frequent password changes may sound like a better security model, but it actually makes security worse. Similarly, enforcing security checks everywhere in your code may sound more secure, but in practice your users may have to be granted more permissions just so your system is usable.

The worst example of this I have seen is in managed packages that are trying to pass security review and throw with sharing into every class and WITH SECURITY_ENFORCED into every query. The code might pass security review, but without a nuanced approach, the customer is forced to choose between not being able to use the functionality and having to expand their FLS/CRUD and sharing settings to a model that accommodates the code base instead of the organization.

Running user vs System user

With this in mind, let’s revisit our service code with some adjustments:

public without sharing class AccountCallLoggingService {
  public void logCallOnAccount(Task call) {
    if (call.Status != 'Success') {
      return;
    }

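    // Assumes the Task is related to an Account through WhatId.
    Account accountToUpdate = new Account(Id = call.WhatId);
    accountToUpdate.LastCaller__c = call.OwnerId;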
    accountToUpdate.LastCallStatus__c = call.Status;
    accountToUpdate.LastCalledOn__c = DateTime.now();

    update accountToUpdate;
  }
}

This code is running without sharing and doesn’t check the running user’s access. Generally this would set off some alarm bells, but I would argue that this is more secure. This functionality is more of a system level operation that should run the same regardless of the running user. I would rather escalate the running user’s permissions in this isolated context instead of being forced to grant that access to the user across the entire system.
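If the service ever grows to do other work that should still respect the running user's sharing, one option is to narrow the escalation even further. The sketch below keeps the outer class with sharing and pushes only the Account update into a private inner class declared without sharing (inner classes do not inherit the outer class's sharing setting). AccountCallLogger, logCalls, and SystemContext are hypothetical names; the custom fields are the ones from the example above, and the method takes a list so it can be called once per trigger invocation.

```apex
// Sketch only: AccountCallLogger and SystemContext are hypothetical names.
public with sharing class AccountCallLogger {
  // Everything in the outer class runs with the running user's sharing.
  public void logCalls(List<Task> calls) {
    // Keyed by Account Id so the same Account is never updated twice.
    Map<Id, Account> accountsById = new Map<Id, Account>();
    for (Task call : calls) {
      // Only successful calls that are related to an Account are logged.
      if (call.Status != 'Success'
          || call.WhatId == null
          || call.WhatId.getSObjectType() != Account.SObjectType) {
        continue;
      }
      accountsById.put(call.WhatId, new Account(
        Id = call.WhatId,
        LastCaller__c = call.OwnerId,
        LastCallStatus__c = call.Status,
        LastCalledOn__c = DateTime.now()
      ));
    }
    if (!accountsById.isEmpty()) {
      new SystemContext().updateAccounts(accountsById.values());
    }
  }

  // The only code that ignores sharing is this single, narrow update.
  private without sharing class SystemContext {
    public void updateAccounts(List<Account> accounts) {
      update accounts;
    }
  }
}
```

The trade-off is a little more ceremony in exchange for making the escalated surface area obvious to whoever reads the class next.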

The key lesson here is that enforcing security is a nuanced process that is not as simple as throwing a few keywords into your code base. When designing your features, consider what it means for your users and customers to use them securely, in terms of usability, side effects, and of course, what it is you are truly trying to secure.

So-Called Advent Calendar Day 24 - Take a Break

In the spirit of the holiday season, this is a series of short blog posts covering random things I have learned while doing Salesforce development, one for each day of Advent.


I didn’t actually think I would make it to the last day of Advent. I hardly post once a month, so trying to write a blog post every day of Advent seemed a little overambitious (though I guess I can’t really speak to the quality of these posts).

I’ll be taking the next few days off to get away from work and this holiday season I hope you get an opportunity to take a rest, too.

Merry Christmas and happy holidays, wherever you are.