Refactoring: How do I even start?

Let's face it: working with sloppy, legacy code is inevitable. Whether from a rushed developer that's long left the company, or even just you from 6 months ago, eventually you are going to find yourself with big mess of muck that you suddenly need to modify. You have no idea what's going on and there's a bug that cannnot be consistently reproduced. Also, you have until the end of the day and as far as your manager is concerned, refactoring is that thing you learned in middle school to break down numbers.

Also, this guy is your boss.

You are probably a little overwhelmed and feeling pressure from people who don't fully understand the extent of the issue. Maybe you want to build a time machine and stop the original author from doing this (which right now may seem like the easier solution).

It's ok, take a moment to close your eyes and take a deep breath. This too shall pass.

Or scream. That's a good one too.

If you had more time you might be able refactor a lot of the code and maybe implement some design patterns. And if that was the case, check out sourcemaking.com for an extensive list of ideas. But you don't always have that time; sometimes you just need a little help making the code make SOME sense.

Easy ways to clean up code

The key is to start small and stay within a small scope. It'll be easier to keep track of things and staying within a small scope will lessen the chance of side effects. Here are couple of easy, non-invasive techniques to help you make sense of your code and clean it up at the same time

Using and Renaming Variables

Let's say you find a nested for-loop like this:

for( Integer i = 0; i < list1.length(); i++ )
{
for( Integer x = 0; i < list1[i].list2.length(); x++)
{
for( Integer y = 0; i < list1[i].list2[y].list3.length(); y++)
{
if( list1[i].list2[x].list3[y].Name == list1[i].Name && list1[i].list2[x].list3[y].Name == list1[i].list2[x].Name )
{
list1[i].list2[x].list3[y].Name += ' III';
}
}
}
}

You have no idea what these lists are and the chaining is making it difficult to keep track of what is being compared and what is being assigned. But, what we do know is that these lists are related in some kind of hierarchy. So let's rename some of these variables and place repetitive references into new variables

for( Integer i = 0; i < list1.length(); i++ )
{
Object grandparent = grandparent;
List<Object> parents = grandparent.list2;
for( Integer x = 0; i < parents.length(); x++)
{
Object aParent = parents[x];
List<Object> children = aParent.list3;
for( Integer y = 0; i < children.length(); y++)
{
Object aChild = children[y];
if( aChild.Name == grandparent.Name && aChild.Name == aParent.Name )
{
aChild.Name += ' III';
}
}
}
}

A step forward. In our mental map of the code we no longer need to tranverse any hierarchies to figure out what is being operated on and there's a little more context to our variables.

Composing Methods

One of the keys to understanding messy code is reducing the amount of information you need to keep track of in your mental map. Here's the method we worked above with the rest of the method:

public void processList( List<Object> masterList ){
List<Object> list1 = new List<Object>();
for( Integer z = 0; z < masterList.length(); z++ )
{
if( masterList[z].list2 != NULL masterList[z].list2.length() > 0 )
{
for( Integer y = 0; y < masterList[z].list2.length(); y++ )
{
if( masterList[z].list2 != NULL masterList[z].list2.length() > 0 )
{
list1.add(masterList[z]);
}
}
}
}

for( Integer i = 0; i < list1.length(); i++ )
{
Object grandparent = grandparent;
List<Object> parents = grandparent.list2;
for( Integer x = 0; i < parents.length(); x++)
{
Object aParent = parents[x];
List<Object> children = aParent.list3;
for( Integer y = 0; i < children.length(); y++)
{
Object aChild = children[y];
if( aChild.Name == grandparent.Name && aChild.Name == aParent.Name )
{
aChild.Name += ' III';
}
}
}
}
}

After looking at the code for a little bit, you realize that the first for loop is building a list of objects that have both children and grandchildren in its hierarchy. Also you remember that the second for loop is adding a suffix to the grandchildren's name. Instead of having to remember what lines do what, let's move some of this logic out into different methods:

public void processList( List<Object> masterList ){
List<Object> list1 = filterObjectsWithGranchildren( masterList );
appendSuffixToGrandchildren( list1 );
}

private List<Object> filterObjectsWithGranchildren( List<Object> listToFilter ){
List<Object> filteredList = new List<Object>();
for( Integer z = 0; z < listToFilter.length(); z++ )
{
if( listToFilter[z].list2 != NULL listToFilter[z].list2.length() > 0 )
{
for( Integer y = 0; y < listToFilter[z].list2.length(); y++ )
{
if( listToFilter[z].list2 != NULL listToFilter[z].list2.length() > 0 )
{
filteredList.add(listToFilter[z]);
}
}
}
}
return filteredList;
}

private void appendSuffixToGrandchildren( List<Object> grandparents ){
for( Integer i = 0; i < grandparents.length(); i++ )
{
Object grandparent = grandparent;
List<Object> parents = grandparent.list2;
for( Integer x = 0; i < parents.length(); x++)
{
Object aParent = parents[x];
List<Object> children = aParent.list3;
for( Integer y = 0; i < children.length(); y++)
{
Object aChild = children[y];
if( aChild.Name == grandparent.Name && aChild.Name == aParent.Name )
{
aChild.Name += ' III';
}
}
}
}
}

Now our original method calls two private methods. The logic is still the same, but now we can quickly understand what our original method does without having to read through all of the logic.

When it comes to refactoring, this is only the beginning. However, these are my go to strategies when I need to make changes to seemingly nonsensical code and have no idea where to start. Work piece by piece and you'll soon find yourself with a whole.

Eat Your Vegetables! A Lesson in Code Review

Code review can be like eating vegetables. It can be boring, tedious, and sometimes the source of bitterness. Plus the entire time you're tempted to just half-ass it because "Oh I'm sure QA will catch any errors" (you do have a QA process, right?). 

But just like vegetables I think it's safe to say that code review is not only necessary but also good for you. The time required to do is greatly outweighed by the benefits:
  • Increases the Bus Factor of the code base by exposing more team members to code
  • Provides learning opportunities to both the reviewer and reviewee
  • Just the fear of public humiliation discourages developers from cutting corners
Plus there's much more, but that's not really the point here. You know that you need to do code review but how do you even start? Here's a few tips for doing code review with your team:

Choose a Format

Ok, everyone stab as many bugs as you can

Round table session:
A small team gets together and reviews the each of the commits done during the day. The author of the commit explains what's going on in the code while the rest of the teams listens. Generally I think the code should speak for itself, but we were in consulting working on different projects so a little context helped figure out what was trying to be done. However, this can be really disruptive because you basically need to scheduled out an hour or so everyday to have this meeting. Also, some people may be reluctant to speak up at a group meeting.

Gatekeeper:
A designated person (or people) are assigned the task of doing code review for the team. I only recommend this style of coding if you're using outsourced developers and you are having trouble getting them to do code review. You lose a lot of the benefits of code review this way, but it's better than nothing. Also if the gatekeeper is doing any coding then who, uh, gatekeeps(?) the gatekeeper?

Individual Commit Review:
At least two other members of the team individually reviews your code . This allows people to do code review on their own time. However, going through EACH commit can be a little tedious. 

Pull Request Review:
Your pull request is reviewed by at least two individuals. This one is my favorite, as it allows for something cohesive be reviewed as opposed to just chunks of code. However, don't let too much time pass between pull requests or you're going to have MASSIVE amounts of code to review, allowing for great opportunities of things to slip through unnoticed.

Lay Some Ground Rules

The first rule of code review is no fighting.
Fight club is Tuesday after the requirements gathering meeting.
Now that you've chosen a format, make sure to lay some ground rules to help keep code review focused on what's important. Here are a few suggestions:

DO NOT make any personal comments about the author of the code
Nothing constructive comes out of comments like, "This is so dumb, why would you do it like that?" Instead, ask constructive questions like, "Can you restructure this to use X?" This opens the door to a discussion instead of an argument, allowing the author to explain themselves or see the error.

DO NOT take any criticism personally
On the flip side, don't immediately get defensive when someone criticizes your code. The team doesn't don't (or at least shouldn't) care who wrote the code; they just want it to be done well. Take criticism as an opportunity to become a better programmer.

There must be reasoning behind your criticism
Vague complaints like "I don't like it" or "It doesn't feel right" are not valid criticisms. This is not to say you should just stay quiet if something seems awry, but you need to give some direction as to what needs to be done. I can't do anything with just "I don't like it." Also, saying something is not "best practice" is not enough; you need to say why isn't it a best practice. In fact, check out this delightfully cute book on fallacious arguments to help you figure out what's a good argument.

Decide what's in the scope of code review
This is a little more subjective, but your team will need to agree upon what's subject to code review or you may find yourself squabbling over trivial issues. Do you care about whitespace? (you should) Do all web pages need to be mobile friendly? These things come down to establishing coding standards (a whole different discussion), that serve as a (flexible) baseline as to what will be discussed. 

Code Review Checklist

This is definitely not all-inclusive, but it's generally what I look for while doing code reviews (for Salesforce projects)

Triggers
  • Is the trigger bulk safe?
  • Is it safe from recursion?
  • Has the been logic abstracted out of the trigger?
  • Does the trigger use the correct trigger context?
    • e.g. Are updates to non-trigger objects done in the after context?
  • Have all of the trigger contexts been tested?
  • Can the trigger be done with a workflow instead?
Visualforce
  • Have repetitive layouts been abstracted out into a component?
  • Has Javascript/CSS been moved to a public static resource to take advantage of caching?
    • There are exceptions to this such as needing to use merge fields
  • Does the CSS use !important (they shouldn't)
  • Are custom labels used as opposed to hard coded text? (Especially important for multi-language orgs)
Governor Limits
  • Are all SOQL calls outside of loops?
    • Remember to check if methods that contain SOQL are not called in loops
  • Are all future calls outside of loops?
  • Are all callouts outside of loops?
  • Are all DML calls outside of loops?
Tests
  • Do the tests have valid assertions?
  • Have the creation of test records been abstracted out?
  • Are all public methods tested?
Miscellaneous
  • Are all comments necessary and valid?
  • Is the code as modular as possible?
    • If you have to scroll to see what a method does, it's probably too long
  • Have inputs been sanitized as necessary?
    • String.escapeSingleQuotes is your friend!
  • Is everything only as visible as needed?
    • I default to private before making something public
And for our Managed Package friends
  • Are any namespaced items called by a string?
    • Salesforce will change record.MyField__c to record.myNamespace__MyField__c, but it will not be able to pick up record.put('MyField__c', aValue) and you'll get an exception at run time.
  • Are you sure you want that method/class to be global?
  • Are new fields the correct data type?
    • This is always important, but in a Managed Package it's forever.
Again, this is not all inclusive and it should be treated as a guideline, not as a strict rule. There are exceptions to everything and sometimes you have to skip things and take the technical debt in order to get something out in time. 

So get out there and start that code review! Because while it's not always fun to eat your veggies, you'll wished you did when things blowing up and you are all backed up.

The Ant Jobs Go Marching

When I was just a padawan Salesforce developer, ant was something I heard about a lot, but didn't really understand. And for a long time, I didn't really need to. Eclipse (and then later Mavensmate) handled all of my interaction with the metadata, I ran tests in the Salesforce UI, and I used change sets to deploy code between sandboxes. As far as I knew, ant was just another way to do the things I was already doing that just seemed overly complicated for no good reason.

Fast forward a few years later and I start working on a managed package at a new company. My typical MO for starting a new project was as follows:

  • Pull down the metadata from version control
  • Open up the folder in Sublime Text/Mavensmate
  • Add the sandbox credentials to create the connection to the org.

That last step, however, proved problematic: there are no sandboxes for managed packages, just the managed package org itself. And I couldn't work directly in there; that would be tantamount to working directly in production, a huge no no. When I asked my supervisor where I should work on the feature:

"Oh, just use ant to deploy it to a new developer org"
"Sure... ant. That's a thing that I know."

In retrospect, I should have just asked, but I was so nervous about starting this new job and didn't want to ask what I thought was a dumb question.

Finding a straightforward answer turned out to be surprisingly difficulty, so hopefully this will help someone in the future.

A Beginner's Guide to the Salesforce Ant Migration Tool

Following these steps will help you get started with using ant from the command line. I am primarily a Mac/Linux user, so I'm not sure how much of this will translate to Windows.

Step 1: Install Ant

First you want to check if you even have ant on your computer. I think Ubuntu and Mac OSX both have this pre-installed because I don't remember ever installing it myself. Either way, you'll want to check to make you sure you have the latest version:

Linux Users:

sudo apt-get update
sudo apt-get install ant

Mac Users:

brew update
brew install ant

The first command updates your package list so that the latest versions of software are retrieved. The second command installs or updates ant itself.

Step 2: Install the Force.com Migration Tool

These are the directions copied from Salesforce's documentation :

  • Log into a Salesforce organization on your deployment machine.
  • From Setup, click Develop | Tools, and then click Force.com Migration Tool.
  • Save the .zip file locally and extract the contents to the directory of your choice.
  • Copy ant-salesforce.jar and paste into your Ant installation's lib directory. The lib directory is located in the root folder of your Ant installation.

I usually gave up at the last step because I didn't know how to find my ant installation. Run this command to find it:

yourUsername@YourComputer:~$ whereis ant
ant: /usr/bin/ant /usr/bin/X11/ant /usr/share/ant /usr/share/man/man1/ant.1.gz

After checking through each of those directories, I found my lib folder in /usr/share/ant/lib . I use Codeship to automate testing so I actually put ant-salesforce.jar in a folder named lib in the base of my project, i.e. the same folder where the src folder sits for my Salesforce project. Either is fine. In fact, I typically just put it in the same folder as my project so that whoever pulls down the project from version control can quickly starting using the ant commands.

As for what's actually in the .jar file, don't worry. Just know that this is the file that allows you to use the ant commands created by Salesforce.

Step 3: Create your build.xml file

Create a file called build.xml at the base of your project. Here's a copy of a simple build.xml file:

<project basedir="." xmlns:sf="antlib:com.salesforce">
<taskdef resource="com/salesforce/antlib.xml"
classPath="lib/ant-salesforce.jar"
uri="antlib:com.salesforce"/>
<property file="build.properties"/>

<target name="deploy" description="Deploys the code to salesforce.com">
<echo message="Deploying to Salesforce..."/>
<sf:deploy username="${username}"
password="${password}"
serverurl="login.salesforce.com"
deployRoot="src"
runAllTests="true"/>
</target>

<target name="test" description="Deploys the code to salesforce.com">
<echo message="Running all tests in Salesforce..."/>
<sf:deploy username="${username}"
password="${password}"
serverurl="login.salesforce.com"
deployRoot="src"
checkOnly="true"
runAllTests="true"/>
</target>

<target name="retrieveCode">
<sf:retrieve username="${username}"
password="${password}"
serverurl="login.salesforce.com"
retrieveTarget="src"
unpackaged="src/package.xml"/>
</target>
</project>

Here are some definitions for the xml tags to describe what is going on in here:

  • taskdef : defines where your ant-salesforce.jar file is located
  • target : defines a command
    • name : the name of the command. These can be named anything you want, I could have put name="doTheThing" and it would wouldn't matter.
  • echo : Text that is displayed on the command line when starting the command. This is useful for letting the user know what is going on.
  • sf:deploy : This command is defined by that .jar file. As you can probably guess, this will deploy code to Salesforce. The attributes for this tag defines the user credentials to log into Salesforce as well as directions on what to deploy.
    • deployRoot : Defines where to deploy from. Above, it is set to src , so all the metadata under the src folder will be deployed.
    • checkOnly : When set to true , the command will only validate that the metadata included in src can successfully be deployed (compiled) without actually changing anything on the server.
    • runAllTests : This runs ALL of the tests in the org, including the tests in managed packages. Currently there's no way to only run non namespaced tests, which can be very annoying.
  • sf:retrieve : This is another command defined by that .jar file. This pulls metatdata down from an org.
    • retrieveTarget : where to store the retrieved metadata
    • unpackaged : where you package.xml file is located. More on the package.xml file later

Step 4: Set up your properties file.

You'll notice that for username and password that actual credentials are not provided, but instead have merge fields. If you're not planning to version control this file, then you can replace {$username} with your actual username. However, if you are putting this in version control, and your version control is hosted online (like with GitHub) then you'll want to put that information in a file that is included in your .gitignore file. The property tag defines where your merge fields are defined. I called the file build.properties, though I'm pretty sure you can call it whatever you want.

Add a file called build.properties at the same level as your build.xml file. It'll look like this:

username  = [email protected]
password = passwordAndSecurityToken

IMPORTANT: Do NOT add this file to version control or push it online. That can be a very costly mistake . If you do accidentally commit the file, immediately change your password.

Step 5: Setup your package.xml file

Your package.xml file defines what is being pulled down from and being to deployed to Salesforce. You've probably seen this while working with Eclipse or Mavensmate.

Here's a really basic file that Eclipse builds by default

<?xml version="1.0" encoding="UTF-8"?>
<Package xmlns="http://soap.sforce.com/2006/04/metadata">
<types>
<members>*</members>
<name>ApexClass</name>
</types>
<types>
<members>*</members>
<name>ApexComponent</name>
</types>
<types>
<members>*</members>
<name>ApexPage</name>
</types>
<types>
<members>*</members>
<name>ApexTrigger</name>
</types>
<types>
<members>*</members>
<name>StaticResource</name>
</types>
<version>32.0</version>
</Package>

The types tag define the kinds of metadata you can pull down. The members tag define which files to pull down. In this case I used a wildcard to pull down everything of each type. For more information, checkout Salesforce's documentation on Metadata types

After all that, you should be done! Now open up your project's base folder and try some ant commands. I suggest running one that has checkOnly = "true" so that you don't accidentally push something you didn't want to.

ant test
test:
[echo] Running all tests in Salesforce...
[sf:deploy] Request for a deploy submitted successfully.
[sf:deploy] Request ID for the current deploy task: 0Afo000000E4G8VCAV
[sf:deploy] Waiting for server to finish processing the request...
[sf:deploy] Request Status: Pending
...

Why bother?

If you are working with managed packages, ant is absolutely necessary for merging and deploying code between dev orgs and managed package orgs (Because you're not working directly on the package org, right?). It's also extremely useful when working with teams. You can easily deploy changes made by other teammates by pulling down their branches from version control and using ant to update your org.

If you are working in sandboxes for one org, ant can make deployments between sandboxes and to production easier as well. If you need to deploy profile changes, I recommend sticking to change sets, but for everything else ant can be much more efficient. Building change sets can be extremely tedious, clicking on EVERY file you need deployed. You can more quickly create a src folder and a package.xml file with all of the changes you want to deploy, all locally on your computer.

This should be enough to get your started with ant, but as you get more comfortable with the tool I suggest checking out the Force.com Migration Tool Guide for more functionality.