Recently, I had the opportunity to work on legacy code with several teams from various organizations. I would like to share my experience.
We usually start by choosing a piece of code that is “painful”: it changes frequently and is “scary” to touch because of its complexity. We explain that our purpose is to make the code simpler, more readable, and easier to change. Establishing the motivation for what we do is important!
In essence, the steps we take are:
1. Use extract method/rename to make the code more readable (specifically applicable for very long methods).
2. Write and execute the first unit test (that’s usually the toughest part) as described by Michael Feathers.
3. Add more unit tests until the area you want to refactor is satisfactorily covered.
4. Refactor to make the code more maintainable (working in very small steps, as described by Joshua Kerievsky).
5. Make the required change using TDD.
The purpose of (1) is to see the forest, not the trees. Long methods tend to be unreadable. Using “extract method” helps you see clearly what’s going on. Once you can see the structure, you can start to rename. Arlo Belshee talks about this.
As an example, look at these two statements:
In the first if statement, we have extracted the condition into a method. Remember that what you do most of the time with code is read it. You need to make it readable.
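Here is a minimal sketch of the idea in Python. The `Order` class, its fields, and the function names are hypothetical, invented only for this illustration and not taken from the code we worked on. The first version makes the reader decode the condition; the second gives it a name.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical fields, chosen only to make the condition non-trivial.
    total: float
    customer_years: int
    is_blocked: bool


# Before: the reader has to decode the condition inline.
def discount_before(order: Order) -> float:
    if order.total > 100 and order.customer_years >= 2 and not order.is_blocked:
        return order.total / 10
    return 0.0


# After: the condition is extracted into a method whose name says what it means.
def is_eligible_for_loyalty_discount(order: Order) -> bool:
    return order.total > 100 and order.customer_years >= 2 and not order.is_blocked


def discount_after(order: Order) -> float:
    if is_eligible_for_loyalty_discount(order):
        return order.total / 10
    return 0.0
```

The behavior is identical in both versions; only the second one can be read without stopping to work out what the condition is for.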
Item (2), as mentioned above, is the difficult part. You need both to master the technique and to have the resolve to do it. I usually do it with the entire team, so that together we find the required courage.
For instance, take a look at this behemoth method.
Pure fun, eh? This is something I call an amusement park method. We usually start by trying to call it with null. Sometimes it actually works and we have a first unit test. Then we slowly start to fill in the parameters. Maybe instead of null we send an empty dictionary. Maybe instead of an empty dictionary we send a dictionary with two entries. And if there is no other choice, we sometimes run the actual application and serialize the parameters, to be deserialized later in the unit test.
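The progression from null to an empty dictionary to a dictionary with two entries might look roughly like this in Python. It is only a sketch: `summarize_settings` is a hypothetical stand-in for the legacy method, and the expected strings are whatever the code was observed to return, pinned down as-is.

```python
def summarize_settings(settings):
    """Hypothetical stand-in for a legacy method with murky inputs."""
    if not settings:
        return "no settings"
    parts = [f"{key}={settings[key]}" for key in sorted(settings)]
    return ", ".join(parts)


def test_call_with_none():
    # Step 1: pass None and simply record what comes back.
    assert summarize_settings(None) == "no settings"


def test_call_with_empty_dictionary():
    # Step 2: replace None with an empty dictionary.
    assert summarize_settings({}) == "no settings"


def test_call_with_two_entries():
    # Step 3: a dictionary with two entries exercises the real logic.
    result = summarize_settings({"retries": 3, "mode": "fast"})
    assert result == "mode=fast, retries=3"
```

The tests are written as plain functions so that a runner such as pytest can collect them. Each one asserts the current behavior, not the desired behavior; the goal at this stage is a safety net, not a specification.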
Sometimes we change a method from private to public, sometimes we add a method to better control a member, and there are more vicious things we do. Sometimes it can take a whole morning to do this. However, once you understand this, it becomes very simple.
Then you start looking at coverage.
Once you have the first test, things start to move faster (3). You start adding more and more tests. You start looking at coverage reports to see which lines of code are covered and which aren’t. If something is not covered, you can add another unit test to cover it.
Now (4) we can start to make bigger changes. Once you have the unit tests in place you feel free. You make a small change, you run the test. Another small step and the tests run again. Some IDEs have plug-ins that run the tests every time something is changed.
This is the time to become more familiar with the automatic refactoring tools of your IDE. Make sure you are comfortable with introducing parameters, fields, and variables. Extract class is a very nice one, and so is the ability to convert a method to static and to move a method. The trick here is to make as few manual changes as possible and to move the code around fluently.
Many times by this point, there is a small disappointment. The code you feared in the morning now looks quite simple. The real challenge is making the code simple and solving the puzzle.
Now we have reached the point where we can quite easily add some code to fulfill a new requirement (5). We can add a new test, see it fail, make the required change, see it pass, and maybe do a little refactoring. Nothing like the joy of seeing unit tests turn from red to green.
(the above are unit tests from a very nice exercise called Gilded Rose)
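The same red-to-green cycle can be sketched in a few lines of Python. This is not the Gilded Rose code: the shipping functions and the free-shipping rule are hypothetical, made up for the illustration. The two versions of the function sit side by side only so that both the failing and the passing state are visible in one place.

```python
def shipping_cost_v1(order_total: float) -> float:
    """Behavior before the new requirement: a flat shipping fee."""
    return 5.0


def shipping_cost_v2(order_total: float) -> float:
    """After the change: orders of 50 or more ship for free (hypothetical rule)."""
    if order_total >= 50:
        return 0.0
    return 5.0


def new_requirement_holds(shipping_cost) -> bool:
    """The new test, written first: large orders ship free, small ones do not."""
    return shipping_cost(60) == 0.0 and shipping_cost(20) == 5.0


# Red: the new test fails against the existing code.
assert not new_requirement_holds(shipping_cost_v1)

# Green: it passes after the smallest change that satisfies it.
assert new_requirement_holds(shipping_cost_v2)
```

In real work there is one function, not two; you run the new test, watch it fail, edit the function, and run the test again.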
And that’s it.