Programming: Refactoring
You might want it, but can you?
Refactoring is a common practice to improve your code, either to add flexibility or reduce a technical debt. However, depending on your tools, but also your coding practices, it may be a more or less easy, efficient and quick task.
Be refactorable
The first thing to achieve a performant refactoring is to make it possible. This basically implies two things:
- use the proper tools;
- have your code ready to be consumed by those tools. Yes, not every code is equal in regard to refactoring.
Tools: IDE
I have known times where no refactoring actions were available, since we were using plain text editors or very limited IDEs. Actually, “refactoring” was not a word we used, as we wanted to avoid such tasks: changing a function signature, for instance, might require a huge amount of tedious editing of each call of it. You had to find all of them as text occurrences 😱 (ok, I know some of you still look for code symbols using text search nowadays, either by ignorance of your IDE capabilities or because of some weird masochist behavior, but you shouldn’t do that). Same for renaming: the cost of updating all references manually, especially in an interpreted language, was not always worth the risk to producing many runtime errors.
Hopefully in the 2000s, the first refactoring-enabled IDEs (like Visual Age for Java or the infamous Eclipse rival of Sun products) started to emerge and boosted both coding productivity and maintainability, since there was no more excuses to avoid refactoring (yes, there was no JavaScript IDE at this time, since most “serious” developers wouldn’t touch JS with a stick before a decade).
In 2003 I started to use IntelliJ IDEA (not yet under the JetBrains umbrella) and this was a mind blowing experience: that tool was so powerful in terms of navigation, completion and inspections, but also in terms of refactoring (it was even able to apply patterns on code) with the then-new concept of intentions, which are refactoring suggestions. With a single keystroke, you were able to split code, invert a boolean expression, rename, move… actually the whole tool was designed for refactoring from day one, since it started as a JBuilder renaming plugin. It was also focused on productivity, with a heavy emphasis on keyboard shortcuts, so that you could never loose time using a mouse again.
Since then a number of sophisticated text editors (the late Atom, Sublime Text…) and IDEs have emerged with similar capabilities (but never on par), Visual Studio/Visual Studio Code (which is the Eclipse successor to me, in terms of “I use it because everybody use it because it’s free”) and of course AI completion (which is included in JetBrains IDEs as well).
But enough speaking of products history, let’s get to the point about tools: in order to be efficient in refactoring, you need first to:
- use an IDE that supports refactoring actions like renaming, moving, signature change, extracting functions/methods, inlining code, safe deletion and regular expressions replacements at least.
- know about those IDE capabilities: too many times I see developers under-using their tool, at worse by performing refactoring tasks manually, and at best by performing them through menu clicks instead of using intentions or shortcuts. The time you’ll spent on learning and training these productivity techniques will be a profitable investment, I can tell you that.
Code: Cohesion
This uses to be a metric checked on classes to make sure they don’t break the SRP: the more class members (i.e. attributes and methods) are inter-dependent, the less they are likely to handle different responsibilities.
While this is less often mentioned, the same rule applies to plain code lines: the more variables and function calls are related, the more they should be collocated. For example, check the code below:
let x // Declaration
doSomething()
x = getX() // First usage
doSomethingElse()
useX(x) // Second usage
Here, doSomething()
and doSomethingElse()
are located between usages of the x
variable, but they are not related to x
(they don’t have side effect on it). While this is perfectly valid code and might not seem a big deal, this prevents extracting all those uses of x
in a getAndUseX()
function. Simply put, the lack of cohesion of this code block prevents it to be refactored.
Instead, the same instructions can be rearranged in an order that outlines two sets of independent, cohesive statements groups:
doSomething()
doSomethingElse()
const x = getX() // Declaration and first usage
useX(x) // Second usage
Now that the related lines are collocated, the cohesive parts can be extracted into a function:
getAndUseX() { // Extracted function
const x = getX()
useX(x)
}
doSomething()
doSomethingElse()
getAndUseX() // Call to extracted function
A similar example of non-cohesive code that is the improper location if conditional checks:
const x = getX() // Fetch var
if (condition) { // Condition not using var
use(x) // Action using var
}
Those three lines might look simple, but can actually raise three problems:
- about performance:
getX()
might be performing (now or later) a time-consuming action. Ifcondition
is not met, this time will be spent for nothing. Imagine if this code is looping a thousand times. - about side effects: if
getX()
performs an action that changes some state, (adds a log in a database, sends a message somewhere, whatever), it will be performed whereas it probably shouldn’t. - about refactoring (which is the focus of this article): you won’t be able to extract the code related to
x
because of thecondition
in between.
Instead, a more performant code with no unsolicited side effects would be:
if (condition) {
const x = getX() // Fetch var
use(x) // Action using var
}
with the bonus benefit to allowing the method extraction refactoring:
function getAndUseX() { // Extracted function
const x = getX()
use(x)
}
if (condition) {
getAndUseX() // Call to extracted function
}
Code: Avoid multiple returns
Another common code structure which prevents refactoring is the common practice of early return
s:
parseDate(dateStr) {
if (!dateStr) {
return undefined
}
const groups = dateRegexp.exec(dateStr)
if (groups) {
return new Date(groups.year, groups.month, groups.dayOfMonth)
}
return undefined
}
This practice aims at avoiding “else” blocks and the implied indentation. However, by defining multiple exits, it:
- complicates debugging by requiring to set break points on multiple code lines if you want to check exits of the function. As time goes, you’ll probably encounter this infamous situation of waiting eons for a breakpoint to be reached, before realizing that it never was because the function actually reached a
return
before. - complicates logging, since there is three different locations where the result should be logged, if required;
- prevents extracting parts of it in a separate function (if that part has itself multiple returns);
- prevents inlining the function itself, since a code block can only have a single flow.
Instead, the code above can be rewritten with a single exit:
parseDate(dateStr) {
let date
if (dateStr) {
const groups = dateRegexp.exec(dateStr)
if (groups) {
date = new Date(groups.year, groups.month, groups.dayOfMonth)
}
}
return date
}
Using exactly the same number of lines — with the cost of one indentation — you wrote a code that eases debugging and logging, can be inlined, and which parts can be extracted.
Code: Avoid optional parameters
Optional parameters in functions/methods are very convenient; as attractive as sugar, and as nocive as well. I try not to use them for a number of reasons but one of those reasons is that I will have a hard time refactoring those signatures in languages which rely on parameters order (i.e. don’t support named parameters).
Take, for instance, this code using a function f
with an optional parameter:
function f (a, c?) {
console.log("a=", a, ", c=", c)
}
f("a", "c") // Prints "a=a, c=c"
Now suppose you want to insert a parameter. If types are not known or just match (successive string-typed params for instance), the IDE function signature refactoring might have no clue about an incorrect call that should be updated:
function f (a, b, c?) {
console.log("a=", a, ", b=", b, ", c=", c)
}
f("a", "c") // Prints "a=a, b=c" !
If you never encountered this, you probably don’t use IDE signature refactoring capabilities 👎 or you don’t use optional parameters 👍.
Instead, you can either avoid using optional parameters (i.e. pass undefined
explicitly) or use named parameters to avoid confusion. If your language doesn’t support it (such as JavaScript), you can still use a named parameter surrogate pattern, through the use of a value object:
function f (params) {
console.log("a=", params.a, ", b=", params.b, ", c=", params.c)
}
f({a, b}) // Prints "a=a, b=b, c=undefined"
f({a, c}) // Prints "a=a, b=undefined, c=c"
Cohesion is also for files
Now let’s zoom out from code and consider another kind of project asset: files. As you know concepts are often composed by multiple files, such as code (such as MyConceptController
, MyConceptService
, MyConceptStorage
…), style (MyConcept.css
), documentation (MyConcept.md
), tests (MyConcept.test
) and so forth for composed sub-concepts in sub-directories.
What if you want to perform refactoring on such an concept, like renaming or moving it? Depending on where those files are located, you can imagine the hassle difference between looking for them in a single myconcept/
directory or in several controllers/myconcept
, services/myconcept
, controllers/styles
, etc.
You got it: even with files, cohesion is important. Files should be collocated as well.
Refactoring cases
Now that you set up your code in a way that is refactorable, you may encounter a number of motivations for refactoring. Here are a few that you may encounter.
Invariants
Sometimes you’ll find that something is constant, not influenced by the calling context, that is, parameters:
class MyClass {
constructor(value) {
this.value = value
}
method(params) {
const thing = new Thing(this.value) // Never depends on params
use(thing, params)
}
}
Each method()
call above will create a fresh instance, which (unless having a new instance has a side effect) means that this thing is part of the invariant of the object class. As a result, the code can then be refactored as:
class MyClass {
/**
* @readonly
*/
thing
constructor(value) {
this.thing = new Thing(value)
}
method(params) {
use(this.thing, params)
}
}
This way you will:
- simplify code;
- improve performance, by avoiding unnecessary re-instantiation;
- improve reliability by ensuring the invariant part cannot vary (i.e. is read-only).
Block comments
It is quite common in “long” code blocks to see comments added to clarify what a group of lines does:
...
// This code fetch a user and initializes its messages
user = fetchUser(userKey)
messages = fetchMessages(user.locale)
userReady(user)
...
Now think a minute about the reason why you commented those lines inside the block. Obviously because this wan’t clear enough. Also because those lines formed a logical group of instruction that could be summarized with your comment.
So, why isn’t this group a dedicated function? Each time you do this, refactor the block as a function:
/**
* This code fetch a user and initializes its messages
*/
loadUser(userKey) {
user = fetchUser(userKey)
messages = fetchMessages(user.locale)
userReady(user)
return user
}
...
user = loadUser(userKey)
...
You comment will become a call of a function whose name could be exactly what you put in the comment. If too long, this could be the documentation of that function. This way, you’ll improve drastically the readability of your code by both adding semantics and documentation to a group of lines and reducing the size of the initial block. And don’t worry about performance, the VM can inline those things.
Resource management
Another common case is error management when acquiring and releasing resources.
Conclusion
We’ve seen a few refactoring cases but, more importantly, rules of thumb to keep your code refactorable. This notably implies to regularly watch for the cohesion of your code, whether it is class fields or plain code lines.
As a side benefit, writing code that is refactoring-ready actually leads to better readability.