From Maybe to Ensure
A digression on naming functions, and ensuring they only do one thing
by Danver Braganza on 2020-11-15
Every function should do exactly one thing. This is a software engineering maxim that was transmitted to me during my education in Software Engineering and repeated in numerous books on programming. My belief in it continues to be strengthened through experience.
Of course even simple functions are composed of many machine instructions, and would be of very little use if they weren't. However, functions exist to create a level of abstraction, an informal set of conceptual building blocks. In most programs you will write, these levels are not clear cut and defined by algebra, geometry or physics, but loose and malleable, driven by your domain model and the needs underlying your business logic.
Therefore, you should expect the definition of "One Thing" to admit some wiggle room. Nevertheless, it is a very effective heuristic to apply to your code. It's a fairly easy test to apply, and functions that clearly fail it are very strong candidates for refactoring. Try not to worry about the edge cases.
An even stronger version of this test is that functions should never have a conjunction in their name. A name should clearly express what the function does at its level of abstraction; if that name has an _and_ in it, then the function is failing to do one thing by its own admission.
I've worked on code-bases with chronic problems that were caused largely by functions with names like:
create_or_update_xxx()
login_or_signup_user()
find_and_transform()
maybe_cache()
These problems went away once we were able to refactor these functions to only do one thing, which was easy, and then change their call sites to use these new functions, which was not. This, like many warnings against vice, is a much better preventative than a cure.
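As a rough illustration of the kind of split described above, here is a minimal sketch of what might replace a create_or_update_xxx() style function. Everything in it is hypothetical: the in-memory _users store and the names find_user, create_user and update_user are stand-ins invented for this example, not the code from those code-bases.

```python
from typing import Dict, Optional

# Hypothetical in-memory store standing in for a real database.
_users: Dict[str, str] = {}


def find_user(email: str) -> Optional[str]:
    """Return the stored name for this email, or None if no such user exists."""
    return _users.get(email)


def create_user(email: str, name: str) -> None:
    """Create a new user. Refuses to overwrite an existing one."""
    if email in _users:
        raise ValueError(f"user {email!r} already exists")
    _users[email] = name


def update_user(email: str, name: str) -> None:
    """Update an existing user. Refuses to create a new one."""
    if email not in _users:
        raise KeyError(email)
    _users[email] = name
```

With this shape, the decision between creating and updating moves to the call site, where it is visible, instead of hiding inside a function whose name contains an "or".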
Given that, I was called to reflection by a co-worker's code review, where he pointed out that I'd written a group of functions with the word "maybe" in their names. It was in a piece of code that handled integration with a legacy system that was loose with types, and so these functions existed to perform type conversion--if needed.
A typical candidate looked like maybe_convert_to_int(var: Any) -> int, and its body would check whether the incoming variable already had the right type and, if not, apply the necessary conversion.
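A minimal sketch of what such a function's body might have looked like follows. The original implementation is not shown in this post, so the details here are assumptions, including the choice to treat bool as something that needs converting.

```python
from typing import Any


def maybe_convert_to_int(var: Any) -> int:
    """Return var unchanged if it is already an int, otherwise convert it."""
    # Assumption: bool is a subclass of int in Python, so it is excluded
    # here and sent through the conversion path instead.
    if isinstance(var, int) and not isinstance(var, bool):
        return var
    # int() raises TypeError or ValueError if var cannot be converted.
    return int(var)
```

Note how the docstring already reads as two branches joined by "otherwise", which is the "how" of the function rather than the "what".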
At first, I was puzzled and dismayed. To me, it was obvious that these functions were necessary because they were dealing with unreliable data. However, the way that I had named them, and therefore justified them to their calling clients, expressed an uncertainty as to what they actually did. Maybe convert to int? And despite what I'd said earlier, about not sweating the edge cases, I was beginning to doubt my own advice.
Then it dawned on me that I was committing the error of analysing these functions at the wrong level of abstraction. By focusing on how they operated, I was obscuring what exactly they did.
In this case, I was able to fix the problem by considering the reason you'd call this function: to ensure that a certain variable always has a given type. Viewed through that lens, the function only does one thing: it ensures that it returns an int. So it should be called
ensure_int(var: Any) -> int
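To make the caller's point of view concrete, here is a hedged sketch of ensure_int together with a call site. The body, the error handling and the record dictionary are assumptions made for illustration; only the name and signature come from the text above.

```python
from typing import Any


def ensure_int(var: Any) -> int:
    """Return var as an int, raising ValueError if it cannot be one."""
    # Assumption: bool is excluded so that True/False are normalised to 1/0.
    if isinstance(var, int) and not isinstance(var, bool):
        return var
    try:
        return int(var)
    except (TypeError, ValueError, OverflowError) as exc:
        raise ValueError(f"cannot interpret {var!r} as an int") from exc


# Hypothetical record from a legacy system that is loose with types.
record = {"id": "17", "quantity": 3}

order_id = ensure_int(record["id"])
quantity = ensure_int(record["quantity"])
```

At the call site, the name states the guarantee the caller relies on: whatever came in, what comes out is an int or an error. Whether a conversion happened along the way is an implementation detail.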
After renaming my functions, I resubmitted my review and renewed my commitment to the framework above.
The next time I run into a dilemma like this, it might require more than a simple rename. If my initial rough-cut of the function mixes two responsibilities from the consumer's level of abstraction, this simple lexical rule might trigger their disentangling. That would be well worth it.
Maybe your functions are doing one and only one thing. Ensure that, when viewed from the perspective of your consumer, they are.