De Morgan's Flaw: Perform This Refactor Carefully
A pattern that I've noticed often causes bugs
by Danver Braganza on 2021-03-29
A common pattern for bugs
The other day, I caught myself on the verge of introducing a bug into my code. The type of bug was not subtle, and it was not the first time I’d made the same mistake before. Like any good software entomologist, I wanted to name this class of bug, describe its characteristics, and do what I could to prevent it occuring again.
I’ve dubbed this kind of bug a De Morgan’s Flaw, because they come about from applying De Morgan’s Laws incorrectly. Note that the name is not meant to impute any flaw to Augustus De Morgan himself. The laws are completely sound.
They are simple translations that allow you to write the “not of an or as an and of two nots, or vice versa”. I have to admit, I had too much fun saying that out aloud just now! Because that is probably as clear as mud:
not (a or b) == (not a) and (not b)
not (a and b) == (not a) or (not b)
The general pattern is also very simple to remember.
At every place
X
in the following expression, introduce a boolean negation if there isn’t one (or, equivanently remove one if there is), andReplace AND with OR, or vice versa
X ( X a {AND|OR} X b )
Once you learn this pattern, it becomes incredibly useful in refactoring unwieldy code. Take for instance a line like:
while not (stopped or duration > time_spent):
# Handle the contents of the loop here
While cleaning it up, you might want to remove that initial negation, so that each clause of the predicate becomes easier to hold in your reader’s head.
Applying De Morgan’s Law is pretty straightforward, and gives us
while running and timespent <= duration:
# Handle the contents of the loop here
Notice that we also took the opportunity to spell not stopped
as running
and
flipped the direction of our range
check–thereby
reducing the cognitive burden on our reader. This is something you definitely want to do.
The devil in the details
The danger lurking while you make this refactor is that it rarely happens in
isolation from other changes as you work on your code. Such a rewrite is usually
initiated when the predicate is the process of being redefined or the value of
one of the input booleans is flipping. For instance, not stopped
changed to
running
above. As the wave of that change propagates through the code, it
carries with it enough activation energy to cause a De Morgan flip of this
conditional–with a chance of making a mistake, because there is already so much
going on at the same time.
For instance, in the example above, not only did we switch from an or
to an
and
, but we changed around the contents of each boolean sub-expression. This
is the risk. It is exactly when I’m juggling
eggs in this manner that
I’m most susceptible to slipping up and introducing a bug.
The other problem that leads to this kind of bug evading our unit tests is that the number of cases to test increases at two to the power of the number of variables. And if you missed flipping the sign for one of these variables, their truth tables will continue to agree in half of the cases! This means you can partially test your code, even with 100% line coverage, and not notice the flaw you introduced.
Don’t (write fast or not have unit tests)
There are two simple remedies to avoid introducing a De Morgan’s Flaw in your code. The first is writing slow, and the second is making sure you have sufficient automated testing.
By writing slow, I mean limiting the amount of refactor that is in flight at a time. You can do this by breaking down your refactor into simple mechanical chunks and running the code through your test suite in between each iteration. Although it is slower, it is more error-resistant to leave yourself a comment about your next change, and only make a partial change to your code.
Having sufficient coverage of the code in your tests will also catch a De
Morgan’s Flaw pretty quickly. However, the effort to fully specify the truth
table for every combination of booleans in your code is often prohibitive.
Sometimes, the booleans that go into these predicates don’t have the same level
of significance–I once found a De Morgan’s Flaw hiding in code that mishandled
a boolean for the variable dry_run
.
Finally, I hope that by simply having a name for this phenomon will give you the awareness of when one of these is about to happen, and help you spot them in code reviews.
Other articles you may like
- Misapplying LazyRecursiveDefaultDict A cautionary tale of how I misapplied the wrong software tool to a problem, and what I've learned from it.
- Always spell your range checks like this By always spelling your range conditionals start <= index < capacity, you make your code more easily grokkable and reduce the chance of errors.
- The Empty Case is not Special Instead of explicitly handling the empty case in functions, try this instead.