The Sorting Hat and Hash Functions
Why Harry Potter's Sorting Hat would make a poor choice for a hash function
by Danver Braganza on 2020-12-06
The Sorting Hat from the Harry Potter series by J. K. Rowling captivated me the first time I was introduced to it. This was long before I had any computer science training. The idea of a magical hat that could plumb the innermost depths of your soul and reveal truths about your personality had me hooked as well as millions of children the world over. It speaks to a need we all have for self-knowledge, that also explains the enduring popularity of such personality tests as the Enneagram and Meyers-Briggs Type Indicator, or MBTI, especially among adolescents and young adults1.
I recently re-watched the entire Harry Potter series, and it jumped out at me that the Sorting Hat could be viewed as a hash function in Computer Science. I thought I’d take a little time to explore the properties of the Sorting Hat, the properties of an ideal hash function, and how well the Hat would fare if put to work, say in indexing our Hash Table or as a load-balancer for HTTP requests.
The Sorting Hat in its own words
Any good algorithm should have sufficient documentation so that a programmer can understand what it does and how to use it. J. K. Rowling has the Hat describe itself in The Philosopher’s Stone with a song.
“Oh you may not think I’m pretty,
But don’t judge on what you see,
I’ll eat myself if you can find
A smarter hat than me.
You can keep your bowlers black,
Your top hats sleek and tall,
For I’m the Hogwarts Sorting Hat
And I can cap them all.
There’s nothing hidden in your head
The Sorting Hat can’t see,
So try me on and I will tell you
Where you ought to be.
You might belong in Gryffindor,
Where dwell the brave at heart,
Their daring, nerve, and chivalry
Set Gryffindors apart;
You might belong in Hufflepuff,
Where they are just and loyal,
Those patient Hufflepuffs are true
And unafraid of toil;
Or yet in wise old Ravenclaw,
if you’ve a ready mind,
Where those of wit and learning,
Will always find their kind;
Or perhaps in Slytherin
You’ll make your real friends,
Those cunning folks use any means
To achieve their ends.
So put me on! Don’t be afraid!
And don’t get in a flap!
You’re in safe hands (though I have none)
For I’m a Thinking Cap!”
—Harry Potter and the Philosopher’s Stone, J. K. Rowling
That could be more concise, but it covers the bases adequately. The input is your head, the output is one of the four Great Houses of Hogwarts, and it does that by analyzing your internal state.
Sorting vs Ordering
At first blush, it appears that the Sorting Hat is misnamed. The word
computer science specifically means to put a collection in order, for a given
definition of “in order”. Per the expectations of computing terminology, a
Sorting hat, then, would arrange a group of magic students in a row, or assign
them a numbering. If Dumbledore had the students line up in order of the marks
they scored in their final exams, for example, that would
According to the vernacular definition of sorting, however, the Hat is named perfectly well. Most non-programmers understand sorting to mean assigning things to a bin, or putting them into separate piles. You may sort your laundry into separate hampers, or use multi-million dollar machines to sort letters and packages to go to different locations, as the USPS famously does. There’s even a MIT-licensed library named sortmail that’s available on most UNIX distributions.
I was surprised to learn that the computer science use of the term
ordering is something of a historical quirk that was actually derived from the
more general definition. During the era of programming on punched cards, Card
Sorters could be used
to assign cards into one of 12 pockets based on the value in a particular
column. Programmers would use columns 73-80 for the purpose of numbering their
cards. If they ever accidentally got their order shuffled, they would use these
machines to restore the original order. By putting an entire stack of cards
through the machine, and going through multiple passes, it was possible to use
“sorting” in this sense to “order” your cards. Over time, the naming stuck, and
our field came to refer to ordering as sorting.
Now that we’ve discussed that, we shall continue to refer to the Sorting Hat as such, but knowing that we’re referring to the operation that programmers call hashing.
How the Sorting Hat measures up
Thus we’ve seen that a hash function assigns an input value to one out of a small set of bins. Some of the uses for hash functions are in building hash tables, caches, associative arrays and load balancers.
Let’s look at each of the properties of a good hash function, and see how well the Sorting Hat meets the requirement.
You should get always the same result if you run the same hash function on the same input multiple times. Or, stated another way, if you put the Sorting Hat on the same student’s head, you should always get the same house.
How you score the Sorting Hat on this requirement depends on your philosophy of free will. Many students perceive their Sorting as a conversation with the Hat, with them arguing or begging the hat to give them a certain result. If people may change their personality and outlook over their life, then it is possible that they would no longer hash into the same bucket, and you’re going to have some hard-to-debug errors in production.
We accept axiomatically that a faster program is a better program. As I write this, I’m conscious that I’ve never questioned this assumption; the obsession with performance is deep in our profession. Something to think about another day.
Hash functions are rarely the application themselves, but are often used as an intermediate step to get the outcome the user wants. Moreover, they’re often used extensively, and so they are called many times per interaction with users or other machines. Therefore, the performance of the hash function can have a big impact on the overall performance of the program.
The one exception I’m conscious of here is in some applications of cryptographically secure hash functions. In these cases, if the hash is too easy to compute it becomes possible for a determined attacker to run a brute-force attack on inputs until they receive a hash collision.
The Sorting Hat fares poorly in this regard. Although it allegedly sorts Malfoy into Slytherin “instantaneously”, it takes a while for Harry Potter, and even longer for Neville Longbottom. For a worst-case scenario, Professor Minerva McGonagall was considered a Hats tall, where it took over 5 minutes to assign her a house.
Most people would be happy to wait even longer for a decision that is going to affect the rest of their life. However, as a component of our program, the Sorting Hat is not a good fit.
Even worse, the variance between these timings exposes an attack vector to our enemies, if we had any. Imagine if Voldemort was able to send children of his Death Eaters to Hogwarts. He could study the characteristics of the students that keep the Hat busy, and send multiple students like that in one batch. With enough students, he could potentially tie up the Sorting Ceremony for the whole year.
It is a requirement that the hash function assign a fairly even distribution from the inputs to the outputs. This is because in most applications, we want to make sure that the number of elements in each bucket are roughly balanced. The more balanced the buckets, the more efficiency we see overall. In general a larger bucket slows us down more than a similarly smaller bucket speeds us up.
Time for another tale of a historical quirk. At one point in the history of PHP,
the hash function that was used to store functions themselves was just
Put differently, the functions were bucketed by the length of their name. The
names programmers naturally choose for their functions are definitely not
uniformly distributed, especially if the programmers adhere to a consistent
naming convention. So several function names, such as the function
htmlspecialchars(), were intentionally abbreviated idiosyncratically to keep
the buckets balanced.
Judging by the movies, it seems that year after year the Houses stay roughly the same size. So it this regard, at least, the Sorting Hat seems to do a passable job.
We like having hash functions that can operate without loss of generality on a large number of inputs. This means that we can implement our hash function once, and then reap the benefit of applying it in different contexts. At the same time, there is a great value in having a specialized hash function that only works for one domain, if its other advantages in that domain beat others.
In this regard, I’m going to give the Sorting Hat a pretty high score. Of course, it is a hat, so it only accepts input in the form of human heads. However, given the complexity required to read the human mind, it stands out as state-of-the-art.
We don’t yet have a way to read a person’s personality directly by connecting an apparatus to their head. Although perhaps the data about your consumption habits available to online advertising platforms may be approaching that level of insight into the transparency of individuals. Nevertheless, in this domain, the Hat stands head and shoulders above the rest.
The Sorting Hat only sorts people into one of four houses. That is an output space of a paltry two bits. Of course, it is Magic, so perhaps it could scale up to an arbitrary number of buckets upon request. Or perhaps it would only do so at the behest of Godric Gryffindor and the other founders2.
Unless it is possible to extend the output range, the Hat would not be suitable for most real-world applications. For this category, the score of the hat is undetermined.
In conclusion, if you possess a Sorting Hat, I recommend against using it as a hash function. Although it meets the definition of a hash function, its behaviour and characteristics as described in the books make it a poor candidate for any real-world workloads.
However, you stand to gain a lot of money from personality and executive talent assessments if you make the hat the centrepiece of your program. Your one competitive advantage over the others in this crowded field will be that you can say your personality assessment is literally Magic, and your clients will have to take you at your word.
The lesson here is that you just need to find the job that fits the tool. This is a lesson that the Sorting Hat teaches by example.
Credit to this post by Eric Riese for kicking off this idea.
- When I was in college the MBTI was a common topic of discussion among my fellow students. Today, the criticism of the validity of some of their research gives me pause, but back then we were willing to accept it as a tool for exploration of our selves that was, at worst, a harmless digression. ↩︎
- Unlike another well known hat, it’s not clear if the Sorting Hat is Open Source ↩︎
Other articles you may like
- De Morgan's Flaw: Perform This Refactor Carefully A pattern that I've noticed often causes bugs
- Misapplying LazyRecursiveDefaultDict A cautionary tale of how I misapplied the wrong software tool to a problem, and what I've learned from it.
- From Maybe to Ensure A digression on naming functions, and ensuring they only do one thing