The spelling error made 200 billion times a day

Like anybody, I’ve made a fair share of spelling mistakes in my life. I was never particularly talented, having never gotten a spelling bee medal myself. Red underlines are simply a fact of life, with no real emotional reaction.

But fortunately, I’ve never made a spelling mistake so consequential that it’s been broadcast at least 200 billion times every day!

This is the story of the referer header, a part of every single web request, and why it’s spelled incorrectly. It’s fine, harmless, and incorrect.

What exactly is the problem?

I first started paying attention to such things several years ago, while setting up a log aggregation system. At the time it was Graylog, and the default log parser used ‘referer’ when naming the database fields where data extracted from server logs was stored.

Then, I tried to “get fancy” and include a custom field from our proxy logs. This required writing a brand new log parser, and extracting everything to the same field names so that dashboards and reports would continue to work. While I was writing this custom parser, my text editor helpfully suggested I fix the misspelling by replacing it with referrer. Then all hell broke loose.

I’m sure this has happened before, perhaps hundreds or thousands of times to other inexperienced tech workers, even ones proficient in the English language. And despite my own weak grasp on my native language, it still bothers me just a little when I have to parse an HTTP request or look through a web server log.

How did this happen?

The root of this issue is deep. In fact, the very protocol specification for HTTP itself contains this error.

The referer field, quite simply, is the location that you arrived from. It’s not strictly required, but it is a helpful way for a server to know how a request found the page it’s trying to load. It is also quite handy for analytics purposes. Without much effort, one can parse a server log and get a good idea of where the bulk of their inbound traffic comes from. There are also some famous examples of sites giving bogus information to visitors that came from certain undesirable sources.
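As a rough illustration of that kind of log analysis, here is a minimal Python sketch. It assumes the server writes the widely used "combined" access log layout, where the referer is the first quoted field after the status code and response size. The sample log lines, the function name, and the hosts in them are made up for the example.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed layout (combined log format):
#   ... "request line" status size "referer" "user agent"
LOG_PATTERN = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_referer_hosts(lines):
    """Count how many requests arrived from each referring host."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # line does not look like a combined-format entry
        referer = match.group("referer")
        if referer in ("", "-"):
            continue  # the client sent no referer at all
        host = urlsplit(referer).hostname
        if host:
            counts[host] += 1
    return counts


# Hypothetical log lines, invented for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "https://example.com/links" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /about HTTP/1.1" 200 512 "https://example.com/" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2023:13:57:12 +0000] "GET / HTTP/1.1" 200 2326 "-" "curl/8.0"',
    '198.51.100.8 - - [10/Oct/2023:13:58:44 +0000] "GET / HTTP/1.1" 200 2326 "https://news.example.org/item?id=1" "Mozilla/5.0"',
]
```

Calling count_referer_hosts(SAMPLE_LOG).most_common() lists the referring hosts from most to least frequent. Requests with a referer of "-" are skipped, since that is how this log layout records a missing header.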

The really strange thing is that the spelling mistake was actually caught in an email sent in March of 1995, yet the final version of the specification was published over a year later, in May of 1996, with the error intact!

> Has anyone else noticed that the HTTP header "Referer:" is spelled wrong?

That's okay, neither one (referer or referrer) is understood by "spell"
anyway.  I say we should just blame it on France.  ;-)

It is true that spellchecking software was much more primitive 30 years ago, and it’s therefore possible it wasn’t able to correctly identify this. However, I don’t know if I buy that, because it’s a relatively common word in the English language. Not common enough to be in a Thing Explainer book, but it’s not uncommon.

It’s likely it was simply a translation issue, as alluded to in the email. In French the word is “référer”, and with ASCII encoding not supporting accents I could understand the mixup. Until Unicode was popularized, it was common for languages with accents to have problems.

In a Usenet post made in September 2000, one of the co-authors, Phillip Hallam-Baker, remarked:

Its like when I did the referer field. I got nothing but grief for my choice
of spelling. I am now attempting to get the spelling corrected in the [Oxford English Dictionary]
since my spelling is used several billion times a minute more than theirs.

Regardless of the explanation, it’s surely here to stay.

Correctness: Less important than consistency

The problem is that if we were to try and fix this, it would probably take another 30 years or more.

For this to happen, every single HTTP library, of which there are several thousand, would have to be updated.

Every PC, laptop, cell phone, and smart dog collar would need firmware updates. Every industrial Programmable Logic Controller with a 30-year service lifecycle would have to be taken out of service and reprogrammed. Every aircraft and vehicle that utilizes a Linux operating system, including the ones on other planets, would need full compliance with the new protocol. Not to mention the havoc this would unleash on log parsing servers like the one I blew up back then!

For full backwards compatibility, which we should always strive for, it would be prudent to support both during the transition period. In that case, requests would have to look like this, at least for a few decades:

POST / HTTP/1.1
Host: example.com
Content-Type: text/html
Content-Length: 123
...
Referer: https://nbailey.ca
Referrer: https://nbailey.ca

And you can’t argue that it’s better like this, can you!?
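On the receiving end, every server and log parser would need its own fallback logic for that transition period. A minimal Python sketch of what that might look like follows, assuming the headers have already been parsed into a plain dictionary. The function name get_referer is made up, and the "Referrer" header is purely hypothetical: nothing in HTTP defines it.

```python
def get_referer(headers):
    """Return the referring URL from either spelling, or None if absent."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    # Prefer the spelling from the specification, then fall back to the
    # hypothetical "corrected" header imagined for the transition period.
    for name in ("referer", "referrer"):
        if name in normalized:
            return normalized[name]
    return None
```

For example, get_referer({"Referrer": "https://nbailey.ca"}) returns the URL even though only the hypothetical header is present, and get_referer({"Host": "example.com"}) returns None.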

The real crux of it is that consistency is much more important than correctness on its own. Whether or not it creates a red squiggle in Word is sort of irrelevant. What matters is its acceptance among other systems that interoperate using this protocol.

Ultimately, “correctness” is decided by consensus rather than by authority. Whether we like it or not, referer is here to stay.