Case Sensitivity is an Anti-Pattern

Yeah, I get it. C, C++, C#, Java, JavaScript, TypeScript, and many more are all case sensitive. So there must be a reason, and it must be a good one, right?

Hell no. But it’s here, so let’s discuss the ElePHanT iN tHe rooM.

Trying to call x.Foo() instead of x.foo() will not work no matter how much we may wish it would. Fortunately the tools have evolved enough to help us with auto completion and better error messages.
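To see that concretely, here is a minimal JavaScript sketch (the object and method names are made up for illustration):

```javascript
// A hypothetical object with a lowercase method name.
const x = { foo() { return 42; } };

console.log(x.foo());      // 42
console.log(typeof x.Foo); // "undefined" -- the capitalized name simply does not exist
```

Calling `x.Foo()` would throw a TypeError, no matter how obvious the intent is to a human reader.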

In times past, case sensitivity frequently caused hard-to-find problems. In fact, some ASP.NET XML configuration files treat True as false, because the code that checks the value is the equivalent of:
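Something like this sketch, shown in JavaScript for illustration (the actual framework code differs, and the function name is hypothetical):

```javascript
// Hypothetical sketch of a case-sensitive config check.
function isEnabled(optionValue) {
  // A case-sensitive comparison only matches the exact lowercase spelling.
  return optionValue === "true";
}

console.log(isEnabled("true")); // true
console.log(isEnabled("True")); // false -- silently treated as disabled
```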

Put option="True" by mistake in the XML file, and it will happily and willfully treat it as false.

Case sensitivity must be treated as a specification, and not a feature.

As a Specification

What does this mean? Put quite simply: it's here, so of course we have to deal with it. Case sensitivity simply adds more compliance costs, with no benefit.

As a Friggin’ Feature

Most junior coders just accept case sensitivity blindly. After all, everyone is doing it, so it must be right! It's nothing but coding groupthink. Unfortunately, many senior coders not only vehemently defend case sensitivity, but declare it a feature. This is called rationalization.

A long-established practice that has roots in C and was enshrined in C++ has spread to other languages. Fortunately this practice has largely become shunned, but it still exists; I've seen it even in Microsoft commercial code, as well as others'.

Capital Offenses

A few years ago I was flying and my boarding pass listed my name as chad Hower. When I got to the gate they prevented me from boarding the plane because the gate agent had learned C++ and my passport said Chad Hower. Of course this never happened, but humans don’t “read” capital letters, and introducing case into identifiers is like expanding the English language to have 52 letters instead of 26.

ReMembeR tHe ElePhAnt iN tHE rOOm? Would you mind if I wrote the whole article like that? Imagine if words were spelled like that, and in school we had to learn not only spelling but special combinations of capitalization. The notion is counterintuitive to decades of common usage and is a practice that absolutely should be eliminated from the coding world.

Visiting a C++ Zoo

“Hey mom, we went to a zoo today! We saw a rAt bigger than me!” said Joe. “No way. Rats are small; you are just telling stories,” his mom replied. “No really,” Joe persisted, “we saw a RaT swimming under water too!”

Many languages, such as Chinese, have no concept of capital letters. When their speakers learn English, they are not taught to treat Car differently than car, except when it is the first word of a sentence or a heading. Either way, it is still one car, not two different things.

In other languages, such as Russian, most fonts show far less variation between capital and lowercase characters than English, beyond size. That is to say, many capital letters are simply bigger versions of the same character. English has the same for some letters, but more of them differ: D is different from d, but Z is just a bigger z.

Take for example these two identifiers:

lo: int = 1;
Io: int = 1;

These are two distinct identifiers, not the same one. The fixed-width fonts typically used for coding make such characters more clearly distinguishable, but not always.
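The same pair in runnable form, in JavaScript purely for illustration:

```javascript
// "lo" is ell-oh; "Io" is capital-i-oh. Two different variables.
const lo = 1;
const Io = 100;

console.log(lo === Io); // false -- visually near-identical, semantically unrelated
```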

Even then, code is often shared via chat, email, or forums, where the font used can again become ambiguous. I have lost count of the number of times I've had to hunt down bugs in others' code that were the result of this. Especially so in languages like JavaScript, which, outside strict mode, silently turns an assignment to an unrecognized identifier into a brand-new variable. Every typo becomes a new variable. Thankfully we have TypeScript to help, but JavaScript should be called what it is: TypoScript.

For coders who learn English and then use it as it is commonly used in coding, Foo vs foo is just another obstacle that operates contrary to well-established norms. Even to native English speakers new to coding, the concept is confusing.

Foo Fighters

This is by far the worst of the case sensitivity problems. It's a feature, and we are going to use it!

Yes. You read that right. Foo foo = new Foo(); I still encounter this in C#, where it is allowed but violates the .NET development guidelines and makes the code practically impossible to use from VB.NET or other case-insensitive languages.
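The pattern in question, sketched here in JavaScript, where it is equally legal:

```javascript
// Variable and type differ only by the case of one letter.
class Foo {}
const foo = new Foo(); // legal, but one shift key away from a different identifier

console.log(foo instanceof Foo); // true
```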

From the Visual Basic documentation:

Other Components Cannot Access Your Variable

Visual Basic names are case-insensitive. If two names differ in alphabetic case only, the compiler interprets them as the same name. For example, it considers ABC and abc to refer to the same declared element.

However, the common language runtime (CLR) uses case-sensitive binding. Therefore, when you produce an assembly or a DLL and make it available to other assemblies, your names are no longer case-insensitive. For example, if you define a class with an element called ABC, and other assemblies make use of your class through the common language runtime, they must refer to the element as ABC. If you subsequently recompile your class and change the element’s name to abc, the other assemblies using your class can no longer access that element. Therefore, when you release an updated version of an assembly, you should not change the alphabetic case of any public elements.

For more information, see Common Language Runtime.

Correct Approach

To allow other components to access your variables, treat their names as if they were case-sensitive. When you are testing your class or module, make sure other assemblies are binding to the variables you expect them to. Once you have published a component, do not make any modifications to existing variable names, including changing their cases.

Despite this, even in Microsoft-written code I still too frequently encounter the foo Foo pattern.

Think I'm making this all up about C++ coders being religious? Someone on StackOverflow asked why C# is case sensitive; let's take a look at what got upvoted to the top answer.

I especially like the sentiment that having to use actually distinct identifiers, rather than relying on case sensitivity, is a "get around." Some even go so far as to say that Kernighan & Ritchie (creators of C) built it in as a feature. Yet none of them has ever been able to cite any support for this. Sure, Kernighan & Ritchie specified C as case sensitive, for reasons I will discuss below. I contend it was never added as a feature, but rather was a practicality of the times.

Java has the same issue, and worse yet, there it appears to be an actual guideline.

Scott Hanselman even said this in 2005:

I spent an hour today debugging a possible problem only to notice that “SignOn” != “Signon”

If I had a nickel for everytime Case-Sensitivity or Case-Insensitivity bit me, I’d have like seven or eight bucks. Seriously.

How We Got Here

C was invented in 1972. There were no personal computers in 1972, and even million-dollar computers had less memory and CPU than a $1 watch you can buy at a discount store today. Back then every single CPU operation mattered and had tangible effects on compiling code. Case-insensitive comparisons took extra CPU time. Multiply this by thousands of comparisons or more, and in those days it made a very big difference in already glacial compile times.
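To see why, here is a sketch of the extra work a case-insensitive comparison requires (JavaScript here; the per-character cost is the point, not the language):

```javascript
// Case-sensitive: one raw comparison per character.
// Case-insensitive: every character must be normalized first, which is
// exactly the extra work 1970s compilers could not afford.
function equalsIgnoreCase(a, b) {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i].toLowerCase() !== b[i].toLowerCase()) return false;
  }
  return true;
}

console.log("Foo" === "foo");                // false
console.log(equalsIgnoreCase("Foo", "foo")); // true
```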

Computer from 1950

Even by the mid 1980s, compiling a C program could take anywhere from minutes to tens of minutes. Just imagine what it was like in 1972. In 1972 this was cutting edge technology.

On 24 August 1972, Intel released the 200 kHz 8008, an 8-bit version of the 4004.

The speed was 300,000 instructions per second, and it could address 16 KB of memory. It contained 3,500 transistors based on 10-micron technology. It was the first processor able to recognize all characters of the alphabet (letters and numbers).

300,000 CPU instructions per second, maxing out at 16 KB of RAM! That computer could barely hold the first few paragraphs of this article, and only if it was fully loaded with RAM, which many were not.

That isn't 3 GHz, and not even 3 MHz. It's 300 kHz! That is 0.3 MHz, or 0.0003 GHz. You bet that in 1972 K&R knew what they were doing and made the right decision at the time to make C case sensitive. It was simply a matter of practicality, not some forward-looking feature for C++ coders to spout off about 45+ years later. C++ adopted case sensitivity because C++ can compile most C code, and compatibility mattered.

When Java, C#, JavaScript, and others came along in the 1990s and early 2000s, their developers just carried it over. Why? Because they were hoping to bring along the legions of C and C++ developers, and now we are still suffering the consequences. It's the same reason those languages also have C's bug-prone for statement instead of a simple Pascal- or Basic-style for loop.
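The contrast can be sketched in JavaScript, where foreach takes the form of for...of:

```javascript
const items = ["a", "b", "c"];

// C-style for: three hand-written clauses, each one a chance for an
// off-by-one or wrong-comparison bug.
const byIndex = [];
for (let i = 0; i < items.length; i++) {
  byIndex.push(items[i]);
}

// foreach-style (for...of): no index bookkeeping to get wrong.
const byValue = [];
for (const item of items) {
  byValue.push(item);
}

console.log(byIndex.join()); // "a,b,c"
console.log(byValue.join()); // "a,b,c"
```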

C++ coders often defend their for loop, yet amazingly, when they move to C# or other languages they all follow the advice to use foreach instead of for wherever possible. Fortunately foreach is usable in 99% of cases and has helped marginalize the bug-prone C-style for loop.

It's long past time to stop appeasing C++ coders who are stuck in the 1970s. Let's move on. There are plenty of surplus computing parts from the 1980s on eBay if these guys are nostalgic.

Real Effects

It is garbage like this that explains why, even though browsers treat the HTTP methods PUT, DELETE, POST, and GET as case insensitive, PATCH is case sensitive even in the latest Chrome and Firefox releases. This is just one example of many. I alone lost a full work day to this, and given the results in Google, I'm not the only one. How much productivity are we losing to this garbage?

Coding in the Desert

here We are at another “i Can’t believe I have to debate this $**t.” despite Even Microsoft and .NET adopting Pascal Casing, others Are still holding to camelCasing As a feature.

camelCasing (or is it CamelCasing, since it's the first word in the sentence?) is much more common than foo Foo (can we just call it fufu?) and is the predominant continued practice in many languages today. Fortunately its problems are far smaller than fufu's, but nonetheless it has detrimental effects, especially when combined with the prevailing case sensitivity of languages today. The bigger problem, though, is that it really has no beneficial effects and is simply used for the same reason as fufu: "We've always done it this way."

How We Got Here

fufu and camelCase are essentially the same problem; or at least, camelCasing seems to have been caused by fufu. Wikipedia says the actual origin is unknown, but it seems logical and obvious to assume how it came about.

Coders had a variable named foo. Variable names no longer had to be short and programs became far bigger, so more distinctive names were needed. fooforfred was too difficult to read.

foo-for-fred was also tried, but most languages did not allow - to be used in identifiers. Next foo_for_fred was tried, but many languages did not support _, and _ requires the shift key, so it was what I call shifty. That is, it required a lot of use of the shift key and was inconvenient to type.

It could have been adjusted naturally to FooForFred, but that "offended the senses" of coders who had it programmed into their brains that variables must start with a lowercase letter. So instead of doing things right, they did what was easy, and we got fooForFred.

“I’ve never used a case sensitive language”

Are you saying you have never written HTML or SQL, or sent an e-mail? In addition, VB, VB.NET, Pascal, and several others are case insensitive.

Some databases offer case-sensitive options, but DBAs rarely ever turn that option on. Guess why?

Imagine if HTML were case sensitive. It is already such a joy to code, just imagine how much more time we could waste.

Is Google case sensitive? If someone searches for “pittsburgh”, should “Pittsburgh” be excluded?
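The behavior searchers expect is a case-insensitive match, sketched here (the function name is made up):

```javascript
// Fold both sides to one case before comparing, so "pittsburgh"
// still finds "Pittsburgh".
function matchesQuery(query, text) {
  return text.toLowerCase().includes(query.toLowerCase());
}

console.log(matchesQuery("pittsburgh", "Visit Pittsburgh today")); // true
```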

Email addresses, according to the RFC, can be case sensitive, yet nearly every mail server treats them as case insensitive. Shall we guess again as to why?
