Thursday, January 10, 2013

Why I like the ternary operator

My friend Mike (over at CodeAwesome) and I have a long-standing, good-natured debate about the nature of the ternary operator.

For the uninitiated, the ternary operator is basically syntactic sugar in the C family of languages (and Java, since Java is Just A Fancy C++ VM) that lets you write a simple if/else inline, as an expression.

Basically:

var foo;
if (getGodlyGlobal(me.margaret)) {
   foo = "baz";
} else {
   foo = "wtf";
}

Becomes:

var foo = getGodlyGlobal(me.margaret) ? "baz" : "wtf";

To me, this feels concise and elegant. But to Mike, it seems to go dangerously in the direction of Perl line noise. I'm sympathetic.

Robert Martin brings up the excellent point, in his tour de force Clean Code, that we should Avoid Mental Mapping. He uses this concept specifically with regard to variable/function nomenclature, but it has larger implications in software engineering as well. This concept jibes well with the research literature on Cognitive Load Theory in psychology.

Essentially, the more "noise" we create for our brains by having to map compressed pieces of information to larger meaning, the more difficult it is to understand what's going on. This may be why, to many, elegant models for statistics/quantum dynamics just look like "alphabet soup." Too much information compressed into a tiny space requires a lot of outside context in order to make heads or tails of.

Here, I have to know the syntax of the ternary operator, specifically that I replace the if with "?" and the else with ":". This mapping doesn't feel too complex, but any time I have to stop and stare quizzically at a piece of code, I have an opportunity to misunderstand it or to waste time I could be using to develop new features.

Some languages, such as CoffeeScript, elegantly handle this issue by allowing inline if/else in the evaluation of expressions. That seems reasonable. It obviously biases a programming language (with roots based in math) towards English, but then again, it's not a lot of English. If we embrace Donald Knuth's Literate Programming and try to make our code as expressive as possible, the presence of words is helpful.

Typically I use the ternary operator in a situation like the one above, where I just set a variable. But I recently ran into another interesting use case, which is probably in line with the thoughts of the Anti-If campaign. Consider the following.

https://gist.github.com/4507897#file-buildcontext-java

This method doesn't look too complicated. It's only 13 lines, though we can clearly see that despite being named "buildContext" it's only doing anything related to a context in the last line (another method call). The rest is actually reading values from a RequestContext. A problem with naming? Sure. But also take a look at lines 40-47.

My first instinct, on looking at this, was to refactor that into a function. It's a fairly self-contained block that's essentially assigning a variable. Looking at it further, we can see both branches are setting the same attribute on the request, with a different value in each branch.

Aha. A violation of DRY. There's one argument.
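
For anyone reading without the gist open, the shape of the problem is roughly the sketch below. To be clear, this is not the gist's code: the class name, the "recognized" flag, the attribute values, and the plain Map standing in for the request are all assumptions I made up for illustration. Only the "pageIdent" attribute name is the real one (it comes up again further down).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the gist: a Map plays the role of the request.
final class BranchingSketch {

    // Both branches set the same attribute; only the value differs.
    static Map<String, Object> buildContext(boolean recognized) {
        Map<String, Object> request = new HashMap<>();
        if (recognized) {
            request.put("pageIdent", "known-page");   // assumed value
        } else {
            request.put("pageIdent", "unknown-page"); // assumed value
        }
        return request;
    }

    public static void main(String[] args) {
        System.out.println(buildContext(true));
        System.out.println(buildContext(false));
    }
}
```

The call to put() is written twice, and nothing in the code says that it must stay in both branches.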

More interestingly, there's the cognitive notion of branching. Because I have a branch in logic here, I have to look in both places to determine what will happen. Ultimately, the same thing will happen, just slightly differently. But it's easy to imagine this changing, isn't it?

What if a 2 am programming emergency were to come through on a recognition problem? Maybe, if something is recognized, it should be logged. Three months down the line, someone decides unrecognized pieces should be logged as well, but recognized pieces should calculate a value. In six months, the whole RequestContext is replaced with some other logic.

In that kind of code churn, it's very possible to misplace a line or two. Especially something like setting the "pageIdent" attribute in one branch, but not in the other. Perhaps the other branch was refactored into a method, then the method was tweaked, and somewhere along the way the line got lost.
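
Here's a sketch of how that might end up looking after a few rounds of edits. Again, everything in it is hypothetical (the class, the "recognized" flag, the values, the in-memory LOG list, the Map standing in for the request); it's only meant to show how quietly the line can disappear.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the branch after some churn.
final class ChurnSketch {

    // Stand-in for a real logger, so the sketch stays self-contained.
    static final List<String> LOG = new ArrayList<>();

    static Map<String, Object> buildContext(boolean recognized) {
        Map<String, Object> request = new HashMap<>();
        if (recognized) {
            LOG.add("recognized");
            request.put("pageIdent", "known-page"); // assumed value
        } else {
            // This branch was extracted, tweaked, and inlined again.
            // Somewhere along the way the put("pageIdent", ...) got lost.
            LOG.add("unrecognized");
        }
        return request;
    }
}
```

It compiles, it runs, and nothing complains. The unrecognized case just silently stops carrying a "pageIdent".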

Branches in logic, by definition, attract Change. They invite bifurcation and inevitably confusion. One could argue that this problem could easily be removed by moving the request.setAttribute() outside of the branches...but that's the point. Exactly the point!

In programming, since Programming is Life, it's easy for things to fall into the wrong scope. Some naive developer placed duplication in those scopes without understanding that an implicit invariant was in place: no matter which branch of the if is taken, the "pageIdent" attribute should be set on the request. It can be easy to lose the forest for the trees, especially if you're not following Uncle Bob's First Rule of Functions and have hundreds of lines and multiple nested blocks.  

Blocks are cognitive magnets for confusion. 

Now consider this snippet:

https://gist.github.com/4507897#file-improvedbuildcontext-java

Straightforward. One can argue I cheated a little bit by refactoring the RequestContext parsing and validation into a superclass method, but even if we add those lines back in, we see we've eliminated a branch in logic. Now it's perfectly clear: we're setting parameters on the request, and one particular parameter has two possible values. If we want to change what happens in each case, the logical thing to do is to extract that line into a method and expand it into its full if form. But in so doing, we've created an isolation point where such change can be processed easily without influencing the surrounding logic. By eliminating a branch, we've manifested our invariant more clearly.
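
As a rough sketch of both steps (not the gist's code; the class, the "recognized" flag, the values, the pageIdentFor helper, and the Map standing in for the request are all my own placeholders), the ternary form and the later extraction might look like this:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a Map plays the role of the request.
final class TernarySketch {

    // One unconditional put(): "pageIdent is always set" is now visible.
    static Map<String, Object> buildContext(boolean recognized) {
        Map<String, Object> request = new HashMap<>();
        request.put("pageIdent", recognized ? "known-page" : "unknown-page");
        return request;
    }

    // If the two cases grow, the choice moves into its own method
    // and the caller still sets the attribute exactly once.
    static Map<String, Object> buildContextExtracted(boolean recognized) {
        Map<String, Object> request = new HashMap<>();
        request.put("pageIdent", pageIdentFor(recognized));
        return request;
    }

    // The full if form lives here, isolated from the surrounding logic.
    static String pageIdentFor(boolean recognized) {
        if (recognized) {
            return "known-page";   // assumed value
        } else {
            return "unknown-page"; // assumed value
        }
    }
}
```

Either way, the branch can only decide the value. It can no longer decide whether the attribute gets set at all.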