Tuesday, March 10, 2015

Good OOP == Good Dependency Management

Object-Oriented Programming is all about dependency management. Dependency management is crucial for building evolving systems that survive and adapt to contextual changes.

It's a paradigm that scales. Even the new "container"-based virtualization technologies such as Docker basically treat the Operating System as an Object. Even JavaScript's "Revealing Module Pattern" is just making a namespace of bound functions...at some point, when you're passing the same 3 parameters to 5 functions, you start thinking "wait. What if I just passed those parameters to 1 function that called those 5?" and you're on your way to OOP.

Good OOP is about seeing the relationships between things and figuring out how things actually come together. Your software is a jigsaw puzzle, a Lego set. The principles of good software construction all come down to good dependency management.

When you see the right relationships, you minimize the dependencies between unrelated things, form logical partitions, and find seams in the object graph where you can completely replace sub-graphs without any loss of generality.

When you don't, you create bloated, bureaucratic, hard to read and understand systems that don't work. Hopefully at least they fail in obvious ways. It's far more difficult to fix when failures aren't obvious.

Thursday, January 10, 2013

Why I like the ternary operator

My friend Mike (over at CodeAwesome) and I have a long standing good-natured debate about the nature of the ternary operator.

For the uninitiated, the ternary operator is basically syntactic sugar in the C family of languages (and Java, since Java is Just A Fancy C++ VM) that lets you inline if statements.

Basically:

var foo;
if(getGodlyGlobal(me.margaret)) {
   foo = "baz";
} else {
   foo = "wtf";
}

Becomes:

var foo = getGodlyGlobal(me.margaret) ? "baz" : "wtf";

To me, this feels concise and elegant. But to Mike, it seems to go dangerously in the direction of Perl line noise. I'm sympathetic.

Robert Martin brings up the excellent point, in his tour de force Clean Code, that we should Avoid Mental Mapping. He uses this concept specifically with regard to variable/function nomenclature, but it has larger implications in software engineering as well. This concept jibes well with the research literature on Cognitive Load Theory in psychology.

Essentially, the more "noise" we create for our brains by having to map compressed pieces of information to larger meaning, the more difficult it is to understand what's going on. This may be why, to many, elegant models for statistics/quantum dynamics just look like "alphabet soup." Too much information compressed into a tiny space requires a lot of outside context in order to make heads or tails of.

Here, I have to know the syntax of the ternary operator, specifically that I replace the if with "?" and the else with ":". This mapping doesn't feel too complex, but any time I have to stop and stare quizzically at a piece of code, I have an opportunity to misunderstand it, or to waste time I could be using to develop new features.

Some languages, such as CoffeeScript, elegantly handle this issue by allowing inline if/else in the evaluation of expressions. That seems reasonable. It obviously biases a programming language (with roots based in math) towards English, but then again, it's not a lot of English. If we embrace Donald Knuth's Literate Programming and try to make our code as expressive as possible, the presence of words is helpful.

Typically I use the ternary operator in a situation like the preceding, where I would just set a variable. But I just experienced another interesting use case, which is probably in line with the thoughts of the Anti-If campaign.  Consider the following.

https://gist.github.com/4507897#file-buildcontext-java

This method doesn't look too complicated. It's only 13 lines, though we can clearly see that despite being named "buildContext," it's only doing anything context-related in the last line (another method call). The rest is actually reading values from a RequestContext. A problem with naming? Sure. But also take a look at lines 40-47.

My first instinct, on looking at this, was to refactor that into a function. It's a fairly self-contained closure that's essentially assigning a variable. Looking at it further, we can see both blocks are setting the same attribute on the request, with a different value in each branch.

Aha. A violation of DRY. There's one argument.

More interestingly, there's the cognitive notion of branching. Because I have a branch in logic here, I have to look both places to determine what will happen. Ultimately, the same thing will happen, just slightly different. But it's easy to imagine this changing, isn't it?

What if a 2 am programming emergency were to come through on a recognition problem? Maybe, if something is recognized, it should be logged. 3 months down the line, someone decides unrecognized pieces should be logged as well, but recognized pieces should calculate a value. In 6 months, the whole RequestContext is replaced with some other logic.

In that kind of code churn, it's very possible to misplace a line or two. Especially something like setting the "pageIdent" attribute in one branch, but not in the other. Perhaps the other branch was refactored into a method, then the method was tweaked, and somewhere along the way the line got lost.

Branches in logic, by definition, attract Change. They invite bifurcation and inevitably confusion. One could argue that this problem could easily be removed by moving the request.setAttribute() outside of the branches...but that's the point. Exactly the point!

In programming, since Programming is Life, it's easy for things to fall into the wrong scope. Some naive developer placed duplication in those scopes without understanding that an implicit invariant was in place: no matter which branch of the if is taken, the "pageIdent" attribute should be set on the request. It can be easy to lose the forest for the trees, especially if you're not following Uncle Bob's First Rule of Functions and have hundreds of lines and multiple nested blocks.  

Blocks are cognitive magnets for confusion. 

Now consider this snippet:

https://gist.github.com/4507897#file-improvedbuildcontext-java

Straightforward. One can argue I cheated a little bit by refactoring the RequestContext parsing and validation into a superclass method, but even if we add those lines back in, we see we've eliminated a branch in logic. Now it's perfectly clear: we're setting parameters on the request, and one particular parameter has two possible values. If we want to change what happens in each branch, the logical thing to do is to extract that line into a method and expand it back into its full if form. But in so doing, we've created an isolation point where such change can be made easily without influencing the surrounding logic. By eliminating a branch, we've manifested our invariant more clearly.
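Since the gists aren't reproduced inline, here's a hypothetical Java sketch of the before-and-after. The names -- buildContext, pageIdent, isRecognized() -- are stand-ins, with a plain Map in place of the servlet request:

```java
import java.util.HashMap;
import java.util.Map;

public class TernaryInvariant {
    static final String DEFAULT_IDENT = "unknown";

    // Stand-ins for the recognition logic the post alludes to.
    static boolean isRecognized(String path) {
        return path.startsWith("/app/");
    }

    static String recognizedIdent(String path) {
        return path.substring("/app/".length());
    }

    static void buildContext(Map<String, String> request, String path) {
        // Before: both branches duplicated the same put() call:
        //   if (isRecognized(path)) { request.put("pageIdent", recognizedIdent(path)); }
        //   else                    { request.put("pageIdent", DEFAULT_IDENT); }
        // After: one call, so the invariant ("pageIdent" is always set) is manifest.
        request.put("pageIdent",
                isRecognized(path) ? recognizedIdent(path) : DEFAULT_IDENT);
    }

    public static void main(String[] args) {
        Map<String, String> request = new HashMap<>();
        buildContext(request, "/app/home");
        System.out.println(request.get("pageIdent")); // prints "home"
        buildContext(request, "/other");
        System.out.println(request.get("pageIdent")); // prints "unknown"
    }
}
```

If either branch later needs to log or compute something extra, that logic gets its own method and the setAttribute-style call stays put, outside the branch.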

Monday, December 31, 2012

Questions from the Mailbag: On Static Imports

"Something I noticed today while I was looking over the Mockito API...they suggest importing the library statically. So I did, and I noticed JUnit was imported statically as well in Eclipse, but other classes/libraries in my Unit Test class were not. What I'm asking then is why do the static import, why does it seem to be important in testing, and when might one want to do it otherwise, if at all?"

Excellent question!

Static imports are basically used in Java to allow C/C++ style programming. I fucking hate them. Essentially, they are a feature that was added because programmers are lazy and hate to type. Rather than referencing the class a static method is pulled from, with a static import you can save yourself a few characters.

The argument against them is that they pollute your namespace and are not explicit.

Polluting your namespace means that, if you have a method with the same name as a statically imported method, the compiler will get confused. This may not be a problem, depending on the class in question. For example, org.junit.Assert's assertEquals() is unlikely to appear in many places. But if you were writing the class OpinionatedCalculator, you may have a problem.

The bigger problem here is that namespace collision is obviously more likely with classes that have more methods. If I statically import something like org.apache.commons.lang.StringUtils, I'm more vulnerable to running into problems with a common word like "contains()."

One could argue this is because StringUtils is too large: in theory a static import on a small class, with relatively few methods (say 3), is "relatively safe."

I say fuck that theory. It's not that much more work to type Assert.assertEquals() than assertEquals(), and it saves me from having to look up at the top of the file to figure out where some magical global method comes from. In that sense, I say it's much more "explicit": the code is obvious the second you look at it.
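To make the trade-off concrete, here's a small sketch reusing the OpinionatedCalculator name from above (the biggest() method is invented for illustration):

```java
import static java.lang.Math.max;

public class OpinionatedCalculator {
    static int biggest(int a, int b) {
        // Statically imported max() reads like a local method -- you have to
        // scroll to the import list to learn it's java.lang.Math.max.
        // The qualified form, Math.max(a, b), costs five more characters and
        // zero trips to the top of the file.
        // And if this class ever grows its own static max(), the local one
        // silently wins and the import becomes dead weight -- the
        // "namespace pollution" problem in a nutshell.
        return max(a, b);
    }

    public static void main(String[] args) {
        System.out.println(biggest(2, 7)); // prints 7
    }
}
```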

You know I like my code like I like my porn:
Explicit is Better Than Implicit. 
http://www.python.org/dev/peps/pep-0020/
That said, if you browse through enough of my code on GitHub, you'll find instances where I statically import JUnit. No one's perfect. Sometimes I flip on the opinion based on how lazy I feel like being that day, because I have the option to. Hence why I say it's a bad language feature: I shouldn't have the option to.

Of course, opponents of that claim would say the code is longer, and hence uglier, with extra noise words. If the class name is unnecessary and it's clear what the static import refers to, favor brevity. But if there's one thing you should know about me by now, it's that I like to go on...and on...and on.

Fuck static imports.

Wednesday, November 28, 2012

A case study in expressive (domain specific language) programming.

I was asked to write a cache for an HTTP request. My first instinct was to add a static instance variable and be done with it. Then I thought about staleness. I could've added another field for a Date and turned my class into an unmaintainable mess trying to manage freshness...instead, I factored it out.

I noticed, as I was writing the new class, that I wasn't in a dry, soulless programming mood. I could've described everything in terms of caches, maps, requests, buffers, etc...instead, I decided to make it like telling a story: the PreviousJourneys of Corey the HttpCourier.

I noticed that by using expressive names like rememberVisit() instead of putDataInCache(), and expanding out expressions like hasVisited(location) && tripTakenRecently() into a method--visitedRecently(location)-- I'm able to write a fairly comprehensible story.

It could use some work, of course. It's not perfectly consistent in names. The logic is 3 am logic. Technically it's probably better to make a Journey that has place, experience, and date, as it makes more sense to say "If I haven't visited X in 7 days I probably wanna see what's new there" rather than "I haven't been anywhere in 7 days so I'm going to X, damn it!"

But the point of the exercise is that I think most people could follow it.
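The original class isn't shown here, but a minimal reconstruction of the idea might look like this. HttpCourier, rememberVisit(), and visitedRecently() follow the names described above; the fields and the freshness window are assumptions:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

public class HttpCourier {
    private final Map<String, String> previousJourneys = new HashMap<>();
    private final Map<String, Instant> tripDates = new HashMap<>();
    private final Duration freshness;

    public HttpCourier(Duration freshness) {
        this.freshness = freshness;
    }

    // rememberVisit() instead of putDataInCache(): the story stays readable.
    public void rememberVisit(String location, String experience) {
        previousJourneys.put(location, experience);
        tripDates.put(location, Instant.now());
    }

    // hasVisited(location) && tripTakenRecently(), folded into one name.
    public boolean visitedRecently(String location) {
        Instant trip = tripDates.get(location);
        return trip != null
                && Duration.between(trip, Instant.now()).compareTo(freshness) < 0;
    }

    public String recallVisit(String location) {
        return previousJourneys.get(location);
    }
}
```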

I think this was the point of the Abbot Method, CRC cards, Use Cases, and all of OOP. It's what BDD is driving at. If you're writing code powerfully, it should tell the story of the system in a way that's obvious to most people who have some idea what's going on. Knuth's Literate Programming is probably quite far along the spectrum to Codevana...maybe even farther than LISP. *gasp*

Wednesday, September 5, 2012

DON'T WRITE CODE LIKE THIS OR A POOH BEAR WILL PUNCH YOU.

Think about Winnie the Pooh.


Winnie is the sweetest, kindest, lovable ol' Pooh Bear you ever did see.

Winnie the Pooh makes me happy.

You know what doesn't make me happy?

Code that looks like this.

That code looks pretty complex, to new eyes. Especially if you're looking at it after trying to figure out why someone else's library, which you've cloned, doesn't pass all of its regression tests. You might look at that and have a few moments of doubt. Concern. Do I understand how this library is used?

Then you stare at it for a minute. You realize that the intent of this Miracle of Indirectness is to find if a key is contained in a collection of registered services. That's not hard.

One can easily think of doing that in a way that makes sense. You basically have a Map between Service and Configuration. If your test is supposed to modify, or not modify, one of these values, you can easily ask.

Instead, this test prefers to go through the list once, in place, instead of building the map. Ah. Efficiency. How delightful. Except (pun-ish-ment) that the loop uses JUnit's assertEquals() in the body of the try block, which throws a ComparisonFailure when the condition is not true.

This means that we're using Exceptions for flow control. DON'T USE EXCEPTIONS FOR FLOW CONTROL!!!

JVM implementers expect exceptions to be set up often but thrown rarely, so the throw path is usually left quite inefficient. Throwing an exception is one of the most expensive operations in Java, surpassing even new. On the other hand, don't forget the first rule of optimization (RulesOfOptimization).
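The original test isn't shown, so here's an illustrative reconstruction of the pattern: a thrown error abused as a "no match" signal inside a loop, next to the boring lookup that says what it means. A plain AssertionError stands in for JUnit's ComparisonFailure so the sketch runs without JUnit on the classpath:

```java
import java.util.List;
import java.util.Map;

public class FlowControl {
    // The anti-pattern: the throw *is* the else-branch.
    static boolean containsByException(List<String> services, String key) {
        for (String s : services) {
            try {
                if (!s.equals(key)) throw new AssertionError("no match");
                return true; // reached only when s.equals(key)
            } catch (AssertionError ignored) {
                // swallow the error and keep looping
            }
        }
        return false;
    }

    // The boring, correct version: just ask the collection.
    static boolean containsByLookup(Map<String, String> services, String key) {
        return services.containsKey(key);
    }
}
```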
So, in trying to be efficient, we've completely destroyed the point of efficiency? Well, at least the bottom of the method is self-documenting.

I know, I know. I shouldn't care so much about what I do. "It's all bullshit anyway", right? If it works, yay! If not, "make it work, whatever it takes." It's not in fashion to try to be, ya know...Elegant. Precise. Crisp.

Whatever. It's not my library. I don't have to care. I just have to try to make a contribution that will fix my problem, and if it inadvertently breaks half the world, e.g. hundreds of other packages that depend on this one....FUCK EM! Right?

Write tests for your new code. Don't worry about the horrible legacy you're trodding upon...But then I have to look at code like this, and just ignore it.

WTF bro? That's not even a test. If you're going to test for the persistence of a dynamically generated file in a transient environment, that's hard. But probably doable. It involves actually writing a condition to test.
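A hypothetical sketch of the difference (the gist isn't reproduced here, so both "tests" are stand-ins, with a boolean standing in for the real persistence check):

```java
public class VacuousTest {
    // Looks like a test in the report, asserts nothing, can never fail.
    static void testFileWasPersisted_vacuous(boolean fileExists) {
        // file generated... and then we just go home.
    }

    // An honest version: actually write a condition and check it.
    static void testFileWasPersisted(boolean fileExists) {
        if (!fileExists) {
            throw new AssertionError("generated file was not persisted");
        }
    }
}
```

The vacuous version is worse than fail() because fail() at least tells the truth: this case isn't covered yet.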

But writing tests like this is worse than just leaving fail(). This test, naively, just by looking at a report, looks like it's doing something intelligent. Then you look at it. You know what this causes?



He's coming for your face, honey. He's.coming.for.your.face.

Sunday, August 12, 2012

The Parable of Little Jimmy


Gather round, children. I’d like to tell you a story. A story about a boy named Jimmy.

Meet Jimmy

Jimmy was a sweet young boy from a quiet Midwestern town. Jimmy grew up loving sports, playing his guitar, and volunteering at the old folks’ home. Jimmy met Sally, the love of his life, in high school. When he dropped her off and kissed her good night after taking her to prom, he knew they were Promised. They’d be together forever. He proposed, after graduation, and they were wed. Sally was 6 months pregnant when Jimmy had to go.
You see, Jimmy loved his country. He believed in freedom, duty, honor, and helping people. So Jimmy enlisted to serve his country and protect those he loved. Mom and Dad were so proud of him. Sally was worried, but proud of him too. 19 years old, but already he was such a good man.

Jimmy Goes to War

Jimmy joined the Air Force. He wanted to be a helicopter pilot. To be called on to support the cavalry and save his injured brothers. He worked really hard, suffered through boot camp and daily struggles with humility and determination. He made pilot, and everyone was thrilled.
One day, Jimmy was flying routine reconnaissance over the hills of Afghanistan when the unthinkable happened. A terrorist with an RPG came out of the woodwork and fired at his chopper. His sensors didn’t pick it up until the last minute. The rocket struck, and he went into a tailspin. In shock and trying to recover, he managed to crash the chopper into the soft side of a hill and escape with only a broken leg and some bruises. He dragged himself out of the wreckage and found a nearby cave for shelter, praying for his comrades to come find him.

Save Jimmy!

He didn’t think it would take long. Jimmy’s chopper and combat suit were equipped with Next Generation Defense Technology. The technology included location-aware services. In case of an emergency, his technology should report issues to central command. They would come for him soon. His life would be saved.
Unfortunately, what Jimmy didn’t know was that his technology was marginal from the very beginning. The devices that took the location readings were sometimes flaky, though good 90% of the time. This time, a sensor failed. But it didn’t fail by just breaking…an electrical short caused its readings to overflow. It kept collecting numbers, but the numbers were just noise.

No Problem…right?

The programmers who had developed the location based technology considered themselves pretty smart and capable people. They had a job to do, and it was their job to get it done as quickly and efficiently as possible. So when the developer who was working on the geolocation piece found an article by IBM describing the characteristics of such a system, his task seemed easy. http://www.ibm.com/developerworks/java/library/j-coordconvert/  
All he had to do was just copy paste the code, write some tests for the common scenarios, and he was done. After all, it was a blog post on IBM teaching about the stuff. It had to be right. Right?

Not So Fast…

But people forget that tutorial code is often written as a proof of concept. Most authors themselves adamantly state that they are showing an idea, but the code is “not production worthy.” But how do you determine “production worthy?” Besides, this is just some small utility feature in a much larger system. So nobody thought twice about the LatZones class https://gist.github.com/3251291. Copypasta, tests written for common workflows that pass, problem solved.
The system was used for a long time by a lot of people. The Air Force trusted it with soldiers’ lives. During the acceptance testing phase there’d been some political issues, but officers were able to negotiate an adaptive solution and get it through, albeit quickly. The system was in place for a while, and seemed to work. After a while, it became a given, and nobody gave pieces like LatZones much thought.
So when Jimmy’s sensor kept feeding negative infinity into the system, and the automated rescue dispatcher that had been introduced into the system sent soldiers to Zone A, nobody questioned it. They combed through the area, but found nothing. Everyone was confused; nobody could figure out what happened.

The End of Jimmy

Jimmy, meanwhile, ran out of survival rations. Desperate, hungry, and in pain, he tried to figure out where he was and make his way back to civilization. When he finally despaired that they weren’t coming for him, he set off on his own. Exposed and in the open, he was caught by the terrorist cell that shot down his chopper. They tortured him, and eventually killed him.

Who’s to Blame?

The tutorial author, for not writing code that someone else could just copy and paste from a blog post? The programmer, for trying to do a good job under a tight deadline? The ones who rushed the system through acceptance? Was it just a bit of bad luck, something we can do nothing about?


No

Jimmy didn’t have to die. Jimmy didn’t have to spend his excruciating last moments in the depths of interminable pain, thinking only of Sally and crying out to God for release. Jimmy could’ve lived.

How It Might Have Been

A software engineer tasked with developing a geolocation system may have used the IBM article as a reference point. An informative source for thinking about the concerns such a system should be able to address. He might’ve started by copy/pasting pieces, but he would’ve stopped to think about the different possible control flows of execution. What the potential values of parameters could be, and should be. He might’ve started writing unit tests for these combinations, discovered the weirdness of this calculation, and fixed it.
Alternately, he might’ve come up with an entirely different design that met the same constraints, without messy double calculations in tightly bounded ranges. When getLatZones() got negative infinity, it might have thrown an exception. The system might’ve tracked such exceptions, and if it noticed they were thrown repeatedly within a time interval, it might’ve fallen back to a different strategy, perhaps triangulating based on historical data.
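As a hedged sketch of that defensive check -- getLatZone() and the crude 9-degree binning here are illustrative stand-ins, not the real LatZones code or the actual UTM band scheme:

```java
public class LatZones {
    // Valid latitudes run from -90 to 90 degrees.
    static char getLatZone(double latitude) {
        if (Double.isNaN(latitude) || Double.isInfinite(latitude)
                || latitude < -90.0 || latitude > 90.0) {
            // Refuse garbage instead of quietly mapping it to Zone A.
            throw new IllegalArgumentException("bad latitude reading: " + latitude);
        }
        // Illustrative binning only -- the real UTM letter scheme uses
        // 8-degree bands from -80 to 84 and skips I and O.
        char[] zones = {'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M',
                        'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X'};
        int index = Math.min((int) ((latitude + 90.0) / 9.0), zones.length - 1);
        return zones[index];
    }
}
```

With a check like this, a shorted sensor produces a stream of exceptions a monitor can notice, instead of a confident, wrong answer.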
A software engineer could’ve saved Jimmy’s life.

But It’s Gotta Get Done!

I’m the kind of person who’s OCD enough that I obsess over this kind of stuff even for my rat fuck website that nobody gives two shits about. But I know the trade-off between “has to get done so we can survive” and “optimal”, so I too will cut corners to “get the job done.” That’s just being human. “Pragmatic,” some might say.
You know what? That’s cool. It’s ok when it’s my rat fuck little website if I miss two days’ worth of visitor traffic or write out an incorrect temperature. Sure, I’m not too happy. I missed some revenue or I looked like a fool. But in the end the stakes weren’t that high.
I’m sure that someone could write a sob story about how the loss of revenue led to me going out of business; leading to a butterfly flapping its wings in Japan and Global Warming. I’m sure someone else could characterize this story as similar hyperbole.  But the stakes are different in different situations, and they matter. If some financial analyst will act like it’s the end of the world because his report has commas where there should be periods, and we’re willing to be meticulous to avoid such “disastrous gaffes”, it seems reasonable to extend at least that level of care to Little Jimmy.
This story might seem extreme. When you Care About Your Craft and try to THINK about programming, it’s easy to get passionate and offended--morally offended-- when you see something like this. Being the ones who build core components of large systems, it’s easy to forget how the system is used in its periphery. But that is the most important and most delicate place to be. There’s no place for copy-paste coding in framework development.

Bottom Line

Don’t blindly copypasta code you don’t understand when the stakes actually matter. Treat your craft with at least the same respect and care that you demand from your short-order cook for your meals. If you’re working on a task where people’s lives could be on the line, try to remember Little Jimmy.

Software Architecture and the Parable of the Megapode

This is a story best told in Douglas Adams' delightful voice.

But, if you don't have the time or motivation to listen to a few minutes of adventure, here's the transcription.

"I have a well-deserved reputation for being something of a gadget freak. And I'm rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me ten seconds to do by hand. Time is valuable, and ten seconds worth of it is well worth the investment of a day's happy activity working out a way of saving it.

The bird we came across was called a megapode, and it has a very similar outlook on life. It looks a little like a lean, sprightly chicken, though it has an advantage over chickens in that it can fly, if a little heavily, and is therefore better able to escape from dragons, which can only fly in fairy stories--and in some of the nightmares with which I was plagued while trying to sleep on Komodo.

The important thing is that the megapode has worked out a wonderful labour saving device for itself. The labour it wishes to save is the time-consuming activity of sitting on its nest all day, incubating its eggs, when it could be out and about doing things. I have to say at this point that we didn't actually come across the bird itself, though we thought we glimpsed one scuttling through the undergrowth. We did, however, come across its labour-saving device, which is something that is hard to miss.

It was a conical mound of thickly packed earth and rotting vegetation. About 6 feet high, and 6 feet wide at its base. In fact, it was considerably higher than appeared, because the mound would've been built on a hollow in the ground, which would itself have been about 3 feet deep.

I just spent a cheerful hour of my time writing a program on my computer that would tell me instantly what the volume of the mound was. It's a very neat and sexy program, with lots of popup menus and things, and the advantage of doing it the way I have is that on any future occasion on which I need to know the volume of a megapode nest, given its basic dimensions, my computer will tell me the answer in less than a second, which is a wonderful saving of time! The downside, I suppose, is that I cannot conceive of any future occasion on which I am likely to need to know the volume of a megapode nest...but, no matter! The volume of this mound is a little over 9 cubic yards.

What the mound is, is an automatic incubator. The heat generated by the chemical reactions of the rotting vegetation keeps the eggs that are buried deep inside it warm. And not merely warm! By judicious additions or subtractions of material to the mound, the megapode is able to keep it at the precise temperature which the eggs require in order to incubate properly.

So, all the megapode has to do to incubate its eggs is merely to dig 3 cubic yards of earth off the ground, fill it with 3 cubic yards of rotting vegetation, collect a further 6 cubic yards of vegetation, build it into a mound, and then, continually monitor the heat it's producing and run about adding bits or taking bits away. And thus, it saves itself all the bother, of sitting on its eggs from time to time.

This cheered me up immensely."

The next time you're tasked with a big, hairy ball of mud on your task list, ask yourself..."Am I being a Megapode?"