Monday, December 31, 2012

Questions from the Mailbag: On Static Imports

"Something I noticed today while I was looking over the Mockito API...they suggest importing the library statically. So I did, and I noticed JUnit was imported statically as well in Eclipse, but other classes/libraries in my Unit Test class were not. What I'm asking then is why do the static import, why does it seem to be important in testing, and when might one want to do it otherwise, if at all?"

 Excellent question! 
Static imports are basically used in Java to allow C/C++ style programming. I fucking hate them. Essentially, they are a feature that was added because programmers are lazy and hate to type. Rather than referencing the class a static method is pulled from, with a static import you can save yourself a few characters.
The argument against them is that they pollute your namespace and are not explicit. 
Polluting your namespace means that if you have a method with the same name as a statically imported method, you can end up with ambiguity errors, or with your local method silently shadowing the import. This may not be a problem, depending on the class in question. For example, org.junit.Assert's assertEquals() is unlikely to appear in many places. But if you were writing the class OpinionatedCalculator, you may have a problem.
The bigger problem here is that namespace collision is obviously more likely with classes that have more methods. If I statically import something like org.apache.commons.lang.StringUtils, I'm more vulnerable to collisions on a common name like contains().
One could argue this is because StringUtils is too large: in theory a static import on a small class, with relatively few methods (say, three), is "relatively safe."
I say fuck that theory. It's not that much more work to type Assert.assertEquals() than assertEquals(), and it saves me from having to look up at the top of the file to figure out where some magical global method comes from. In that sense, I say it's much more "explicit": The code is obvious the second you look at it.
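For the visually inclined, here's roughly what the trade looks like in a made-up JUnit 4 test (OpinionatedCalculatorTest is hypothetical, obviously):

    import org.junit.Assert;
    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class OpinionatedCalculatorTest {

        @Test
        public void addsTwoNumbers() {
            // With the static import: terse, but you have to scroll to the top
            // of the file to see where assertEquals() comes from.
            assertEquals(4, 2 + 2);

            // Fully qualified: a few extra characters, and the origin is
            // obvious the second you look at it.
            Assert.assertEquals(4, 2 + 2);
        }
    }

Two extra words per assertion. That's the entire cost of being explicit.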

You know I like my code like I like my porn:
Explicit is Better Than Implicit. 
http://www.python.org/dev/peps/pep-0020/
That said, if you browse through enough of my code on github, you'll find instances where I statically import JUnit. No one's perfect. Sometimes I flip my opinion based on how lazy I'm feeling that day, because I have the option to. Which is exactly why I say it's a bad language feature: I shouldn't have the option to.
Of course, opponents of that claim would say that code is longer and hence uglier with extra noise words. If the class name is unnecessary and it's clear that the static import makes sense, favor brevity. But if there's one thing you should know about me by now, it's that I like to go on...and on...and on.
Fuck static imports.

Wednesday, November 28, 2012

A case study in expressive (domain specific language) programming.

I was asked to write a cache for an HTTP request. My first instinct was to add a static field and be done with it. Then I thought about staleness. I could've added another field for a Date, and turned my class into an unmaintainable mess trying to manage freshness...instead I factored out.

I noticed, as I was writing the new class, that I wasn't in a dry, soulless programming mood. I could've described everything in terms of caches, maps, requests, buffers, etc...instead, I decided to make it like telling a story: the PreviousJourneys of Corey the HttpCourier.

I noticed that by using expressive names like rememberVisit() instead of putDataInCache(), and by collapsing expressions like hasVisited(location) && tripTakenRecently() into a single method, visitedRecently(location), I'm able to tell a fairly comprehensible story.

It could use some work, of course. It's not perfectly consistent in names. The logic is 3 am logic. Technically it's probably better to make a Journey that has place, experience, and date, as it makes more sense to say "If I haven't visited X in 7 days I probably wanna see what's new there" rather than "I haven't been anywhere in 7 days so I'm going to X, damn it!"

But the point of the exercise is that I think most people could follow it.
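For flavor, here's a rough reconstruction of the shape of the thing. It is not the actual class; the fields and the 7-day freshness window are stand-ins, but the names come straight from the story:

    import java.time.Duration;
    import java.time.Instant;
    import java.util.HashMap;
    import java.util.Map;

    class PreviousJourneys {

        private final Map<String, String> experiences = new HashMap<>();
        private final Map<String, Instant> visitTimes = new HashMap<>();
        private final Duration freshness = Duration.ofDays(7);   // assumed window

        void rememberVisit(String location, String experience) {
            experiences.put(location, experience);
            visitTimes.put(location, Instant.now());
        }

        boolean visitedRecently(String location) {
            return hasVisited(location) && tripTakenRecently(location);
        }

        String recallVisit(String location) {
            return experiences.get(location);
        }

        private boolean hasVisited(String location) {
            return experiences.containsKey(location);
        }

        private boolean tripTakenRecently(String location) {
            Instant lastTrip = visitTimes.get(location);
            return lastTrip != null
                    && Duration.between(lastTrip, Instant.now()).compareTo(freshness) < 0;
        }
    }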

I think this was the point of the Abbot Method, CRC cards, Use Cases, and all of OOP. It's what BDD is driving at. If you're writing code powerfully, it should tell the story of the system in a way that's obvious to most people who have some idea what's going on. Knuth's Literate Programming is probably quite far along the spectrum to Codevana...maybe even farther than LISP. *gasp*

Wednesday, September 5, 2012

DON'T WRITE CODE LIKE THIS OR A POOH BEAR WILL PUNCH YOU.

Think about Winnie the Pooh.


Winnie is the sweetest, kindest, lovable ol' Pooh Bear you ever did see.

Winnie the Pooh makes me happy.

You know what doesn't make me happy?

Code that looks like this.

That code looks pretty complex, to new eyes. Especially if you're looking at it after trying to figure out why someone else's library, which you've cloned, doesn't pass all of its regression tests. You might look at that and have a few moments of doubt. Concern. Do I understand how this library is used?

Then you stare at it for a minute. You realize that the intent of this Miracle of Indirectness is to find if a key is contained in a collection of registered services. That's not hard.

One can easily think of doing that in a way that makes sense. You basically have a Map between Service and Configuration. If your test is supposed to modify, or not modify, one of these values, you can easily ask.

Instead, this test prefers to go through the list once, in place, rather than building the map. Ah. Efficiency. How delightful. Except (pun-ish-ment) that the loop uses JUnit's assertEquals() in the body of a try block, which throws a ComparisonFailure when the two values aren't equal.
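The code itself isn't reproduced here, but the shape of it, reconstructed from the description with stand-in Service/Configuration classes, is something like this:

    import java.util.List;
    import java.util.Map;

    import org.junit.Assert;
    import org.junit.ComparisonFailure;

    // Service and Configuration are stand-ins, not the library's actual classes.
    class ServiceRegistryExample {

        static class Service {
            final String key;
            Service(String key) { this.key = key; }
        }

        static class Configuration { }

        // The shape of the anti-pattern: assertEquals() inside a try block,
        // with the thrown ComparisonFailure doing the loop's flow control.
        static boolean containsViaExceptions(List<Service> services, String expectedKey) {
            for (Service service : services) {
                try {
                    Assert.assertEquals(expectedKey, service.key);
                    return true;                        // found it
                } catch (ComparisonFailure ignored) {
                    // not this one; the exception IS the control flow
                }
            }
            return false;
        }

        // The same question asked without the gymnastics: build the map, ask it.
        static boolean containsViaMap(Map<Service, Configuration> registry, String expectedKey) {
            for (Service service : registry.keySet()) {
                if (expectedKey.equals(service.key)) {
                    return true;
                }
            }
            return false;
        }
    }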

This means that we're using Exceptions for flow control. DON'T USE EXCEPTIONS FOR FLOW CONTROL!!!
JVM implementers expect exceptions to be set up often but thrown rarely, so the throwing path is allowed to be quite slow. Throwing an exception (and filling in its stack trace) is one of the most expensive operations in Java, surpassing even new. On the other hand, don't forget the first rule of optimization (RulesOfOptimization).
So, in trying to be efficient, we've completely destroyed the point of efficiency? Well, at least the bottom of the method is self-documenting.

I know, I know. I shouldn't care so much about what I do. "It's all bullshit anyway", right? If it works, yay! If not, "make it work, whatever it takes." It's not in fashion to try to be, ya know...Elegant. Precise. Crisp.

Whatever. It's not my library. I don't have to care. I just have to try to make a contribution that will fix my problem, and if it inadvertently breaks half the world, e.g. hundreds of other packages that depend on this one....FUCK EM! Right?

Write tests for your new code. Don't worry about the horrible legacy you're treading upon...But then I have to look at code like this, and just ignore it.

WTF bro? That's not even a test. If you're going to test for the persistence of a dynamically generated file in a transient environment, that's hard. But probably doable. It involves actually writing a condition to test.

But writing tests like this is worse than just leaving a fail(). Glanced at naively in a report, this test looks like it's doing something intelligent. Then you actually read it. You know what this causes?



He's coming for your face, honey. He's.coming.for.your.face.

Sunday, August 12, 2012

The Parable of Little Jimmy


Gather round, children. I’d like to tell you a story. A story about a boy named Jimmy.

Meet Jimmy

Jimmy was a sweet young boy from a quiet Midwestern town. Jimmy grew up loving sports, playing his guitar, and volunteering at the old folks’ home. Jimmy met Sally, the love of his life, in high school. When he dropped her off and kissed her good night after taking her to prom, he knew they were Promised. They’d be together forever. He proposed, after graduation, and they were wed. Sally was 6 months pregnant when Jimmy had to go.
You see, Jimmy loved his country. He believed in freedom, duty, honor, and helping people. So Jimmy enlisted to serve his country and protect those he loved. Mom and Dad were so proud of him. Sally was worried, but proud of him too. 19 years old, but already he was such a good man.

Jimmy Goes to War

Jimmy joined the Air Force. He wanted to be a helicopter pilot. To be called on to support the cavalry and save his injured brothers. He worked really hard, suffered through boot camp and daily struggles with humility and determination. He made pilot, and everyone was thrilled.
One day, Jimmy was flying routine reconnaissance over the hills of Afghanistan when the unthinkable happened. A terrorist with an RPG came out of the woodwork and fired at his chopper. His sensors didn’t pick it up until the last minute. The rocket struck, and he went into a tail spin. In shock and trying to recover, he managed to crash the chopper into the soft side of a hill and escape only with a broken leg and some bruises. He dragged himself out of the wreckage and found a nearby cave for shelter, praying for his comrades to come find him.

Save Jimmy!

He didn’t think it would take long. Jimmy’s chopper and combat suit were equipped with Next Generation Defense Technology. The technology included location-aware services. In case of an emergency, his technology should report issues to central command. They would come for him soon. His life would be saved.
Unfortunately, what Jimmy didn't know was that his technology had been marginal from the very beginning. The devices that took the location readings were sometimes flaky, though good 90% of the time. This time, a sensor failed. But it didn't fail by just breaking...an electrical short caused its readings to overflow. It kept collecting numbers, but the numbers were just noise.

No Problem…right?

The programmers who had developed the location based technology considered themselves pretty smart and capable people. They had a job to do, and it was their job to get it done as quickly and efficiently as possible. So when the developer who was working on the geolocation piece found an article by IBM describing the characteristics of such a system, his task seemed easy. http://www.ibm.com/developerworks/java/library/j-coordconvert/  
All he had to do was just copy paste the code, write some tests for the common scenarios, and he was done. After all, it was a blog post on IBM teaching about the stuff. It had to be right. Right?

Not So Fast…

But people forget that tutorial code is often written as a proof of concept. Most authors themselves adamantly state that they are showing an idea, but the code is “not production worthy.” But how do you determine “production worthy?” Besides, this is just some small utility feature in a much larger system. So nobody thought twice about the LatZones class https://gist.github.com/3251291. Copypasta, tests written for common workflows that pass, problem solved.
The system was used for a long time by a lot of people. The Air Force trusted it with soldiers' lives. During the acceptance testing phase there'd been some political issues, but officers were able to negotiate a workable solution and push it through, albeit hurriedly. The system was in place for a while, and seemed to work. After a while, it became a given, and nobody gave pieces like LatZones much thought.
So when Jimmy's broken sensor kept feeding negative infinity into the geolocation code, and the automated rescue dispatcher that had been introduced into the system sent soldiers to Zone A, nobody questioned it. They combed through the area, but found nothing. Confused, nobody could figure out what happened.

The End of Jimmy

Jimmy, meanwhile, ran out of survival rations. Desperate, hungry, and in pain, he tried to figure out where he was. When he finally despaired of anyone coming for him, he set off for civilization on his own. Exposed and in the open, he was caught by the terrorist cell that had shot down his chopper. They tortured him, and eventually killed him.

Who’s to Blame?

The tutorial author, for not writing code that someone else could just copy and paste from a blog post? The programmer, for trying to do a good job under a tight deadline? The ones who rushed the system through acceptance? Was it just a bit of bad luck, something we can do nothing about?


No

Jimmy didn’t have to die. Jimmy didn’t have to spend his excruciating last moments in the depths of interminable pain, thinking only of Sally and crying out to God for release. Jimmy could’ve lived.

How It Might Have Been

A software engineer tasked with developing a geolocation system might have used the IBM article as a reference point: an informative source for thinking about the concerns such a system should address. He might've started by copy-pasting pieces, but he would've stopped to think about the different possible flows of execution, and about what the values of parameters could be and should be. He might've started writing unit tests for these combinations, discovered the weirdness of this calculation, and fixed it.
Alternately, he might've come up with an entirely different design that met the same constraints, without messy double calculations in tightly bounded ranges. When getLatZones() got negative infinity, it might have thrown an exception. The system might've tracked such exceptions, and if it noticed they were thrown repeatedly within a time interval it might've fallen back to a different strategy, perhaps triangulating based on historical data.
A software engineer could’ve saved Jimmy’s life.
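To make that concrete, here's a sketch (in Java, with a simplified handling of the polar zones) of what a defensive latitude-band lookup might look like. It's illustrative, not the actual LatZones code:

    class DefensiveLatZones {

        static char getLatZone(double latitude) {
            if (Double.isNaN(latitude) || Double.isInfinite(latitude)
                    || latitude < -90.0 || latitude > 90.0) {
                // Don't quietly map garbage to "Zone A"; fail loudly so the
                // caller can fall back to another strategy (history, triangulation...).
                throw new IllegalArgumentException("latitude out of range: " + latitude);
            }
            if (latitude < -80.0) {
                return 'A';   // south polar band (A/B split by longitude omitted here)
            }
            if (latitude >= 84.0) {
                return 'Z';   // north polar band (Y/Z split by longitude omitted here)
            }
            // UTM latitude bands C..X (I and O skipped), 8 degrees each, X is 12.
            final char[] bands = {'C','D','E','F','G','H','J','K','L','M',
                                  'N','P','Q','R','S','T','U','V','W','X'};
            int index = (int) Math.floor((latitude + 80.0) / 8.0);
            return bands[Math.min(index, bands.length - 1)];
        }
    }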

But It’s Gotta Get Done!

I’m the kind of person who’s OCD enough that I obsess over this kind of stuff even for my rat fuck website that nobody gives two shits about. But I know the trade-off between “has to get done so we can survive” and “optimal”, so I too will cut corners to “get the job done.” That’s just being human. “Pragmatic,” some might say.
You know what? That’s cool. It’s ok when it’s my rat fuck little website if I miss two days’ worth of visitor traffic or write out an incorrect temperature. Sure, I’m not too happy. I missed some revenue or I looked like a fool. But in the end the stakes weren’t that high.
I’m sure that someone could write a sob story about how the loss of revenue led to me going out of business; leading to a butterfly flapping its wings in Japan and Global Warming. I’m sure someone else could characterize this story as similar hyperbole.  But the stakes are different in different situations, and they matter. If some financial analyst will act like it’s the end of the world because his report has commas where there should be periods, and we’re willing to be meticulous to avoid such “disastrous gaffes”, it seems reasonable to extend at least that level of care to Little Jimmy.
This story might seem extreme. When you Care About Your Craft and try to THINK about programming, it's easy to get passionate and offended--morally offended--when you see something like this. When you're one of the people building core components of large systems, it's easy to forget how the system is used at its periphery. But that is the most important and most delicate place to be. There's no place for copy-paste coding in framework development.

Bottom Line

Don't blindly copypasta code you don't understand when the stakes actually matter. Treat your craft with at least the same respect and care that you demand from your short-order cook for your meals. If you're working on a task where people's lives could be on the line, try to remember Little Jimmy.

Software Architecture and the Parable of the Megapode

This is a story best told in Douglas Adams' delightful voice.

But, if you don't have the time or motivation to listen to a few minutes of adventure, here's the transcription.

"I have a well-deserved reputation for being something of a gadget freak. And I'm rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me ten seconds to do by hand. Time is valuable, and ten seconds worth of it is well worth the investment of a day's happy activity working out a way of saving it.

The bird we came across was called a megapode, and it has a very similar outlook on life. It looks a little like a lean, spritely chicken. Though it has an advantage over chickens in that it can fly, if a little heavily, and is therefore better able to escape from dragons, which can only fly in fairy stories--and in some of the nightmares with which I was plagued while trying to sleep on Komodo.

The important thing is that the megapode has worked out a wonderful labour saving device for itself. The labour it wishes to save is the time-consuming activity of sitting on its nest all day, incubating its eggs, when it could be out and about doing things. I have to say at this point that we didn't actually come across the bird itself, though we thought we glimpsed one scuttling through the undergrowth. We did, however, come across its labour-saving device, which is something that is hard to miss.

It was a conical mound of thickly packed earth and rotting vegetation. About 6 feet high, and 6 feet wide at its base. In fact, it was considerably higher than appeared, because the mound would've been built on a hollow in the ground, which would itself have been about 3 feet deep.

I just spent a cheerful hour of my time writing a program on my computer that would tell me instantly what the volume of the mound was. It's a very neat and sexy program, with lots of popup menus and things, and the advantage of doing it the way I have is that on any future occasion on which I need to know the volume of a megapode nest, given its basic dimensions, my computer will tell me the answer in less than a second, which is a wonderful saving of time! The downside, I suppose, is that I cannot conceive of any future occasion that I am likely to need to know the volume of a megapode nest...but, no matter! The volume of this mound is a little over 9 cubic yards.

What the mound is, is an automatic incubator. The heat generated by the chemical reactions of the rotting vegetation keeps the eggs that are buried deep inside it warm. And not merely warm! By judicious additions or subtractions of material to the mound, the megapode is able to keep it at the precise temperature which the eggs require in order to incubate properly.

So, all the megapode has to do to incubate its eggs is merely to dig 3 cubic yards of earth out of the ground, fill it with 3 cubic yards of rotting vegetation, collect a further 6 cubic yards of vegetation, build it into a mound, and then continually monitor the heat it's producing and run about adding bits or taking bits away. And thus, it saves itself all the bother of sitting on its eggs from time to time.

This cheered me up immensely."

The next time you're tasked with a big, hairy ball of mud on your task list, ask yourself..."Am I being a Megapode?"

Friday, July 13, 2012

Why is TDD awesome?


Because it gets you solving problems.


I've been reading a lot of papers and have shifted my attention over to self-adaptive systems since the start of the year. I've become familiar with different researchers' thoughts on the subject, and the methodologies and tools they're trying to develop. They seem too disconnected from practice and too abstract to be practical. So I'm going to build my own.

That's been the plan. I've been doing a lot of writing, for my dissertation, on how I'm going to build it. How it's going to be different. Why it's useful and will be a game changer, etc etc.

Then I did a random Google Search today:
http://avalanche123.com/blog/2012/06/13/how-i-built-a-self-adaptive-system/

Son of a bitch. I've been looking at this problem since January. I want to make it the main focus of my research. Earlier this year I collaborated with Fractal Innovations LLC to design a prototype...but I hadn't gotten a concrete deliverable out of it. Now Some Dude seems to be moving in the same direction.


OH NOES.


The dreaded graduate student nightmare: Invest a whole bunch of time and energy into a problem. Meet with your committee.

"Are you sure this is a problem? It looks like this problem is already solved..."

Not gonna let that happen.

So I read his post, ended up looking at Fuzzy Sets...and it looked kind of intimidating.

A Fuzzy Logic based Expert System?
That seems fine.

Building a framework to build Fuzzy Logic Based Expert Systems, rapidly, conveniently, and following best practice software engineering principles?
Seems hard, bro.

...What can I do?

But I decided to play with the problem. See what I could come up with. I tried to build a system that could process the first, seemingly simple statement:   

IF brake temperature IS warm AND speed IS not very fast
  THEN brake pressure IS slightly decreased.

Being a framework designer at heart, my mind immediately tried to be a True Computer Scientist.

The True Computer Scientist

You see, the True Computer Scientist's brain is much, much bigger than yours or mine.

The True Computer Scientist is able to see the generalizations and abstractions beyond a given problem.

The True Computer Scientist doesn't worry about piddling, puny, simple logic flows about brake pressure and brake temperatures.

The True Computer Scientist analyzes the structure of the statements, derives a general form that applies in almost any context, and builds The System that manifests the heart and soul of complex decision making in an uncertain world. 

The True Computer Scientist solves problems at a level so meta that you have to think about thinking just to follow along, and you're still not quite there yet.

The True Computer Scientist uses recursion and abstract data types with strongly checked invariants to determine optimal solutions, seamlessly wielding matrix mathematics and monads while you're still trying to figure out what the fuck that means.

The True Computer Scientist is a God, Puny Mortal.
...and I am just a dumb chimp.

Would a thousand monkeys put in front of computers create Google?

But every time my mind comes upon a problem like this, it immediately tries to be the True Computer Scientist. My mind raced to develop classes for Rules, Antecedents, and Consequents. I thought about building an expression engine for evaluating nested rules...and the more I tried to solve these difficult problems analytically, in my head, ahead of time, the more scared I became.

Woah. This is A Big Problem. I don't think I can do this.

"Slow down there, stupid monkey. Solve this problem. The framework can come later."

Just like that, suddenly I moved from Saving the World and Getting the Girl to Make It to the Next Save Point. Suddenly that fear that I'd felt was subdued for a moment. I started solving the problem. 

An hour later, I had a working prototype that solved that problem. Plus, along the way, I'd derived the abstractions I need to build the framework.

How? 


Simple. I've studied a lot of software design and architecture. Now I'm able to think in terms of patterns and principles. Which means that, as I'm making my bar turn green, I also say

"Ah, this is something that's likely to change. Instead of adding another property to this class, I'll add a Value Object that this Entity uses."

I write the domain logic in. When I see patterns in what I'm doing, I immediately refactor to a Strategy or extract interfaces. I end up representing First Class Entities like Sensors, just in trying to keep things DRY. I flow from

getting a speed reading from a Speedometer --> having an Analyst analyze a Sensor using a Strategy --> having an Agent act on a Perception using a Strategy --> having an Agent perform an Action using an Actuator with input from a Perceptor.
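To give a flavor of what emerged, here's a boiled-down sketch. The names Sensor and Analyst come from the flow above; MembershipFunction and the exact shapes are guesses at the idea, not the actual prototype:

    interface Sensor {
        double read();                      // e.g. brake temperature, speed
    }

    interface MembershipFunction {          // the Strategy: "warm", "very fast"...
        double degreeOfMembership(double crispValue);
    }

    class Analyst {
        private final Sensor sensor;
        private final MembershipFunction term;

        Analyst(Sensor sensor, MembershipFunction term) {
            this.sensor = sensor;
            this.term = term;
        }

        // "brake temperature IS warm" becomes a number between 0 and 1.
        double evaluate() {
            return term.degreeOfMembership(sensor.read());
        }

        // Usage, with a toy ramp from 60 to 100 degrees:
        // Analyst brakeTempIsWarm =
        //         new Analyst(() -> 78.0,
        //                     t -> Math.max(0.0, Math.min(1.0, (t - 60.0) / 40.0)));
    }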

Pretty Cool

TDD enables framework design. Rather than trying to come up with the optimal design up front, design emerges from the components involved. Even better, it lets you start solving small problems quickly, adding value as you scale to the intended result. 

I know that I've become a much better software developer as a result of focusing my technique and being willing to try a different workflow. I regularly meet programmers who are dubious about changing their workflow, and who often default to denouncing innovations in our space as "another Silver Bullet." I highly encourage them to give this material a try.


It also gets better the more you involve and integrate. Reading Evans' Domain Driven Design or Fowler's Patterns of Enterprise Application Architecture is fine, in and of itself...but when you start building software that integrates those concepts, and add elements such as Kent Beck's Test Driven Development: By Example and Robert Martin's Clean Code, your very coding style and philosophy are tremendously improved. The whole is greater than the sum of the parts.


I'll be illuminating just how much perspectives can change in upcoming posts: 

The Android SDK: Yet Another Successful Big Ball of Mud 

HTML5 is an Abject and Miserable Failure of Responsibility and Design. 


Stay tuned.

Monday, April 16, 2012

TDD and Design Patterns are Just Silver Bullets, *IX doesn't need 'em! Also, Bronze > Iron

Hey List,

I'm working with a simulations company that wants to analyze, document, and convert some VB Legacy code to C#. The documentation process for the original software is already well underway. (I've been working on it for the past week and some.) At first I was under the impression that they wanted me to update the original VB 6 code to VB 10 - for readability and to optimize the current version of the software - then convert everything to C# for the new platform. However, after analyzing the modules I've been given so far (documenting most of them) the software developers and I have come to varying conclusions regarding the quick and effective completion of both the documentation and conversion processes before our deadline. The developers aren't sure how to go about this on account of the fact that they've only been updating the original source code - never analyzing or modifying it completely. They've recognized the need for a complete overhaul in order to deploy the product on more platforms for some time and have just begun. We seem to be leaning towards analysis, conversion, verification, then documentation. Our deadline for project completion is approximately six months away.

Planning aside, one particular contingency we're encountering is the variation in data [and subsequently, file] outputs during code conversion for the same algorithms. The simulation will not function correctly on the new platform if this persists.

The question I have for those of you with experience in this process is, how can the developers and I go about this in a way that allows us to verify each step without spending too much time on documentation and analysis of the original source code? Any suggestions would be awesome. We're likely to decide within the next 48 hours and go from there.


My response

We seem to be leaning towards analysis, conversion, verification, then documentation. Our deadline for project completion is approximately six months away.

The process you're describing is probably sub-optimal. Chances are high you're going to use a Tiger Team to re-architect a Stovepipe System. This is fairly common in green field development, and chances are after about 8 months it'll look as hairy as the current system, just with newer syntax.

I'd advise a more disciplined approach.

Start writing learning tests for the existing system, using those as documentation and specification baselines for the new system.


Read Working Effectively With Legacy Code and TDD:By Example concurrently.

Try using SpecFlow to write the specs. Use source control and a continuous integration tool like Jenkins to make sure you're not breaking existing builds as you refactor pieces of the system.

Since you're seeking platform independence, at some point you'll need to re-architect the system to not make as many implicit assumptions about the model. You'll want to push framework-specific concerns (file I/O, DBs, web services, etc.) behind your own internally defined APIs. These will show you the way:

Clean Code, Domain Driven Design, and Head First Design Patterns

You can probably encapsulate the algorithms with a combination of Strategy and Template Method, using Abstract Factory to put related pieces together.
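Sketched in Java rather than C# (the shape is the same), with made-up names, that combination looks roughly like this:

    // Sketch only: illustrative names, not your codebase.

    // Template Method: the outline of a simulation step is fixed...
    abstract class SimulationAlgorithm {
        final double run(double[] inputs) {
            double[] normalized = normalize(inputs);
            return integrate(normalized);           // ...the specific steps vary.
        }
        protected abstract double[] normalize(double[] inputs);
        protected abstract double integrate(double[] inputs);
    }

    // Strategy: output handling varies independently of the algorithm.
    interface OutputWriter {
        void write(double result);
    }

    // Abstract Factory: one place knows which related pieces go together, so
    // the rest of the code never asks "which platform am I on?"
    interface SimulationFactory {
        SimulationAlgorithm algorithm();
        OutputWriter writer();
    }

    class LegacyCompatibleFactory implements SimulationFactory {
        public SimulationAlgorithm algorithm() {
            return new SimulationAlgorithm() {
                protected double[] normalize(double[] in) { return in.clone(); }
                protected double integrate(double[] in) {
                    double sum = 0;
                    for (double v : in) sum += v;   // stand-in for the real math
                    return sum;
                }
            };
        }
        public OutputWriter writer() {
            return result -> System.out.println("result=" + result);
        }
    }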

tl;dr: Don't use the Waterfall, which is what you're describing. You need to start manifesting the existing behavior of the system in executable code that can serve as the baseline for the new system. This is a well-studied problem. Start reading the literature about how to evolve the Big Ball of Mud, and avoid Stovepipe or Wolf Ticket solutions.


Helpful feedback from a stubborn and argumentative friend:

I know I'm wasting my time and I would much rather have an in person discussion about this sometime but oh well. I might/will come across as rude but that's just the way I am, take no offense, I'm blunt and I appreciate bluntness myself. I apologize in advance.

I think you're a little crazy about your design patterns and software development process philosophy.

Bear with me for a minute. Try to imagine someone like me who has never read any design pattern book or resource actually describing them etc. reading your email (and not just this one but several you've sent to the list in the past... this one was just particularly... noteworthy)

Even better, imagine someone who has never even heard of design patterns and isn't a software developer/programmer at all. No background in it whatsoever.

Got that state of mind?

"Tiger Team", "Stovepipe System", "Green field development" "Strategy and Template System", "Abstract Factory" "Big ball of Mud", "Wolf Ticket"?

Are we still talking about coding or is this some sort of weird elementary school word game?


I hear stuff like this and I am agog.

Let me give you a few links of my own that I found in about 2 seconds of googling.

From Coding Horror:
"It's certainly worthwhile for every programmer to read Design Patterns at least once, if only to learn the shared vocabulary of common patterns. But I have two specific issues with the book:

  1. Design patterns are a form of complexity. As with all complexity, I'd rather see developers focus on simpler solutions before going straight to a complex recipe of design patterns.
  2. If you find yourself frequently writing a bunch of boilerplate design pattern code to deal with a "recurring design problem", that's not good engineering-- it's a sign that your language is fundamentally broken.

In his presentation "Design Patterns" Aren't, Mark Dominus says the "Design Patterns" solution is to turn the programmer into a fancy macro processor. I don't want to put words in Mark's mouth, but I think he agrees with at least one of my criticisms.

"

I don't always agree with Atwood/Coding Horror and in fact sometimes I disagree completely but in this case I agree wholeheartedly. (another post about design patterns that I don't agree on is that I don't think they're missing language features as paul graham thinks)

Design Patterns are not a Silver Bullet


I could go on but I'll just add a quote/paraphrase from Bjarne Stroustrup

Inheritance is one of the most overused and misused C++ features.

Since even the design pattern book itself says they're about the arrangement of objects and classes, if OO is not the best thing for a problem design patterns are automatically not directly relevant.

Object Oriented programming (with a capitals OO) is not always the answer and in fact I think the Java extreme everything is an object etc. is actually extremely bad and people try to do the same type of design in C++ and it's terrible.


I'm not saying I've never used (inadvertently and far from the way you would have implemented it) design patterns of some kind. I will say it's probably far less since I am a C/C++ programmer and I tend to work on lower level non-gui/non-user interface/database things like 3D graphics and low level IO etc. I don't think I've ever used inheritance in my own (non school required) code. I tend to use concrete type classes (ie vector and matrix types with overloaded operators etc.) and composition occasionally. I think it's better to have a few specialized classes than to try to generalize or create an inheritance tree that obfuscates the purpose and bloats the code/makes it slower etc. Also I have no problem with global variables/data, especially in my personal/prototype programs. Don't see any problem in general either as long as the main.cpp file isn't cluttered.

Actually I've yet to use inheritance in code written at work (Mars Space Flight Facility on campus where I worked on a large old C project ;), then Intel where I wrote some C testing code and mostly set up VMs crap and now at Celestech. I admit that the project I'm working on which is Internal R&D and I'm the sole developer already had some inheritance in it because it's a Qt project and Gui programming is one place where even I agree some inheritance is good/necessary . . . but again remember everything in a modern language could be written in C. Actually I think Linus has a point and C is great because it's simple and small and easy to understand/minimize complexity but he goes overboard and C++ is a great language that he just was exposed to before it was standardized/implemented fully.


If I were to argue some design/development methodology in contrast to yours it would be KISS/YAGNI (just learned the YAGNI acronym today but I've always believed it about over design). Also make it work make it right make it fast. where I define "right" and I iterate in an extreme/agile fashion more or less if I need to.


Finally, this:

" We seem to be leaning towards analysis, conversion, verification, then documentation. Our deadline for project completion is approximately six months away. "

You say this is suboptimal and waterfally. I couldn't disagree more. It may use waterfally sounding terms but what they've described is the most basic/straightforward/easy way to port/rewrite code.

read and understand it, rewrite it on the new platform/language, make sure it gets the same output and then document it. How is that not the most fundamental way to do it?

Also nowhere in the email does it say the original code is bad (besides the inherent badness of VB maybe) or badly designed etc.. Legacy doesn't necessarily mean bad/evil/badly designed code. The linux kernel is 20 years old and parts of it have hardly changed or haven't at all since the beginning. There are other examples.

the process you described, in my opinion, adds a ton of unnecessary work and complication to a simple process. The only thing you've mentioned that I would agree with for some projects is Jenkins/Hudson. But that is only useful for large, ongoing projects, not something like this, a simple rewrite/port.

Waterfall is CSE 360 where you waste over half the semester designing and documenting and creating sequence diagrams and UML diagrams and crap before you've written a line of code so they're all completely made up and useless..


Anyway, again I apologize and don't take offense. I know you're a good developer just very different with a very different coding experience.

To deconstruct every one of the arguments would be long and tedious, as it basically boils down to
  • I don't like your heavy use of professional jargon/terminology.
This is fair. I probably overdid it a bit, especially given the audience of the list. I was trying to use a shared lexicon to keep the transmission short, as I'm typically accused of writing too much. But in my haste to transmit a short message and provide directed pointers, I came off pedantic and "overzealous."
  • Because I don't understand the terminology, you must be wrong
Obviously he hadn't bothered to look up Legacy Code (equating age with Legacy is imprecise), or "Template Method" as opposed to "Template System", etc etc. The danger of trying to rely on a shared lexicon that's not ubiquitous rears its ugly head. But because he'd written code at a university, Intel, and other places, and it had worked, he must know a better way of writing code. "Because it works, it is correct and good."

We can build cars out of matchsticks and make them run. Does that mean all cars should be made of matchsticks?

We can build bridges with thin metal plates. Does that mean we should build all bridges that way?
  • OOP is not that great. I write C code every day, it's more performant and better and awesome.
C is not that great. You don't need function pointers and loops. You'd be much more performant in assembly.

Anything that can be written in a higher level language can be written in assembly. So go! Load words into registers. Go twiddle your bits because you'll be faster and your code will have simpler pieces. After all, who the hell understands pointers? Just load memory addresses into registers!

While the GoF book and much of the attendant literature has revolved around OO, it's naive and foolish to discard Patterns as an OO-only concept. Functional Programming is making great strides in uncovering Monads, such as the State Monad and the IO Monad, that serve a similar purpose: the solution to a problem in context. Agent-Oriented artificial intelligence systems are drawing great inspiration from patterns. All of science is converging on the idea that patterns are fundamental. Analysis Patterns, Implementation Patterns, Compiler Patterns: all of these are applicable whether the program is written in an OO, procedural, or functional style. Evans talks about this in Domain Driven Design as well.

C++ was perfectly fine with function pointers and multiple inheritance. Why did C++0x add lambdas? Because better tools for the job allow us to do our job more efficiently. That's the only reason to ever add to a language, because the increased expressivity and conciseness improve clarity. Rejecting the use of a shared vocabulary because it "adds cognitive overhead" is dubious.
  • Linux doesn't need all this junk. It's simple and awesome.
Curious, I asked a friend about an anecdote he'd shared with me.

I recently got into an interesting argument about quality code and its manifestations in various paradigms, with a guy who essentially argued that Design Patterns and TDD are too complex, the simplest thing is to rewrite a system from scratch and document it thoroughly, and that Linux was the bee's knees in quality and proof you didn't need things like tests to write good code. I'd like to reference your wireless card story and the post you found by Linus Torvalds. Could you try finding that again and posting it on my wall?
Found it, I'll post it. I'd argue that Linux is proof of a different concept, which gives it an advantage over how TDD is used in the real world: "with enough eyes, all bugs look shallow." Linux is remarkably stable without automatic tests, but this is because of the massive number of people who run test releases and the huge number of people looking at bug reports. With that user and dev base, any problem that's found will probably have an obvious solution to somebody. Tests would still help from time to time, but Linux can squeak by without them in ways that software developed by a small team never could. One could ask, which is more complicated: design patterns and TDD, or building a dev base hundreds of thousands strong and a user base millions strong that's willing to accept major bugs?

I really think this marvelous piece from his post says it all.


Linus Torvalds | 12 Jan 06:20

Re: brcm80211 breakage..

On Wed, Jan 11, 2012 at 8:15 PM, Larry Finger <lwfinger.net> wrote:
>
> I see no difference in the core revisions, etc. to explain why mine should
> work, and yours fail.

Maybe your BIOS firmware sets things up, and the Apple Macbook Air
doesn't? And the driver used to initialize things sufficiently, and
the changes have broken that?

Apple is famous for being contrary. They tend to wire things up oddly,
they don't initialize things in the BIOS (they don't have a BIOS at
all, they use EFI, but even there they use their own abortion of an
EFI rather than what everybody else does), yadda yadda.

But the real point is: it used to work, and now it doesn't. This needs
to get fixed, or it will get reverted.

Linus

I'm not trying to troll Linux or be a jackass, but every time a neckbeard tells me how awesome Linux is and it doesn't need all these "Silver Bullets", I wanna refer them back to this thread. If the patch committer had had an automated regression suite that he could've run before committing, and specific pieces like the MacBook Air wireless driver initialization had been mocked out such that it could've red barred, then the Inventor of Linux's day wouldn't have been ruined on something like this when he OBVIOUSLY has better things to do.

This speaks to a larger point. A sociological one. I replied to my friend as such.

I'd argue that Linux is a byproduct of its time.

Unix could've been written in LISP, but the hardware wasn't performant enough yet at the time to handle the case. In the same way that we could build our cars out of carbon nanotubes, except it's too expensive. Different technologies are appropriate at different times. Iron probably existed in the Bronze Age, but it was too expensive and not readily available enough to build weapons, armor, and pottery out of.

So some very clever hackers used the best tools they had at the time, building languages off of FORTRAN like B and C, and made tools out of them. These tools got agglomerated, and eventually became an operating system. It works, and it's good. But "it works" does not necessarily imply optimal.

Linux was built off of Unix because the most popular OSes when Torvalds wrote it were Windows and Macintosh, which had demonstrable security and stability issues. Unix had a vibrant but small user base in academia, the military, etc...Torvalds made it more available for commercial use.

Social attitudes change with time. Manifest Destiny was a popular American mantra in the 1800s. In modern times, were new land to be found and attempted to be colonized even though indigenous tribes lived on it, it is unlikely that the idea of "enlighten the Noble Savage" or "drive the wild animals off the land" would be as socially acceptable.

Ways of doing things change, as well. Henry Ford went very far with the assembly line. The American car companies were very happy with their production model all through WWII, the 1960s, the 1970s...and their lunch got eaten in the 1980s/90s/00s. The Japanese improved on the process models with Lean Manufacturing, Six Sigma, and applying effort to Eliminate Waste and improve efficiency.

One can argue that TDD is not new. REPLs have been around for decades. But there's a difference between writing a program by typing a little bit into a REPL until your code works, then throwing your micro-experiments/code doodles away and using the working code as the finished product, versus storing those "doodlings" as executable tests that assert behavior--which can act as specifications, check boundary conditions, and even influence the design process.

A program, at the end, is just the solved equation. The regression test suite and commit log history show your work. They store the Tribal Memory that went into your code, as code. It's really hard to overstate how important that is.

Why does your teacher at school not accept assignments that don't show their work? Because the process for arriving at a solution is as important as the solution! We're not talking about "daily standups" and "ScrumMasters" or anything here. We're talking about the tools you used for the job, and the way you applied those tools to solve the problem.

Design Patterns have always existed. Identification and codification of them is merely seeking to create a language to express complex ideas more succinctly. A lot of people believe they're "unnecessary complexity", and make arguments to keep it simple. Ask them:

Do they believe in loops? You don't need a loop structure. You could just write the same code 10 times.

Do they believe in methods? You don't need methods. You could just write it all in int main().

Do they believe in classes? You don't need classes. You could just write it all in one file.

Do they believe in modules? You don't need modules. You could just put all your classes in one place.

Do they believe in inheritance and polymorphism? You don't need those words. You could just say "you write a class that extends another class, and it gets all of the properties and methods of the class it extends." or "you can treat multiple different classes as if they're the same."

So if they believe in all of these things, what's so hard about saying Template Method as opposed to "I write a class where I can define the outline of an algorithm and replace specific operations dynamically by extending it and providing a new implementation"?

Your point is well taken. Linux "works" because it has a wide user base and leverages Amazon's Mechanical Turk model or Google PigeonRank.

But is working "good enough?" Or does the community actively accept stagnation if it doesn't seek to incorporate new process models into its ecosystem? It didn't work out that well for American auto. Why should it work well for Linux?
People who decry the advancement of programming knowledge as "old wine in new bottles" and Silver Bullets don't understand that old systems are like Zombies that eat away at our brainpower.

I recently had a physics research faculty member tell me that he had a PhD student spend 6 months trying to add a new feature to a piece of code for his dissertation. The student ultimately failed. The feature amounted to augmenting a method in a deep, nested inheritance chain. We can write code better than that.

I've heard similar stories of researchers publishing papers in leading scientific journals based on code simulations that crash 30% of the time. The published results that are advancing the frontiers of science could be repeatable, or they could be a random bug.

Federal calls for the utilization of 5 billion cores to sequence genes and do things like fight cancer often allocate hundreds of hours to programs that may crash halfway through with array indexing issues, or thrash their way through their allotted time.

But maybe it's not a problem. Maybe this really is self-evident and we don't need such "Silver Bullets."

Is it clear to you that PermutationEstimator.cpp's evaluate method uses an array to represent the traversal of a Graph data structure, where each node is the array index and the next node to go to is the value? That evaluate() is using the Hungarian Algorithm to estimate the lowest cost permutation?
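If it isn't clear, here's the representation in miniature, with made-up data rather than PermutationEstimator.cpp's:

    // "Each node is the array index and the next node to go to is the value":
    public class ArrayAsGraph {
        public static void main(String[] args) {
            int[] next = {2, 3, 1, 0};     // node 0 -> 2, 1 -> 3, 2 -> 1, 3 -> 0
            int node = 0;
            StringBuilder path = new StringBuilder().append(node);
            do {
                node = next[node];
                path.append(" -> ").append(node);
            } while (node != 0);
            System.out.println(path);      // prints: 0 -> 2 -> 1 -> 3 -> 0
        }
    }

That much is recoverable by staring at it. The Hungarian Algorithm part, you only find out by asking around.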


But it is good work. After all, it does work. Is that really how we want to leave it?

68% of all software projects fail. Modern ways of writing software seek to make it clearer to read and understand, to tease apart dependencies, to capture behavior through automated and repeatable methods, and to evolve not just our tools but our very mindsets. Treating them as "Silver Bullets" is to accept that what we have is good enough, and should continue. Design Patterns are most certainly not Silver Bullets. They shouldn't be regarded as the answer to every problem. That just leads to Golden Hammer. ;P

Wednesday, March 7, 2012

Big Bugs, small bugs

It's...(>_>)...(-_-)...not the small bugs and mistakes that we should be so nervous about finding. It's the big ones.

Consequently, when a problem is found the response should not be "PANIC/BLAME, THERE'S A PROBLEM" but "Hmph. Problem. How long will it take to fix? What does the fix entail? What was the cause of the issue? How can we learn to minimize the recurrence of this problem in the future? How can we at least minimize the impact of its issue?"


...at least for the small problems. When a big problem occurs that's a compound sum of small problems where nobody bothered to ask those questions, I think it's perfectly logical to hold people accountable. Massive failure deserves its recompense.

I work in software teams where the managers become very uncomfortable with the idea of "bugs" being found in their code. I am too. I want my code to be "bug-free." Clean, beautiful, elegant. A work of art on par with the Mona Lisa and the perfect features of Angelina Jolie/Jessica Alba in her time...

But that's really hard, with software. It solves big, complex problems. The bigger and more complex the problem you're dealing with, the harder it is to focus on doing the little things perfectly. There is many an abstract algebraist who may forget a sign or two while walking through a derivation, many a writer who may forget to use perfect punctuation while trying to espouse a greater point.

On one hand, I understand the frustration with this. As a student who's trying to learn calculus, you don't know enough about the material to tell the professor when he's wrong. You're just trying to keep up, and he should've done the little things right in the first place! As a reader of content, I judge a writer who writes things like "it was in there best interests to do that" or "In you're face!" If a writer is as intelligent as he or she thinks he or she is, he or she should've written the content of their piece correctly, in all it's glory, in the first place!

...and yes, that "it's" was intentional for humorous effect, in case your wondering. As was that "your". Gotcha. :P

On the other hand, the "it should've been done right!" argument can only go so far. Humans are not perfect. We will make mistakes, out of sloth, oversight, or just plain ignorance. The idea that we should always "dot our i's and cross our ts" is a good ideal, but difficult to do in practice.

Perhaps it's just harder for me. Some people might call me "sloppy." I argue there's a depth/breadth trade-off. The more varied or broad your concerns, the less time and energy you spend on the execution details of each of those concerns individually. Plus, the more things you try to do, the more opportunity there is for something to go wrong.

Some people may disagree, but this seems to mimic the realities of life.

Thursday, March 1, 2012

Inheritance and Composition: There and back again

I recently got an application request to provide a web interface for a mobile app. The web app would serve as a demonstration of the mobile app's features. Both apps connect to a cloud datastore that provides a REST API for content.


I've gotten to the point that, whenever I start a new app, I try to think pretty heavily about what could change. I could've hacked and slashed through it, but I wanted to take a few moments to follow the basic Gang of Four design principles.



  • Encapsulate what varies

  • Program to an interface, not an implementation


As cloud-based services are still new, and a relatively unstable market, my immediate thought was: what if we switch from one to another? Should the entire app have to be rewritten? I say No.


Consequently, I wanted to encapsulate the datastore variation. Ideally, my Domain Model could remain as free as possible from specific provider concerns. It should focus on just the application's functional interactions.


I decided to build the app quickly in a framework that I'm comfortable with and enjoy, Grails. Grails boasts an advanced application architecture that embraces the concepts of Fowler's Patterns of Enterprise Application Architecture, such as Service Layer (something Rails does not seem to have out of the box, though its "Helpers" are kind of similar), Data Mapper, Domain Model, Template View, Front Controller, and many other pieces that let a competent, enterprise-level senior developer focus on building a good application from solid software engineering principles without having to build everything from scratch.


My initial approach was to handle this encapsulation in the Service Layer. The controller should only know about a Facade DatastoreService, which internally delegates to the provider's API.



Seeking to follow the concept of Clean Boundaries from Uncle Bob Martin's Clean Code, I defined my own application-level interface for the datastore. I then defined a subinterface that extends that interface for the specific provider, augmenting it with methods to encapsulate the REST API calls. An abstract class implements the invariants of that interface and provides an extension point for the implementing service.
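Roughly, the shape is this (illustrative names and endpoint, not the real code):

    import java.util.Map;

    // Application-level boundary: the domain only ever sees this.
    interface Datastore {
        void save(String collection, Map<String, Object> record);
        Map<String, Object> find(String collection, String id);
    }

    // Provider-specific subinterface: adds what the REST API calls need.
    interface ParseDatastore extends Datastore {
        String restEndpointFor(String collection);
    }

    // The abstract class holds the invariants; the concrete service fills in
    // the provider-specific extension points.
    abstract class AbstractParseDatastore implements ParseDatastore {
        public String restEndpointFor(String collection) {
            return "https://api.example.com/1/classes/" + collection;   // assumed base URL
        }
        public void save(String collection, Map<String, Object> record) {
            post(restEndpointFor(collection), record);
        }
        protected abstract void post(String url, Map<String, Object> body);
    }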



At this point, I'm starting to get worried. I seem to be adding a lot of complexity to the application for a very basic desire. Especially when implementing further functionality required me to store the 3rd party library's unique identifier for my subsequent querying. As I tested, built, and refactored, I found my application as internally at odds about how to do this as I was.



On one hand I've used an inheritance hierarchy to store the 3rd party API's object id in my Domain Model. On the other, I have such a slim Facade over the 3rd party API in the service layer that it's just noise. I have two options:



  1. Remove the Datastore service and have the DemoController directly talk to the ParseService.

  2. Refactor the Domain Model to tease apart the 3rd party API id.


At first, I wasn't comfortable with option 2. It seemed conceptually strange. A Zapper IS A ParseEntity, and so is a ZapCard. Until I realized that what I'm saying, within the context of my application, is that a Zapper HAS A ParseIdentity. In effect, I'm saying that my Domain Model exists outside the context of the datastore, but has a representation within it.


This is conceptually interesting. We are used to thinking about the world in terms of how things are. I am a Person. I am also a Student. All Students are Persons, hence a Student IS A Person. But I'm not just a Student. I'm also an Entrepreneur, Engineer, Activist...How can I reconcile all of this without multiple inheritance?


Instead, what if I am a Person, who HAS A Student identity, HAS AN Engineer identity, etc etc. Then I can have a single canonical representation, and assume different roles in different contexts. When I have to activate pieces from a different context (such as drawing upon my Student.study() within an Engineer context), the process is simplified. Instead of having to cast myself into a different role, my fundamental representation (Person) is invariant, and calls for identity are delegated to the appropriate concept. Or, to flip this around: when I am a Student and need to access information that I have as a Person, I access it from my internal Person representation. Like a Russian nesting doll. Hence, prototypal inheritance built from composition.
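Translated back into the app, the idea looks roughly like this (a sketch with guessed-at fields, not the actual domain classes):

    // The domain class stays provider-agnostic and composes its datastore identity.
    class ParseIdentity {
        private final String objectId;          // the 3rd-party datastore's id

        ParseIdentity(String objectId) {
            this.objectId = objectId;
        }

        String objectId() {
            return objectId;
        }
    }

    class Zapper {
        private final String name;               // pure domain state
        private ParseIdentity datastoreIdentity;  // its role in the datastore context

        Zapper(String name) {
            this.name = name;
        }

        void rememberDatastoreIdentity(ParseIdentity identity) {
            this.datastoreIdentity = identity;
        }

        boolean isPersisted() {
            return datastoreIdentity != null;
        }

        String name() {
            return name;
        }
    }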

Wednesday, February 29, 2012

How many tests would you write for this?

Working through Michael Feathers' Working Effectively With Legacy Code, I find myself wondering how many tests I should write for this.



Naively, I only have a public method. I only need to write tests to validate that method's externally facing API, right?



That feels wrong, though. The private method is doing "interesting things." It's connecting to a web API to update a local cache of data. If the data that's expected is not there after the update, it's telling the API to create that data. Although I've split the work across methods to provide an easy API for the client view, the public method behaves differently under different internal conditions, so I should have tests that stress those different paths:



  • When both user and card exist in local cache (already tested)

  • When a user exists in local cache, but no card

  • When a card exists in local cache, but no user

  • When neither exists in local cache


Since I'm only testing the controller, I should mock the datastore service in each case. Simple enough, though it seems rather surprising that such a seemingly simple method should have so many execution paths. I suspect there is a refactoring hidden here that makes things a bit cleaner. But I can't find it right now.
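One of the four cases, sketched with Mockito and placeholder types (the real DemoController and DatastoreService obviously look different):

    import static org.mockito.Mockito.mock;
    import static org.mockito.Mockito.verify;
    import static org.mockito.Mockito.when;

    import org.junit.Test;

    public class DemoControllerTest {

        // Minimal placeholder types so the sketch stands alone; the real ones
        // live in the application.
        static class User { }
        static class Card { }

        interface DatastoreService {
            User findUser(String name);
            Card findCard(String name);
            void createCard(String name);
        }

        static class DemoController {
            private final DatastoreService datastore;
            DemoController(DatastoreService datastore) { this.datastore = datastore; }

            void show(String name) {
                if (datastore.findUser(name) != null && datastore.findCard(name) == null) {
                    datastore.createCard(name);
                }
            }
        }

        @Test
        public void createsTheCardWhenOnlyTheUserIsCached() {
            DatastoreService datastore = mock(DatastoreService.class);
            when(datastore.findUser("corey")).thenReturn(new User());
            when(datastore.findCard("corey")).thenReturn(null);   // cache miss

            new DemoController(datastore).show("corey");

            // The behavior that matters on this path: the missing card gets created.
            verify(datastore).createCard("corey");
        }
    }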


As I work through such scenarios, I'm starting to see why Uncle Bob Martin says Objects Should Do One Thing in Clean Code. It's easier to test small objects that do one thing with simple tests: did it do it? Did something within an expected set of "untowards" happen?


I'm also starting to see how judicious use of objects starts eliminating branching statements (using an Abstract Factory instead of a giant switch/case for instantiating objects, for example). As logic is pushed out to self-contained objects that encapsulate it, dynamically configurable behavior (via run-time Dependency Injection) lets you whittle down the number of things that happen in one place and put them under test in another. The number of unit tests for individual components stays small, and all you need is a few integration/functional tests to verify everything knits together correctly at runtime.
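A tiny illustration of that point, with hypothetical shape classes rather than anything from the app. The switch version keeps the branching in one place forever; the factory-style version (a simplification of a full Abstract Factory) turns the decision into data:

    import java.util.Map;
    import java.util.function.Supplier;

    interface Shape { double area(); }

    class Circle implements Shape { public double area() { return Math.PI; } }
    class Square implements Shape { public double area() { return 1.0; } }

    class Shapes {
        // The branching version.
        static Shape bySwitch(String kind) {
            switch (kind) {
                case "circle": return new Circle();
                case "square": return new Square();
                default: throw new IllegalArgumentException(kind);
            }
        }

        // The factory version: the mapping is data, not control flow, and each
        // shape can be registered and tested on its own.
        private static final Map<String, Supplier<Shape>> FACTORIES = Map.of(
                "circle", Circle::new,
                "square", Square::new);

        static Shape byFactory(String kind) {
            Supplier<Shape> factory = FACTORIES.get(kind);
            if (factory == null) throw new IllegalArgumentException(kind);
            return factory.get();
        }
    }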


It also makes me wonder about the necessity of visibility modifiers (public, private, protected). Essentially, at small enough levels of work, those things perhaps become unnecessary. If private methods should usually be refactored into distinct objects that encapsulate a small piece of work and expose a public API (e.g. DemoController's private populateFromRemoteData() should be the PopulationService's public populate() method), then it almost becomes unnecessary to have visibility modification.


Of course, dynamic programming languages have worked under this assumption forever. What if LISP had it right all along? The key is, you have to change your very style of writing code to truly see these benefits. If you write giant globs of functions (10+ lines, doing many things) in LISP, you're in the same place as you are with OOP. In which case, you need the ability to hide certain things, ignore certain things, just for overall system comprehension.


After all, one of the original purposes of David Parnas' treatise on Information Hiding was that it made software easier to understand, in the design sense. Encapsulation enables reasoning about a system because it abstracts away irrelevant details. In the original Parnas paper, modules are designed to contain cohesive pieces of functionality that expose an interface.


One can argue that visibility modifiers mean the imperative style took the wrong lesson from his observation. One can argue that classes act as "mini-modules" when they have private methods that are internal and public methods that expose an external interface. Instead, the private methods should be public methods of separate subcomponents within the module.


This doesn't take into account the "access control" piece of the puzzle, the idea that there's certain invariant information that you don't want clients replacing at runtime because it breaks your functionality. Yet the use of reflection and metaprogramming is only increasing because of the way they dramatically simplify programming. These tools typically blow right past access control. For example, Groovy secretly doesn't respect private. It's a convention that programmers don't use metaprogramming to modify privates.


Perhaps that really is a case of "you break it, you bought it?" But then what are the implications for software security models?