Sunday, August 12, 2012
The Parable of Little Jimmy
Meet Jimmy
Jimmy Goes to War
Save Jimmy!
No Problem…right?
Not So Fast…
The End of Jimmy
Who’s to Blame?
No
How It Might Have Been
But It’s Gotta Get Done!
Bottom Line
Software Architecture and the Parable of the Megapode
This is a story best told in Douglas Adams' delightful voice.
But, if you don't have the time or motivation to listen to a few minutes of adventure, here's the transcription.
"I have a well-deserved reputation for being something of a gadget freak. And I'm rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me ten seconds to do by hand. Time is valuable, and ten seconds worth of it is well worth the investment of a day's happy activity working out a way of saving it.
The bird we came across was called a megapode, and it has a very similar outlook on life. It looks a little like a lean, sprightly chicken. Though it has an advantage over chickens in that it can fly, if a little heavily, and is therefore better able to escape from dragons, which can only fly in fairy stories--and in some of the nightmares with which I was plagued while trying to sleep on Komodo.
The important thing is that the megapode has worked out a wonderful labour saving device for itself. The labour it wishes to save is the time-consuming activity of sitting on its nest all day, incubating its eggs, when it could be out and about doing things. I have to say at this point that we didn't actually come across the bird itself, though we thought we glimpsed one scuttling through the undergrowth. We did, however, come across its labour-saving device, which is something that is hard to miss.
It was a conical mound of thickly packed earth and rotting vegetation, about 6 feet high and 6 feet wide at its base. In fact, it was considerably higher than it appeared, because the mound would've been built over a hollow in the ground, which would itself have been about 3 feet deep.
I just spent a cheerful hour of my time writing a program on my computer that would tell me instantly what the volume of the mound was. It's a very neat and sexy program, with lots of popup menus and things, and the advantage of doing it the way I have is that on any future occasion on which I need to know the volume of a megapode nest, given its basic dimensions, my computer will tell me the answer in less than a second, which is a wonderful saving of time! The downside, I suppose, is that I cannot conceive of any future occasion that I am likely to need to know the volume of a megapode nest...but, no matter! The volume of this mound is a little over 9 cubic yards.
What the mound is, is an automatic incubator. The heat generated by the chemical reactions of the rotting vegetation keeps the eggs that are buried deep inside it warm. And not merely warm! By judicious additions or subtractions of material to the mound, the megapode is able to keep it at the precise temperature which the eggs require in order to incubate properly.
So, all the megapode has to do to incubate its eggs is merely to dig 3 cubic yards of earth out of the ground, fill the hole with 3 cubic yards of rotting vegetation, collect a further 6 cubic yards of vegetation, build it into a mound, and then continually monitor the heat it's producing and run about adding bits or taking bits away. And thus it saves itself all the bother of sitting on its eggs from time to time.
This cheered me up immensely."
The next time a big, hairy ball of mud lands on your task list, ask yourself..."Am I being a Megapode?"
Friday, July 13, 2012
Why is TDD awesome?
Because it gets you solving problems.
I've been reading a lot of papers and shifted my attention over to self-adaptive systems since the start of the year. I've become familiar with different researchers' thoughts on the subject, and the methodologies and tools they're trying to develop. They seem too disconnected from practice and too abstract to be practical. So I'm going to build my own.
That's been the plan. I've been doing a lot of writing, for my dissertation, on how I'm going to build it. How it's going to be different. Why it's useful and will be a game changer, etc etc.
Then I did a random Google Search today:
http://avalanche123.com/blog/2012/06/13/how-i-built-a-self-adaptive-system/
Son of a bitch. I've been looking at this problem since January. I want to make it the focus of my research. Earlier this year I collaborated with Fractal Innovations LLC to design a prototype...but I hadn't gotten a concrete deliverable out of it. Now Some Dude seems to be moving in the same direction.
OH NOES.
The dreaded graduate student nightmare: Invest a whole bunch of time and energy into a problem. Meet with your committee.
"Are you sure this is a problem? It looks like this problem is already solved..."
Not gonna let that happen.
So I read his post, ended up looking at Fuzzy Sets...and it looked kind of intimidating.
A Fuzzy Logic based Expert System?
That seems fine.
Building a framework to build Fuzzy Logic Based Expert Systems, rapidly, conveniently, and following best practice software engineering principles?
Seems hard, bro.
...What can I do?
But I decided to play with the problem. See what I could come up with. I tried to build a system that could process the first, seemingly simple statement:
IF brake temperature IS warm AND speed IS not very fast
THEN brake pressure IS slightly decreased.
Being a framework designer at heart, my mind immediately tried to be a True Computer Scientist.
The True Computer Scientist
You see, the True Computer Scientist's brain is much, much bigger than yours or mine. The True Computer Scientist is able to see the generalizations and abstractions beyond a given problem.
The True Computer Scientist doesn't worry about piddling, puny, simple logic flows about brake pressure and brake temperatures.
The True Computer Scientist analyzes the structure of the statements, derives a general form that applies in almost any context, and builds The System that manifests the heart and soul of complex decision making in an uncertain world.
The True Computer Scientist solves problems at a level so meta that you're thinking about thinking just to keep up, and you're still not quite there yet.
The True Computer Scientist uses recursion and abstract data types with strongly checked invariants to determine optimal solutions, seamlessly wielding matrix mathematics and monads while you're still trying to figure out what the fuck that means.
The True Computer Scientist is a God, Puny Mortal.
...and I am just a dumb chimp.
Could a thousand monkeys put in front of computers create Google?
But every time my mind comes upon a problem like this, it immediately tries to be the True Computer Scientist. My mind raced to develop a class for Rules, Antecedents, and Consequents. I thought about building an expression engine for evaluating nested rules...and the more I tried to solve these difficult problems analytically, in my head, ahead of time, the more scared I became. Woah. This is A Big Problem. I don't think I can do this.
"Slow down there, stupid monkey. Solve this problem. The framework can come later."
Just like that, I moved from Saving the World and Getting the Girl to Making It to the Next Save Point. The fear I'd felt subsided for a moment. I started solving the problem.
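The first test didn't need the framework at all. A rough sketch of that first step, in Java/JUnit with invented names and thresholds rather than the actual prototype code:

```java
import org.junit.Test;
import static org.junit.Assert.assertTrue;

// A concrete, framework-free starting point: solve the one rule first.
public class BrakeRuleTest {

    @Test
    public void warmBrakesAndNotVeryFastSpeedSlightlyDecreasePressure() {
        BrakeRule rule = new BrakeRule();
        // 80 degrees C reads as "warm", 40 km/h reads as "not very fast"
        double adjustment = rule.evaluate(80.0, 40.0);
        assertTrue("pressure should be slightly decreased",
                adjustment < 0 && adjustment > -0.2);
    }
}

// Just enough production code to make the bar green; the Rule/Antecedent/
// Consequent abstractions can be extracted later, once duplication shows up.
class BrakeRule {
    double evaluate(double brakeTempCelsius, double speedKmh) {
        boolean warm = brakeTempCelsius > 60 && brakeTempCelsius < 120;
        boolean notVeryFast = speedKmh < 60;
        return (warm && notVeryFast) ? -0.1 : 0.0;
    }
}
```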
An hour later, I had a working prototype that solved that problem. Plus, along the way, I'd derived the abstractions I need to build the framework.
How?
Simple. I've studied a lot of software design and architecture. Now I'm able to think in terms of patterns and principles, which means that, as I'm making my bar turn green, I also say:
"Ah, this is something that's likely to change. Instead of adding another property to this class, I'll add a Value Object that this Entity uses."
I write the domain logic in. When I see patterns in what I'm doing, I immediately refactor to a Strategy or extract interfaces. I end up representing First Class Entities like Sensors just in trying to keep things DRY. I flow from getting a reading from a Speedometer --> having an Analyst analyze a Sensor using a Strategy --> having an Agent act on a Perception using a Strategy --> having an Agent perform an Action using an Actuator with input from a Perceptor.
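To make that chain a little more concrete, here is the rough shape of those abstractions in Java. These interfaces are my sketch of what the prose describes, not the framework's actual code:

```java
// A rough sketch of the abstractions that fell out of the refactoring;
// every name here is illustrative.
interface Sensor<T> {
    T read();                      // e.g. a Speedometer returning km/h
}

interface Perception { }           // what the Analyst concluded from a reading

interface AnalysisStrategy<T> {
    Perception analyze(T reading); // pluggable: crisp thresholds, fuzzy sets, ...
}

interface Action { }

interface ActionStrategy {
    Action decide(Perception perception);
}

interface Actuator {
    void apply(Action action);     // e.g. a brake-pressure valve
}

// The Agent just wires the pieces together; every interesting decision lives
// in a Strategy, so each piece stays small enough to test in isolation.
class Agent<T> {
    private final Sensor<T> perceptor;
    private final AnalysisStrategy<T> analysis;
    private final ActionStrategy policy;
    private final Actuator actuator;

    Agent(Sensor<T> perceptor, AnalysisStrategy<T> analysis,
          ActionStrategy policy, Actuator actuator) {
        this.perceptor = perceptor;
        this.analysis = analysis;
        this.policy = policy;
        this.actuator = actuator;
    }

    void step() {
        Perception p = analysis.analyze(perceptor.read());
        actuator.apply(policy.decide(p));
    }
}
```

The point isn't these particular interfaces; it's that each one fell out of a refactoring step rather than an up-front design session.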
Pretty Cool
TDD enables framework design. Rather than trying to come up with the optimal design up front, design emerges from the components involved. Even better, it lets you start solving small problems quickly, adding value as you scale to the intended result. I know that I've become a much better software developer as a result of focusing my technique and being willing to try a different workflow. I regularly meet programmers who are dubious of changing their workflow, and many times default to denouncing innovations in our space as "another Silver Bullet." I highly encourage them to give this material a try.
It also gets better the more you involve and integrate. Reading Evans' Domain Driven Design or Fowler's Patterns of Enterprise Application Architecture is fine in and of itself...but when you start building software that integrates those concepts, and add elements such as Kent Beck's Test Driven Development: By Example and Robert Martin's Clean Code, your very coding style and philosophy improve tremendously. The whole is greater than the sum of the parts.
I'll be illuminating just how much perspectives can change in upcoming posts:
The Android SDK: Yet Another Successful Big Ball of Mud
HTML5 is an Abject and Miserable Failure of Responsibility and Design.
Stay tuned.
Monday, April 16, 2012
TDD and Design Patterns are Just Silver Bullets, *IX doesn't need 'em! Also, Bronze > Iron
Hey List,
I'm working with a simulations company that wants to analyze, document, and convert some VB Legacy code to C#. The documentation process for the original software is already well underway. (I've been working on it for the past week and some.) At first I was under the impression that they wanted me to update the original VB 6 code to VB 10 - for readability and to optimize the current version of the software - then convert everything to C# for the new platform. However, after analyzing the modules I've been given so far (documenting most of them) the software developers and I have come to varying conclusions regarding the quick and effective completion of both the documentation and conversion processes before our deadline. The developers aren't sure how to go about this on account of the fact that they've only been updating the original source code - never analyzing or modifying it completely. They've recognized the need for a complete overhaul in order to deploy the product on more platforms for some time and have just begun. We seem to be leaning towards analysis, conversion, verification, then documentation. Our deadline for project completion is approximately six months away.
Planning aside, one particular contingency we're encountering is the variation in data [and subsequently, file] outputs during code conversion for the same algorithms. The simulation will not function correctly on the new platform if this persists.
The question I have for those of you with experience in this process is, how can the developers and I go about this in a way that allows us to verify each step without spending too much time on documentation and analysis of the original source code? Any suggestions would be awesome. We're likely to decide within the next 48 hours and go from there.
My response
"We seem to be leaning towards analysis, conversion, verification, then documentation. Our deadline for project completion is approximately six months away."

The process you're describing is probably sub-optimal. Chances are high you're going to use a Tiger Team to re-architect a Stovepipe System. This is fairly common in green field development, and chances are that after about 8 months it'll look as hairy as the current system, just with newer syntax.
I'd advise a more disciplined approach.
Start writing learning tests for the existing system, using those as documentation and specification baselines for the new system.
Read Working Effectively With Legacy Code and TDD: By Example concurrently. Try using SpecFlow to write the specs. Use source control and a continuous integration tool like Jenkins to make sure you're not breaking existing builds as you refactor pieces of the system.
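A learning test is just an ordinary unit test that pins down whatever the legacy code does today, so behavioural drift during the port shows up as a red bar. A minimal sketch, with a hypothetical routine standing in for the real simulation code:

```java
import org.junit.Test;
import static org.junit.Assert.assertEquals;

// A "learning test": it asserts whatever the legacy routine actually returns
// today, so any behavioural change during the port shows up immediately.
// LegacyTrajectoryModel and its values are placeholders, not the real code.
public class TrajectoryCharacterizationTest {

    @Test
    public void recordsCurrentOutputOfLegacyAlgorithm() {
        LegacyTrajectoryModel model = new LegacyTrajectoryModel();

        double result = model.computeApogee(/* launchAngle */ 45.0,
                                            /* velocity    */ 120.0);

        // The expected value is simply what the existing implementation
        // produces for these inputs, captured once and treated as the spec.
        assertEquals(366.97, result, 0.01);
    }
}

// Placeholder standing in for the routine being characterized; in reality this
// would be the ported C# (or wrapped VB6) implementation under test.
class LegacyTrajectoryModel {
    double computeApogee(double launchAngleDegrees, double velocity) {
        double radians = Math.toRadians(launchAngleDegrees);
        return Math.pow(velocity * Math.sin(radians), 2) / (2 * 9.81);
    }
}
```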
Since you're seeking platform independence, at some point you'll need to re-architect the system to not make as many implicit assumptions about the model. You'll want to push framework specific interfaces (file I/O, DBs, web services, etc) into your own internally defined APIs. These will show you the way:
- Clean Code
- Domain Driven Design
- Head First Design Patterns
You can probably encapsulate the algorithms with a combination of Strategy and Template Method, using Abstract Factory to put related pieces together.
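In skeletal form, that combination might look like the following; the names are made up, since I don't know the actual algorithms involved:

```java
// A rough shape for "Strategy + Template Method + Abstract Factory", using
// invented names.

// Template Method: the invariant skeleton of a simulation step lives here...
abstract class SimulationRun {
    final void execute() {
        loadInputs();
        integrate();      // varies per algorithm
        writeOutputs();   // varies per target platform
    }
    void loadInputs() { /* shared parsing/validation */ }
    protected abstract void integrate();
    protected abstract void writeOutputs();
}

// ...Strategy isolates the numerical core so the VB and C# versions can be
// compared output-for-output during the port...
interface Integrator {
    double[] step(double[] state, double dt);
}

// ...and an Abstract Factory assembles a matching family of pieces, so the
// calling code never hard-codes which platform/algorithm combination it got.
interface SimulationFactory {
    SimulationRun createRun();
    Integrator createIntegrator();
}
```

The payoff for the port is that each Integrator implementation can be run against the same inputs and compared directly, which is exactly where the "variation in data outputs" problem lives.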
tl;dr: Don't use the Waterfall-style process you're describing. You need to start manifesting the existing behavior of the system in executable code that can serve as the baseline for the new system. This is a well-studied problem. Start reading the literature about how to evolve a Big Ball of Mud, and avoid Stovepipe or Wolf Ticket solutions.
Helpful feedback from a stubborn and argumentative friend:
I know I'm wasting my time and I would much rather have an in person discussion about this sometime but oh well. I might/will come across as rude but that's just the way I am, take no offense, I'm blunt and I appreciate bluntness myself. I apologize in advance.
I think you're a little crazy about your design patterns and software development process philosophy.
Bear with me for a minute. Try to imagine someone like me who has never read any design pattern book or resource actually describing them etc. reading your email (and not just this one but several you've sent to the list in the past . . . this one was just particularly .. . noteworthy)
Even better, imagine someone who has never even heard of design patterns and isn't a software developer/programmer at all. No background in it whatsoever.
Got that state of mind?
"Tiger Team", "Stovepipe System", "Green field development" "Strategy and Template System", "Abstract Factory" "Big ball of Mud", "Wolf Ticket"?
Are we still talking about coding or is this some sort of weird elementary school word game?
I hear stuff like this and I am agog.
Let me give you a few links of my own that I found in about 2 seconds of googling.
From Coding Horror:
"It's certainly worthwhile for every programmer to read Design Patterns at least once, if only to learn the shared vocabulary of common patterns. But I have two specific issues with the book:
Design patterns are a form of complexity. As with all complexity, I'd rather see developers focus on simpler solutions before going straight to a complex recipe of design patterns.
If you find yourself frequently writing a bunch of boilerplate design pattern code to deal with a "recurring design problem", that's not good engineering-- it's a sign that your language is fundamentally broken.
In his presentation "Design Patterns" Aren't, Mark Dominus says the "Design Patterns" solution is to turn the programmer into a fancy macro processor. I don't want to put words in Mark's mouth, but I think he agrees with at least one of my criticisms.
"
I don't always agree with Atwood/Coding Horror and in fact sometimes I disagree completely, but in this case I agree wholeheartedly. (Another post about design patterns that I don't agree on is that I don't think they're missing language features as Paul Graham thinks.)
Design Patterns are not a Silver Bullet
I could go on but I'll just add a quote/paraphrase from Bjarne Stroustrup
Inheritance is one of the most overused and misused C++ features.
Since even the design pattern book itself says they're about the arrangement of objects and classes, if OO is not the best thing for a problem design patterns are automatically not directly relevant.
Object Oriented programming (with a capitals OO) is not always the answer and in fact I think the Java extreme everything is an object etc. is actually extremely bad and people try to do the same type of design in C++ and it's terrible.
I'm not saying I've never used (inadvertently and far from the way you would have implemented it) design patterns of some kind. I will say it's probably far less since I am a C/C++ programmer and I tend to work on lower level non-gui/non-user interface/database things like 3D graphics and low level IO etc. I don't think I've ever used inheritance in my own (non school required) code. I tend to use concrete type classes (ie vector and matrix types with overloaded operators etc.) and composition occasionally. I think it's better to have a few specialized classes than to try to generalize or create an inheritance tree that obfuscates the purpose and bloats the code/makes it slower etc. Also I have no problem with global variables/data, especially in my personal/prototype programs. Don't see any problem in general either as long as the main.cpp file isn't cluttered.
Actually I've yet to use inheritance in code written at work (Mars Space Flight Facility on campus where I worked on a large old C project ;), then Intel where I wrote some C testing code and mostly set up VMs crap and now at Celestech. I admit that the project I'm working on which is Internal R&D and I'm the sole developer already had some inheritance in it because it's a Qt project and Gui programming is one place where even I agree some inheritance is good/necessary . . . but again remember everything in a modern language could be written in C. Actually I think Linus has a point and C is great because it's simple and small and easy to understand/minimize complexity but he goes overboard and C++ is a great language that he just was exposed to before it was standardized/implemented fully.
If I were to argue some design/development methodology in contrast to yours it would be KISS/YAGNI (just learned the YAGNI acronym today but I've always believed it about over design). Also make it work make it right make it fast. where I define "right" and I iterate in an extreme/agile fashion more or less if I need to.
Finally, this:
" We seem to be leaning towards analysis, conversion, verification, then documentation. Our deadline for project completion is approximately six months away. "
You say this is suboptimal and waterfally. I couldn't disagree more. It may use waterfally sounding terms but what they've described is the most basic/straightforward/easy way to port/rewrite code.
read and understand it, rewrite it on the new platform/language, make sure it gets the same output and then document it. How is that not the most fundamental way to do it?
Also nowhere in the email does it say the original code is bad (besides the inherent badness of VB maybe) or badly designed etc.. Legacy doesn't necessarily mean bad/evil/badly designed code. The linux kernel is 20 years old and parts of it have hardly changed or haven't at all since the beginning. There are other examples.
the process you described, in my opinion, adds a ton of unnecessary work and complication to a simple process. The only thing you've mentioned that I would agree with for some projects is Jenkins/Hudson. But that is only useful for large, ongoing projects, not something like this, a simple rewrite/port.
Waterfall is CSE 360 where you waste over half the semester designing and documenting and creating sequence diagrams and UML diagrams and crap before you've written a line of code so they're all completely made up and useless..
Anyway, again I apologize and don't take offense. I know you're a good developer just very different with a very different coding experience.
To deconstruct every one of the arguments would be long and tedious, as it basically boils down to
- I don't like your heavy use of professional jargon/terminology.
- Because I don't understand the terminology, you must be wrong
We can build cars out of matchsticks and make them run. Does that mean all cars should be made of matchsticks?
We can build bridges with thin metal plates. Does that mean we should build all bridges that way?
- OOP is not that great. I write C code every day, it's more performant and better and awesome.
Anything that can be written in a higher level language can be written in assembly. So go! Load words into registers. Go twiddle your bits because you'll be faster and your code will have simpler pieces. After all, who the hell understands pointers? Just load memory addresses into registers!
While the GoF book and much of the attendant literature has revolved around OO, it's naive and foolish to dismiss Patterns as a purely OO concept. Functional Programming is making great strides in uncovering Monads, such as the State monad and the IO monad, that serve similar purposes: the solution to a problem in context. Agent-Oriented artificial intelligence systems [this][that] are drawing great inspiration from patterns. All of science is converging on the idea that patterns are fundamental. Analysis Patterns, Implementation Patterns, Compiler Patterns--all of these are applicable whether the program is written in an OO, procedural, or functional style. Evans talks about this in Domain Driven Design as well.
C++ was perfectly fine with function pointers and multiple inheritance. Why did C++0x add lambdas? Because better tools for the job allow us to do our job more efficiently. That's the only reason to ever add to a language, because the increased expressivity and conciseness improve clarity. Rejecting the use of a shared vocabulary because it "adds cognitive overhead" is dubious.
- Linux doesn't need all this junk. It's simple and awesome.
I recently got into an interesting argument about quality code and its manifestations in various paradigms with a guy who essentially argued that Design Patterns and TDD are too complex, that the simplest thing is to rewrite a system from scratch and document it thoroughly, and that Linux was the bee's knees in quality and proof you didn't need things like tests to write good code. I'd like to reference your wireless card story and the post you found by Linus Torvalds. Could you try finding that again and posting it on my wall?

Found it, I'll post it. I'd argue that Linux is proof of a different concept, which gives it an advantage over how TDD is used in the real world: "given enough eyeballs, all bugs are shallow." Linux is remarkably stable without automated tests, but this is because of the massive number of people who run test releases and the huge number of people looking at bug reports. With that user and dev base, any problem that's found will probably have an obvious solution to somebody. Tests would still help from time to time, but Linux can squeak by without them in ways that software developed by a small team never could. One could ask, which is more complicated: design patterns and TDD, or building a dev base hundreds of thousands strong and a user base millions strong that's willing to accept major bugs?
I really think this marvelous piece from his post says it all.
Linus Torvalds | 12 Jan 06:20
Re: brcm80211 breakage..
On Wed, Jan 11, 2012 at 8:15 PM, Larry Finger wrote:
>
> I see no difference in the core revisions, etc. to explain why mine should
> work, and yours fail.
Maybe your BIOS firmware sets things up, and the Apple Macbook Air
doesn't? And the driver used to initialize things sufficiently, and
the changes have broken that?
Apple is famous for being contrary. They tend to wire things up oddly,
they don't initialize things in the BIOS (they don't have a BIOS at
all, they use EFI, but even there they use their own abortion of an
EFI rather than what everybody else does), yadda yadda.
But the real point is: it used to work, and now it doesn't. This needs
to get fixed, or it will get reverted.
Linus
I'm not trying to troll Linux or be a jackass, but every time a neckbeard tells me how awesome Linux is and how it doesn't need all these "Silver Bullets", I wanna refer them back to this thread. If the patch committer had had an automated regression suite that he could've run before committing, and specific pieces like the MacBook Air wireless driver initialization had been mocked out such that it could've red-barred, then the Inventor of Linux's day wouldn't have been ruined by something like this when he OBVIOUSLY has better things to do.
This speaks to a larger, sociological point. I replied to my friend as follows.
I'd argue that Linux is a byproduct of its time. People who decry the advancement of programming knowledge as "old wine in new bottles" and Silver Bullets don't understand that old systems are like Zombies that eat away at our brainpower.
Unix could've been written in LISP, but the hardware at the time wasn't performant enough to handle it. In the same way, we could build our cars out of carbon nanotubes, except it's too expensive. Different technologies are appropriate at different times. Iron existed in the Bronze Age, but it was too expensive and not readily available enough to build weapons, armor, and pottery out of.
So some very clever hackers used the best tools they had at the time, built languages like B and C on the shoulders of earlier ones, and made tools out of them. These tools got agglomerated, and eventually became an operating system. It works, and it's good. But "it works" does not necessarily imply optimal.
Linux was built off of Unix because the most popular OSes when Torvalds wrote it were Windows and Macintosh, which had well-known security and stability issues. Unix had a vibrant but small user base in academia, the military, etc...Torvalds made it far more widely available.
Social attitudes change with time. Manifest Destiny was a popular American mantra in the 1800s. In modern times, if new land were found and colonized even though indigenous tribes lived on it, it is unlikely that the ideas of "enlightening the Noble Savage" or "driving the wild animals off the land" would be as socially acceptable.
Ways of doing things change as well. Henry Ford went very far with the assembly line. The American car companies were very happy with their production model all through WWII, the 1960s, the 1970s...and their lunch got eaten in the 1980s, '90s, and '00s. The Japanese improved on the process models with Lean Manufacturing, Six Sigma, and a relentless focus on eliminating waste and improving efficiency.
One can argue that TDD is not new. REPLs have been around for decades. But there's a difference between writing a program by typing a little bit into a REPL until your code works, then throwing your micro-experiments/code doodles away and using the working code as the finished product, versus storing those "doodlings" as executable tests that assert behavior--which can act as specifications, check boundary conditions, and even influence the design process.
A program, at the end, is just the solved equation. The regression test suite and commit log history show your work. They store the Tribal Memory that went into your code, as code. It's really hard to overstate how important that is.
Why does your teacher at school not accept assignments that don't show their work? Because the process for arriving at a solution is as important as the solution! We're not talking about "daily standups" and "ScrumMasters" or anything here. We're talking about the tools you used for the job, and the way you applied those tools to solve the problem.
Design Patterns have always existed. Identification and codification of them is merely seeking to create a language to express complex ideas more succinctly. A lot of people believe they're "unnecessary complexity", and make arguments to keep it simple. Ask them:
Do they believe in loops? You don't need a loop structure. You could just write the same code 10 times.
Do they believe in methods? You don't need methods. You could just write it all in int main().
Do they believe in classes? You don't need classes. You could just write it all in one file.
Do they believe in modules? You don't need modules. You could just put all your classes in one place.
Do they believe in inheritance and polymorphism? You don't need those words. You could just say "you write a class that extends another class, and it gets all of the properties and methods of the class it extends." or "you can treat multiple different classes as if they're the same."
So if they believe in all of these things, what's so hard about saying Template Method as opposed to "I write a class where I can define the outline of an algorithm and replace specific operations dynamically by extending it and providing a new implementation"?
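For the record, that mouthful is all Template Method is. A generic sketch, not tied to any particular codebase:

```java
// Template Method: the abstract class defines the outline of the algorithm;
// subclasses swap in specific operations by overriding the hooks.
abstract class ReportGenerator {
    // The invariant outline -- final so the sequence itself can't be changed.
    final String generate() {
        String data = fetchData();
        String body = format(data);
        return addHeader(body);
    }

    protected abstract String fetchData();   // operations replaced by subclasses
    protected abstract String format(String data);

    private String addHeader(String body) {  // shared, invariant step
        return "REPORT\n" + body;
    }
}

class CsvReportGenerator extends ReportGenerator {
    @Override protected String fetchData() { return "a,b,c"; }
    @Override protected String format(String data) { return data.replace(',', ';'); }
}
```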
Your point is well taken. Linux "works" because it has a wide user base and leverages Amazon's Mechanical Turk model or Google PigeonRank.
But is "working" good enough? Or does the community actively accept stagnation if it doesn't seek to incorporate new process models into its ecosystem? It didn't work out that well for American auto. Why should it work out well for Linux?
A physics research faculty member recently told me that he had a PhD student spend 6 months trying to add a new feature to a piece of code for the student's dissertation. The student ultimately failed. The feature amounted to augmenting a method in a deep, nested inheritance chain. We can write better code than that.
I've heard similar stories of researchers publishing papers in leading scientific journals based on simulations that crash 30% of the time. The published results advancing the frontiers of science could be repeatable, or they could be a random bug.
Federal calls for utilizing 5 billion compute cores to sequence genes and fight cancer often allocate hundreds of hours to programs that may crash halfway through on array-indexing issues, or thrash their way through their allotted time.
But maybe it's not a problem. Maybe this really is self-evident and we don't need such "Silver Bullets."
Is it clear to you that PermutationEstimator.cpp's evaluate() method uses an array to represent the traversal of a Graph data structure, where each node is the array index and the next node to go to is the value? That evaluate() is using the Hungarian Algorithm to estimate the lowest cost permutation?
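If that description takes a couple of readings, that's rather the point. For reference, the array-as-graph representation it's describing looks roughly like this (my own Java sketch of the idea, not the actual PermutationEstimator code):

```java
// The "array as graph traversal" trick in miniature:
// index = current node, value at that index = the next node to visit.
public class SuccessorArrayDemo {
    public static void main(String[] args) {
        // next[i] is the node that follows node i in the permutation/cycle.
        int[] next = {2, 0, 3, 1};   // encodes the cycle 0 -> 2 -> 3 -> 1 -> 0

        int node = 0;
        StringBuilder path = new StringBuilder("0");
        do {
            node = next[node];
            path.append(" -> ").append(node);
        } while (node != 0);

        System.out.println(path);    // prints: 0 -> 2 -> 3 -> 1 -> 0
    }
}
```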
But it is good work. After all, it does work. Is "it works" really where we want to leave it?
68% of all software projects fail. Dismissing as "Silver Bullets" the modern ways of writing software that seek to make it clearer to read and understand, tease apart dependencies, capture its behavior through automated and repeatable methods, and evolve not just our tools but our very mindsets, is to accept that the status quo is good enough and should continue. Design Patterns are most certainly not Silver Bullets. They shouldn't be regarded as the answer to every problem. That just leads to Golden Hammer. ;P
Wednesday, March 7, 2012
Big Bugs, small bugs
Consequently, when a problem is found the response should not be "PANIC/BLAME, THERE'S A PROBLEM" but "Hmph. Problem. How long will it take to fix? What does the fix entail? What was the cause of the issue? How can we learn to minimize the recurrence of this problem in the future? How can we at least minimize its impact?"
...at least for the small problems. When a big problem occurs that's a compound sum of small problems where nobody bothered to ask those questions, I think it's perfectly logical to hold people accountable. Massive failure deserves its recompense.
I work in software teams where the managers become very uncomfortable with the idea of "bugs" being found in their code. I am too. I want my code to be "bug-free." Clean, beautiful, elegant. A work of art on par with the Mona Lisa and the perfect features of Angelina Jolie/Jessica Alba in her time...
But that's really hard, with software. It solves big, complex problems. The bigger and more complex the problem you're dealing with, the harder it is to focus on doing the little things perfectly. Many an abstract algebraist may forget a sign or two while walking through a derivation; many a writer may slip on punctuation while trying to espouse a greater point.
On one hand, I understand the frustration with this. As a student who's trying to learn calculus, you don't know enough about the material to tell the professor when he's wrong. You're just trying to keep up, and he should've done the little things right in the first place! As a reader of content, I judge a writer who writes things like "it was in there best interests to do that" or "In you're face!" If a writer is as intelligent as he or she thinks he or she is, he or she should've written the content of their piece correctly, in all it's glory, in the first place!
...and yes, that "it's" was intentional for humorous effect, in case your wondering. As was that "your". Gotcha. :P
On the other hand, the "it should've been done right!" argument can only go so far. Humans are not perfect. We will make mistakes, out of sloth, oversight, or just plain ignorance. The idea that we should always "dot our i's and cross our t's" is a good ideal, but difficult to live up to in practice.
Perhaps it's just harder for me. Some people might call me "sloppy." I argue there's a depth/breadth trade-off. The more varied or broad your concerns, the less time and energy you spend on the execution details of each of those concerns individually. Plus, the more things you try to do, the more opportunity there is for something to go wrong.
Some people may disagree, but this seems to mimic the realities of life.
Thursday, March 1, 2012
Inheritance and Composition: There and back again
I recently got a request to provide a web interface for a mobile app. It would serve as a demonstration of the mobile app's features. Both apps connect to a cloud datastore that provides a REST API for content.
I've gotten to the point that, whenever I start a new app, I try to think pretty heavily about what could change. I could've hacked and slashed through it, but I wanted to take a few moments to follow the basic Gang of Four design principles.
- Encapsulate what varies
- Program to an interface, not an implementation
As cloud-based services are still a new and relatively unstable market, my immediate thought was: what if we switch from one provider to another? Should the entire app need to be rewritten? I say no.
Consequently, I wanted to encapsulate the choice of datastore. Ideally, my Domain Model could remain as free as possible from provider-specific concerns. It should focus on just the application's own interactions.
I decided to build the app quickly in a framework that I'm comfortable with and enjoy: Grails. Grails boasts an advanced application architecture built around the concepts of Fowler's Patterns of Enterprise Application Architecture: a Service Layer (something Rails does not seem to have out of the box, though its "Helpers" are kind of similar), Data Mapper, Domain Model, Template View, Front Controller, and many other pieces. These let a competent, enterprise-level senior developer comfortably focus on building a good application from solid software engineering principles without having to build everything from scratch.
My initial approach was to handle this encapsulation in the Service Layer. The controller should only know about a Facade, DatastoreService, which internally delegates to the provider API.
Seeking to follow the concept of Clean Boundaries from Uncle Bob Martin's Clean Code, I defined my own application-level interface for the datastore. I then defined a sub-interface that extends it for the specific provider, augmenting it with methods that encapsulate the REST API calls. An abstract class implements the invariants of that interface and provides an extension point for the implementing service.
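Roughly, the layering looked like this. (The sketch below is in Java rather than Groovy, and everything except the DatastoreService name is an approximation of shape, not the real classes.)

```java
// Application-level boundary: the domain only ever sees this.
interface DatastoreService {
    Object save(Object entity);
    Object findById(String id);
}

// Provider-specific sub-interface: adds what the REST API needs.
interface ParseDatastoreService extends DatastoreService {
    String restEndpointFor(Class<?> entityType);
}

// Invariants shared by any provider implementation live here; the concrete
// service only fills in the HTTP details.
abstract class AbstractRestDatastoreService implements ParseDatastoreService {
    @Override
    public Object save(Object entity) {
        String endpoint = restEndpointFor(entity.getClass());
        return post(endpoint, entity);          // invariant workflow
    }

    @Override
    public Object findById(String id) {
        return get(id);
    }

    protected abstract Object post(String endpoint, Object entity);
    protected abstract Object get(String id);
}
```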
At this point, I'm starting to get worried. I seem to be adding a lot of complexity to the application for a very basic desire. Especially when further implementation required me to store the 3rd party library's unique identifier for subsequent querying. As I tested, built, and refactored, I found my application as internally at odds about how to do this as I was.
On one hand, I've used an inheritance hierarchy to store the 3rd party API's object id in my Domain Model. On the other, I have such a slim Facade over the 3rd party API in the service layer that it's just noise. I have two options:
- Remove the DatastoreService and have the DemoController talk directly to the ParseService.
- Refactor the Domain Model to tease apart the 3rd party API id.
At first, I wasn't comfortable with option 2. It seemed conceptually strange. A Zapper IS A ParseEntity, and so is a ZapCard. Until I realized that what I'm saying, within the context of my application, is that a Zapper HAS A ParseIdentity. In effect, I'm saying that my Domain Model exists outside the context of the datastore, but has a representation within it.
This is conceptually interesting. We are used to thinking about the world in terms of how things are. I am a Person. I am also a Student. All Students are Persons, hence a Student IS A Person. But I'm not just a Student. I'm also an Entrepreneur, Engineer, Activist...How can I reconcile all of this without multiple inheritance?
Instead, what if I am a Person who HAS A Student identity, HAS AN Engineer identity, etc.? Then I can have a single canonical representation, and assume different roles in different contexts. When I have to activate pieces from a different context (such as drawing upon my Student.study() within an Engineer context), the process is simplified. Instead of having to cast myself into a different role, my fundamental representation is invariant (Person) and calls for identity are delegated to the appropriate concept. Or, to flip this in reverse: when, as a Student, I need to access information that I have as a Person, I access it from my internal Person representation. Like a Russian nesting doll. Hence, prototypal inheritance built from composition.
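A sketch of that "single canonical representation with composed identities" idea in Java (the Person/Student/Engineer names mirror the prose; the method names are invented):

```java
// The canonical representation holds optional role identities by composition;
// calls for a given identity are delegated to the appropriate concept.
class Person {
    private final String name;
    private StudentRole student;
    private EngineerRole engineer;

    Person(String name) { this.name = name; }
    String getName() { return name; }

    void enroll(StudentRole role)  { this.student = role; }
    void employ(EngineerRole role) { this.engineer = role; }

    StudentRole asStudent()   { return student; }
    EngineerRole asEngineer() { return engineer; }
}

class StudentRole {
    private final Person self;        // the role can reach back to the Person
    StudentRole(Person self) { this.self = self; }
    void study(String topic) {
        System.out.println(self.getName() + " studies " + topic);
    }
}

class EngineerRole {
    private final Person self;
    EngineerRole(Person self) { this.self = self; }
    void design(String system) {
        // drawing on Student.study() from within an Engineer context
        self.asStudent().study("prior art for " + system);
        System.out.println(self.getName() + " designs " + system);
    }
}
```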
Wednesday, February 29, 2012
How many tests would you write for this?
Working through Michael Feathers' Working Effectively With Legacy Code, I find myself wondering how many tests I should write for this.
Naively, I only have a public method. I only need to write tests to validate that method's externally facing API, right?
That feels wrong, though. The private method is doing "interesting things." It's connecting to a web API to update a local cache of data. If the data that's expected is not there after the update, it's telling the API to create that data. Although I've split the work across methods to provide an easy API for the client view, the public method behaves differently under different internal conditions, so I should have tests that stress those different paths:
- When both user and card exist in local cache (already tested)
- When a user exists in local cache, but no card
- When a card exists in local cache, but no user
- When neither exists in local cache
Since I'm only testing the controller, I should mock the datastore service in each case. Simple enough, though it seems rather surprising that such a seemingly simple method should have so many execution paths. I suspect there is a refactoring hidden here that makes things a bit cleaner, but I can't find it right now.
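For the "user cached, card missing" path, such a test might look roughly like this; it's sketched with JUnit and Mockito in Java, with invented names, rather than the Grails/Spock code in the actual app:

```java
import org.junit.Test;
import static org.mockito.Mockito.*;

// Path 2 from the list above: the user is in the local cache, the card is not,
// so the controller should ask the datastore to create the card. Every name
// here is invented for illustration.
interface DatastoreService {
    User findUser(String name);
    Card findCard(String userName);
    Card createCard(String userName);
}

class User { final String name; User(String name) { this.name = name; } }
class Card { }

class DemoController {
    private final DatastoreService datastore;
    DemoController(DatastoreService datastore) { this.datastore = datastore; }

    void show(String userName) {
        if (datastore.findUser(userName) == null) {
            // the user-missing paths would be handled here
        }
        if (datastore.findCard(userName) == null) {
            datastore.createCard(userName);   // the behaviour under test
        }
    }
}

public class DemoControllerPathTest {
    @Test
    public void createsCardWhenUserIsCachedButCardIsMissing() {
        DatastoreService datastore = mock(DatastoreService.class);
        when(datastore.findUser("alice")).thenReturn(new User("alice"));
        when(datastore.findCard("alice")).thenReturn(null);

        new DemoController(datastore).show("alice");

        verify(datastore).createCard("alice");
    }
}
```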
As I work through such scenarios, I'm starting to see why Uncle Bob Martin says Objects Should Do One Thing in Clean Code. It's easier to test small objects that do one thing with simple tests: did it do it? Did something within an expected set of "untoward" outcomes happen?
I'm also starting to see how judicious use of objects starts eliminating branching statements (using an Abstract Factory instead of a giant switch/case for instantiating objects, for example). As logic is pushed out to self-contained objects that encapsulate it, dynamically configurable behavior (via run-time Dependency Injection) lets you carve apart the things that happen in one place and put them under test in another. The number of unit tests for individual components becomes small, and all you need is a few integration/functional tests to verify everything knits together correctly at runtime.
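The switch-to-factory move, in miniature (Shape and the factory names are illustrative only; this is the simplest single-product form, and the same idea scales up to a full Abstract Factory when whole families of related objects are involved):

```java
interface Shape { double area(); }

class Circle implements Shape {
    private final double r;
    Circle(double r) { this.r = r; }
    public double area() { return Math.PI * r * r; }
}

class Square implements Shape {
    private final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

// Before: every call site branches on a type code.
//   switch (kind) { case CIRCLE: ...; case SQUARE: ...; }
// After: each factory encapsulates one choice; the branching happens once,
// when the factory is chosen (often via dependency injection).
interface ShapeFactory { Shape create(double size); }

class CircleFactory implements ShapeFactory {
    public Shape create(double size) { return new Circle(size); }
}

class SquareFactory implements ShapeFactory {
    public Shape create(double size) { return new Square(size); }
}
```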
It also makes me wonder about the necessity of visibility modifiers (public, private, protected). Essentially, at small enough levels of work, those things perhaps become unnecessary. If private methods should usually be refactored into distinct objects that encapsulate a small piece of work and expose a public API (e.g. DemoController's private populateFromRemoteData() should be the PopulationService's public populate() method), then it almost becomes unnecessary to have visibility modifiers at all.
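Sketched out, that refactoring is just this (the class names follow the prose; the bodies are stand-ins):

```java
// Before: the interesting work hides behind a private method, so it can only
// be tested indirectly through the controller's public API.
class DemoControllerBefore {
    public void show(String userName) {
        populateFromRemoteData(userName);   // private, hard to test alone
        // ... render the view ...
    }
    private void populateFromRemoteData(String userName) {
        // talk to the web API, update the local cache, create missing data
    }
}

// After: the same work is the public API of a small, separately testable object.
class PopulationService {
    public void populate(String userName) {
        // talk to the web API, update the local cache, create missing data
    }
}

class DemoControllerAfter {
    private final PopulationService population;
    DemoControllerAfter(PopulationService population) { this.population = population; }

    public void show(String userName) {
        population.populate(userName);
        // ... render the view ...
    }
}
```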
Of course, dynamic programming languages have worked under this assumption forever. What if LISP had it right all along? The key is, you have to change your very style of writing code to truly see these benefits. If you write giant globs of functions (10+ lines, doing many things) in LISP, you're in the same place as you are with OOP. In that case, you need the ability to hide certain things and ignore certain things, just for overall system comprehension.
After all, one of the original points of David Parnas' treatise on Information Hiding was that it makes software easier to understand, in the design sense. Encapsulation enables reasoning about a system because it abstracts away irrelevant details. In the original Parnas paper, modules are designed to contain cohesive pieces of functionality that expose an interface.
One can argue that visibility modifiers are a sign the imperative style drew the wrong lesson from that observation. Classes act as "mini-modules" when they have private methods that are internal and public methods that expose an external interface. Instead, those private methods should be public methods of separate sub-components within the module.
This doesn't take into account the "access control" piece of the puzzle: the idea that there's certain invariant information you don't want clients replacing at runtime because it would break your functionality. Yet the use of reflection and metaprogramming is only increasing, because they dramatically simplify programming, and these tools typically blow right past access control. For example, Groovy quietly doesn't respect private; it's only convention that keeps programmers from using metaprogramming to modify private members.
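The same end-run exists in Java via reflection, which makes for a compact illustration of how little "private" actually protects at runtime (a toy example, not any particular library's code):

```java
import java.lang.reflect.Field;

// 'private' is advisory once setAccessible is used.
public class PrivateIsAdvisory {
    static class Config {
        private String apiKey = "original";
    }

    public static void main(String[] args) throws Exception {
        Config config = new Config();

        Field field = Config.class.getDeclaredField("apiKey");
        field.setAccessible(true);               // bypasses the private modifier
        field.set(config, "swapped at runtime");

        System.out.println(field.get(config));   // prints: swapped at runtime
    }
}
```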
Perhaps that really is a case of "you break it, you bought it?" But then what are the implications for software security models?