Thursday, September 23, 2010

Software engineering, law, and politics, rev 0.1

Modern political and legal systems have been relatively slow to adopt new technologies, considering the scale of the opportunity. Potential benefits include simplifications to process and workflow that would allow such systems to function more efficiently, convenience for end users and administrators, auditing and tracking capabilities, and more agile adaptation to citizens' needs.[insert sources here]

Imagine, if you will, that we could automatically parse the text of existing laws and find contradictions. These could be temporal discrepancies (e.g. a law proposed in 2010 contradicts a law passed in 2002, found automatically before the new law has a chance to pass and a lawsuit must be filed) [find historical source], scope discrepancies (a city law violates state or federal law) [find historical source], or even logical discrepancies [find historical source] (law A implies behavior zeta is permissible but not behavior gamma; law B strengthens law A while adding clause mu; law C strikes B, keeps mu, but somehow allows gamma).
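To make the logical case concrete, here is a minimal sketch of what such a contradiction check might look like, assuming laws have already been reduced to simple (behavior, permitted) rules. The rule format, law names, and the find_contradictions helper are invented for illustration; they are not a real legal-NLP system.

```python
from collections import defaultdict

# Toy representation: each law yields rules of the form
# (behavior, permitted), e.g. ("gamma", False). Real laws would
# require heavy natural language processing to reach this stage.
LAWS = {
    "Law A (2002)": [("zeta", True), ("gamma", False)],
    "Law B (2006)": [("zeta", True), ("mu", True)],
    "Law C (2010)": [("mu", True), ("gamma", True)],  # conflicts with Law A
}

def find_contradictions(laws):
    """Group rules by behavior and flag behaviors that are both
    permitted and forbidden by different laws."""
    by_behavior = defaultdict(list)
    for law, rules in laws.items():
        for behavior, permitted in rules:
            by_behavior[behavior].append((law, permitted))
    conflicts = []
    for behavior, verdicts in by_behavior.items():
        if len({permitted for _, permitted in verdicts}) > 1:
            conflicts.append((behavior, verdicts))
    return conflicts

for behavior, verdicts in find_contradictions(LAWS):
    print(f"Conflict over '{behavior}': {verdicts}")
```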

Furthermore, such a system could proactively help "prune" legal systems by finding laws that "don't make sense", e.g. the stereotypical donkey in a bathtub [find historical source] and other archaic or arcane statutes. It could then put those measures up for a vote, letting citizens "remove the dross" to make legal systems leaner, easier to approach, and more responsive to the wants of the populace.

Such a system may seem far-fetched. It's not. Although the level of natural language processing necessary to evaluate some of the more semantic questions may be a little way off, we can already do many of these things with available tools. The existence of search engines such as Google shows that although a computer can't really understand what the user is thinking, it can be leveraged as a tool to do very powerful things.

But do our political and legal systems need this type of fixing? Are current processes good enough? Most people would say they are not [insert sources here]. But are the potential benefits worth the potential costs?

Many people fear the proactive application of technology to tasks in the public sphere, and for good reason. At its worst, technology could be used to create a 1984-esque dystopian society that allows the ruling elite to fully monitor and collect information about everyone, enabling the elimination of dissidents and the suppression of freedom. This would not be hard to do. With the rise of biometrics, RFID, and ubiquitous monitoring via omnipresent internet access and web cameras, we already have the technology. [insert sources] Is it worth pursuing digitization and technological integration when the costs could be so catastrophic?

As freedom and security are competing aims, societies that value both must make these decisions carefully. One could argue, however, that bloated systems rife with inefficiency and unnecessary complexity threaten both freedom and security. Processes that are unwieldy lead to more systemic failures. We have already seen this in software.

When software engineering first became a topic of discussion, in the 1960s [cite paper that I forgot], its main concern was the abysmal failure rate of software projects. [insert number that I forgot] Too many projects suffered from budget or scheduling overruns, if not outright failure, and better approaches were sought. Through the years, various models have been proposed.

What became the classical model is now known as "The Waterfall Model." [cite paper] In the waterfall model, a software project is partitioned into distinct phases: requirements elicitation, design, implementation, testing, deployment, and maintenance. Each phase is to be completed fully up front, so that the software process always "flows down." Heavy documentation and record keeping are relied upon to transfer information between phases, and once a phase is complete it is "set in stone." It is ironic that this became the standard model, as the original author's intent was to argue against such a process. [cite source] In fact, the author argued, a better approach involves iterative refinement, since it is usually impossible to get every requirement right up front in any but the smallest and simplest projects. Time has proven that criticism correct. In the 1980s, software engineering saw a move toward the Spiral model [cite paper], in which the phases feed into one another and loop back around at the end, iteratively, to develop quality software. This alone wasn't enough, and as the complexity of the software problem has increased, the models have as well, moving to the Rational Unified Process [cite paper] and most recently Agile methodologies. [cite Agile Manifesto]

The key progression has been to "cut down red tape" in the form of heavy, unwieldy paperwork and processes that seek to "do it all in one go," and instead to instill values of productivity and adaptation. It can be argued that some camps have gone too far with Agile, sacrificing discipline in the form of basic documentation and creating chaos [cite anti-Agile source], and software projects are nowhere near perfect [cite source], but Agile is clearly the direction the industry is moving in, and studies suggest it is improving things. [cite source that shows improvement thanks to Agile]
This is not surprising, either. Given the rise of the internet and growing interconnectivity, being able to respond quickly is more important than ever before. This need will only increase with time.

One can argue that the United States' political and legal system currently follows a "waterfall" type of model. The current legal process is arduous, whether you're a defendant going to a court date to schedule another court date or a politician trying to pass an initiative. An explosion of structure has resulted in a morass of paperwork and processes that are not nimble enough to react to changing social conditions [cite source]. These processes are rife with opportunities for failure and corruption. [strengthen this a little more]

Furthermore, the products of "waterfall"-style development tended to be large, monolithic pieces of software. Likewise, gigantic, complex, arcane laws are not merely common, they are the norm. For example, see the PATRIOT Act [cite source], the stimulus package [cite source], the health care reform act [cite source], etc. The average citizen will not read a thousand-page bill with hundreds of provisions, often written in difficult-to-decipher language (by design). These bills are often littered with extraneous provisions that have nothing to do with the original intent of the bill, labeled "riders" or "pork." For examples, see the SAFE Port Act, which on an unrelated note crippled online gambling [cite source], and the recently failed defense appropriations bill that included provisions to end "Don't Ask, Don't Tell" and the DREAM Act. [cite source]

This drives a software engineer crazy. Knowing basic concepts of how to represent things (data structures) and how to get things done (algorithms), software engineers then learn the principles of effective systems design. An experienced engineer could describe the average current bill as one with low cohesion (its parts do not belong together or serve a single purpose) and tight coupling (its parts depend so heavily on one another that none can change in isolation). The process by which lawmakers create, inspect, and debate bills could be described as inefficient resource allocation that leads to deadlock (parties blocked indefinitely, each waiting on the other) and starvation (worthy measures never getting the floor time they need).
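For readers without a software background, a minimal sketch may help. The OmnibusBill and Enforcement classes below are invented purely to illustrate low cohesion and tight coupling; they do not correspond to any real system.

```python
# Low cohesion: one "module" bundling unrelated concerns,
# much like an omnibus bill stuffed with unrelated riders.
class OmnibusBill:
    def fund_port_security(self): ...
    def restrict_online_gambling(self): ...   # unrelated rider
    def adjust_farm_subsidies(self): ...      # unrelated rider

# Tight coupling: Enforcement depends on every provision at once,
# so changing any one provision risks breaking the whole thing.
class Enforcement:
    def __init__(self, bill: OmnibusBill):
        self.bill = bill

    def apply(self):
        self.bill.fund_port_security()
        self.bill.restrict_online_gambling()
        self.bill.adjust_farm_subsidies()
```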

But how did the system get this way? A good historical background can easily answer this question: by design and by the tension of conflicting forces.

The Federalist Papers [cite source] and other assorted musings from the Founding Fathers, primarily the works of Madison and Jefferson, reveal a distrust of strong governmental structures. Checks and balances were established for the very purpose of governmental inefficiency, the thought being that an inefficient government that does nothing is still better than an efficient one that tyrannizes its people. [cite source] This is not surprising given their historical context and rationale for the Revolution, and it can still be argued to be true given the even greater capacity for tyranny in the modern context.

The Jeffersonian argument favors a small, limited federal governmental structure that holds the rights of its citizens in such esteem that it does little itself, delegates most responsibility to the states, and presides over a limited and fairly specific set of cross-cutting concerns like international trade and military mobilization. [cite sources. Possible paper idea: system architectural lessons learned from a historical analysis of governmental structures] It is ironic, then, that Jefferson himself signed the Louisiana Purchase, the largest expansion of federal power and land mass in American history [http://americanhistory.about.com/od/thomasjefferson/a/tj_lapurchase.htm].

This may be due to influences from a competing system design, the Hamiltonian argument. Hamilton, seeing the potential for US expansion and growth, favored a stronger centralized federal government that handled a larger set of cross-cutting concerns, including banking and the cultivation of industry, in order to establish and grow the government's capabilities. A Hamiltonian style of design necessarily requires greater bureaucracy to do the "book-keeping" that handles the interconnected details of a more powerful system. Consequently he argued for an expansion of federal power.

These two ideals conflict, producing a lack of cohesion within a system that has oscillated in different directions over the years. Combined with the original emphasis on balanced power, this makes it unsurprising that the system is difficult to modify, difficult to maintain, and difficult to evolve.

But as we've established, it is possible, advantageous, and ultimately necessary for systems to adapt and be modified in order to meet changing requirements. In the software world, we call this refactoring: restructuring a system to improve its internal design without changing its externally visible behavior. [cite source] Systems that do not evolve become crushed by what Grady Booch terms "inertia" and "code rot." [source: podcast "On Architecture"]
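As a minimal illustration of refactoring (the fee rule and function names here are invented for the sketch), consider extracting a duplicated rule into one place so that behavior stays identical while the structure becomes easier to change:

```python
# Before: the same fee rule duplicated in two functions.
def file_civil_case(income):
    fee = 75 if income < 20000 else 150
    return {"type": "civil", "fee": fee}

def file_small_claim(income):
    fee = 75 if income < 20000 else 150
    return {"type": "small_claim", "fee": fee}

# After refactoring: the rule lives in one place; callers behave
# exactly as before, but a future change touches only one function.
def filing_fee(income):
    return 75 if income < 20000 else 150

def file_civil_case_v2(income):
    return {"type": "civil", "fee": filing_fee(income)}

def file_small_claim_v2(income):
    return {"type": "small_claim", "fee": filing_fee(income)}
```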

The system must evolve if it is to survive. But if it is not to fall apart, it must do so in carefully measured steps that follow logically and can be tested for equivalent fulfillment of requirements. Software can help meet these needs, but in order to do so it must uphold the fundamental pillars of the underlying political system:

Participation -- Citizens must have the ability to make their voices heard
Collaboration -- Citizens must be able to unite and act in concert
Security -- Citizens must be protected from misuses of power, or attacks on the system from external forces
Conflict Resolution -- Citizens must be able to disagree lawfully without fear of reprisal
Reliability -- Citizens must be able to trust in the process
Transparency -- Citizens must be able to trust the accuracy of results via verification

[brainstorm more. Try to find supporting examples]

Luckily, software can do this. In fact, in many application domains, such as defense, avionics, and medical applications, it makes guarantees similar to these and stronger. Consequently, it is not a question of whether software-intensive systems can fulfill these requirements, but rather how they are to be composed.

Take the issue of voting, for example. Many electronic voting systems have been proposed, implemented, and analyzed since the 1960s, and this application area has received even more attention with the rise of the internet. [http://en.wikipedia.org/wiki/Electronic_voting] The reason is simple: voter turnout is relatively low in local, state, and national elections [cite source], and it is not nearly as low in online surveys and Facebook polls [cite source]. The most common complaint about the voting process as it stands today, whether in person at the polls or by mail, is inconvenience and how easy it is to forget. Some would argue the civic duty is virtuous precisely because it is not convenient and requires energy, but that is a philosophical/design discussion that doesn't reflect the general public's dissatisfaction that Washington isn't "hearing our voices." [cite source]

A system that enabled constituents to give politicians accurate, secure, and reliable near real-time responses, in the form of votes on pressing issues, could dramatically change the way discussions are framed. Such a system could be sufficiently easy and convenient for a voter to use, especially in these days of ubiquitous connectivity via web interfaces and mobile devices. It could increase participation and involvement and offer both lawmakers and citizens unprecedented guarantees of authenticity, accuracy, and transparency.

People are nervous about implementing such a system for one major reason: security. There are many security considerations for such a system, including:
-- Anonymity: how can the system guarantee the safety of the voter from reprisal while logging the voter's vote for both tally and content?
-- Robustness: how can the system handle usage without failing, so that votes aren't "lost," including under denial-of-service attacks (floods of traffic intended to make the system unavailable)?
-- Security: how can the system protect itself against man-in-the-middle attacks (an eavesdropper silently relaying and altering traffic), spoofing (an attacker impersonating a legitimate voter or server), and playback/replay attacks (a captured valid message being resubmitted later)?
-- Integrity: how can the system ensure logical constraints such as no voter voting twice, all votes being equal, and all voters being authorized and authenticated (proving both who the voter is and what they are allowed to do)? See the sketch after this list.
-- Correctness: how can the system ensure that the correct vote is cast? How can a voter verify their vote without anyone else being able to? How can an auditor tell that a person has voted, without being able to see what the vote was?
[list more. Read a few papers]
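As a rough sketch of the integrity constraint above, consider the toy tally below. The in-memory registry, voter IDs, and BallotBox name are simplifying assumptions for illustration, not a proposal for real election infrastructure, which would need durable, audited storage and cryptographic credentials.

```python
class BallotBox:
    """Toy tally enforcing 'authorized voters only' and 'one vote per voter'."""

    def __init__(self, registered_voters):
        self.registered = set(registered_voters)  # authorization list
        self.voted = set()                        # who has already voted
        self.tally = {}                           # choice -> count, not linked to voters

    def cast(self, voter_id, choice):
        if voter_id not in self.registered:
            raise PermissionError("voter not authorized")
        if voter_id in self.voted:
            raise ValueError("duplicate vote rejected")
        self.voted.add(voter_id)
        self.tally[choice] = self.tally.get(choice, 0) + 1

box = BallotBox({"alice", "bob"})
box.cast("alice", "measure-42: yes")
# An auditor can see that "alice" voted (she is in box.voted)
# without the tally revealing which choice was hers.
```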

...This seems like a hard problem space. In many ways, it is. Yet it is not impossible. Many of these same constraints are faced every day in the processing of credit card transactions, biomedical data, entertainment sources [see DRM], and other types of data. The potential failings of a badly implemented system could be catastrophic, but because the system is inherently a public good there is arguably a higher probability of it being correct [cite bug count analysis, in terms of number and fixes, of open source versus proprietary software of comparable size and functionality].

Let us propose, at a high level, the implementation of such a system.

Since the system is a public good and must be verifiable such that it inspires the confidence of its citizens, an almost immediate initial requirement is that the system must be open source. Rather than having to rely on "unbiased mediators" (who may or may not be unbiased), citizens should be able to download the software and verify its operation themselves. They should be able to verify that there is no logic that compromises their identity, discards their vote, or incorrectly tabulates the results. Luckily, verifying that the software one is running matches the published source can be done fairly simply through the use of cryptographic hashing.
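For example, a voter or an independent auditor might check that a downloaded build matches the officially published digest. A minimal sketch follows; the voting-client.bin filename and the expected digest are hypothetical placeholders.

```python
import hashlib

# Hypothetical: in practice the expected hash would be published
# alongside the open-source release for anyone to compare against.
EXPECTED_SHA256 = "published digest goes here"

def file_digest(path):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

if file_digest("voting-client.bin") == EXPECTED_SHA256:
    print("Build matches the published release.")
else:
    print("Mismatch: do not trust this binary.")
```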

The software can be made even more transparent and secure with good logging and audit output that echoes every line of code as it is about to be run, such that anyone who can read the language can analyze it. If the software is written in a good domain-specific language (whose implementation is itself published), the output can be readable to almost anyone.
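As a minimal sketch of what such line-by-line audit output could look like, here is Python's standard tracing hook applied to a trivial, invented tally function; a real system would route this output to tamper-evident audit logs rather than the console.

```python
import sys
import linecache

def audit_trace(frame, event, arg):
    # Echo each source line just before it executes.
    if event == "line":
        filename = frame.f_code.co_filename
        line = linecache.getline(filename, frame.f_lineno).rstrip()
        print(f"AUDIT {frame.f_lineno}: {line}")
    return audit_trace

def tally(votes):
    counts = {}
    for choice in votes:
        counts[choice] = counts.get(choice, 0) + 1
    return counts

sys.settrace(audit_trace)          # enable tracing for new calls
result = tally(["yes", "no", "yes"])
sys.settrace(None)                 # disable tracing
print(result)
```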

To secure voter identity, a public/private key mechanism could be used, with a small-scale TPM stored on a person's driver's license. Perhaps integrate RFID, or steganography with a 3D barcode. [elaborate]
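The signing step itself is well-trodden ground. Below is a minimal sketch using an Ed25519 key pair via the Python cryptography package; the idea that the private key would live on a license-embedded TPM is this post's assumption, and the key here is simply generated in software for illustration.

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# In the proposed scheme the private key would never leave the
# voter's TPM; generating it here is purely for illustration.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

ballot = b"measure-42: yes"
signature = private_key.sign(ballot)

# The election authority, holding only the registered public key,
# can verify that the ballot came from that key's holder.
try:
    public_key.verify(signature, ballot)
    print("signature valid")
except InvalidSignature:
    print("signature invalid")
```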

Anonymity and confidentiality can be assured by creating an SSL tunnel between a citizen's computer and a web server, perhaps over a Tor-style anonymous network. The front controller holds the public keys of registered citizens, so it can authenticate and verify a voter before allowing access to a back-end service that generates a new public/private key pair for the citizen at the moment of casting a vote, used solely for correctness verification purposes. [elaborate more]
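Tying the pieces together, the client-side flow might look roughly like the sketch below. Every detail here (the server challenge, the one-time ballot key, the receipt format) is an assumption made for illustration rather than a specified protocol.

```python
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def cast_vote(long_term_key: Ed25519PrivateKey, choice: bytes):
    # 1. Authenticate: prove possession of the registered key by
    #    signing a server-issued challenge (placeholder value here).
    challenge = b"server-issued-nonce"
    auth_proof = long_term_key.sign(challenge)

    # 2. In the proposed scheme the back end would issue a fresh
    #    one-time ballot key, unlinkable to the voter's identity;
    #    it is generated locally here for illustration.
    ballot_key = Ed25519PrivateKey.generate()

    # 3. Cast the vote signed only with the one-time key.
    ballot_signature = ballot_key.sign(choice)

    # 4. The voter keeps a receipt hash to later verify inclusion
    #    without revealing the vote's content to anyone else.
    receipt = hashlib.sha256(choice + ballot_signature).hexdigest()
    return auth_proof, ballot_signature, receipt
```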

With such an architecture, we can provide all of the benefits of current voting systems while creating an even more transparent and auditable process.

[write strong conclusion]

1 comment:

  1. imagine a world where you can register your MAC/IP address for the sole purpose of voting with your DMV, a week/day/hour before you vote, and be sure that the record is deleted when you are finished
    where you can then log on via SSL to an anonymized network
    to complete a webform (randomized, like the ballots in the TED talk)
    submit, receive a receipt...and using PGP private/public key validation, can verify and analyze your result

    but there's one catch behind that, though.
    at least, one catch that I see.
    IPs are traceable.
    Eminently so.

    hence, anonymized network

    MAC addresses less so, but they are universally unique.

    yes, these are how we can ensure that "only valid voters" can vote
    and that each vote can be 1:1 with each "user"

    I dunno.
    I like the idea, but I know enough people in IT and data security to be more than a little leery of it.

    equivocation is not engineering, good sir

    hm?

    we must identify and enumerate the vulnerabilities with the plan, then build around them. I fully suspect we have the technology to prevent and secure this type of system from 98.231% of attacks/manipulations
    it just hasn't been applied properly
    a bold hunch, but I want the data experts to show me how wrong I am, especially in comparison to the vast inefficiencies and trust issues endemic in the current manual process

    That is a pont.
    *point.
    I dunno. I'm just imagining a central database as opposed to hard copies across the country.
    even with the best security, all it takes is one hacker team.
    Then again, if you have government-level resources, you can put together quite a bit.
    and if it's wide open, traceability provides its own security.
    But still. I worry.

    you could actually distribute such a system
    but you would have to secure the communication lines between subsystems
    which may be just as difficult a problem
    for example, you could make it small, distributed regional databases (say by district, exactly how the boundaries are drawn up now) that do local calculation and send only pertinent info to the higher level
    i.e. encapsulation

    /me nods
    that would do a nice job of localizing the attacks.
    turning it into a million little ants to attack instead of one giant castle.

    yeah, the whole centralized server/distributed system debate is a long running one (just as much in IT as it is in gov't) and both have their pros and cons
    I'm more of a distributed fan, myself

    mhm

    as long as the central one provides the necessary interfaces and abstract protections that the local ones can use to ensure proper operation
