Saturday, September 10, 2011

Sweet Mother of God, Grails is Awesome

A few years ago I did a study on computer architecture education as a preliminary swing on "getting familiar with engineering education research." Being inexperienced and short on time, I collected my data manually (with a giant Excel spreadsheet). Later, when I wanted to analyze the data, I decided to be smart and push the data into a MySQL database so I could query it with SQL. I even wrote an ugly and basic web application in CodeIgniter (back when anything beyond basic HTML forms was intimidating) for professors to update the data.

After the study, I got sidetracked on other projects. The data lived on the server for a long time, until I got a notification that the server was being decommissioned. The most valuable thing for a researcher may well be the data, so I quickly wrote a SQL dump out and put it in my archives.

I recently found myself in a position to extend the original study and use the data again. Being more experienced, I've decided to extend the original schema and build a better application in Grails. The infrastructure building was easy, but secretly I worried.

I knew I would have to write a program to parse the SQL dump, translate the columns, and write in the data. While not conceptually difficult, those types of "data migration" applications are error-prone and may take multiple steps to write...what a pain.

Or so I thought.

Tonight, I found myself testing my Grails 2.0 skeleton by putting fake data in my BootStrap.groovy and thought: "You know what? I should see if I can load the data from a file in here." Would I lose a whole bunch of time trying to figure out how to translate a SQL dump into domain class instances I could save?

Not with Grails. The awesome, awesome power of Grails.

First, rather than muck with string manipulation or hunt for some magical SQL-to-instance translator, I decided to load my data into a MySQL instance and then dump it back out to CSV. I probably could've string/regex-searched through the statements, but generally, the less text to process the better, so this route suited me.

Then I leveraged Glen Smith's excellent OpenCSV. Here, I decided to be tricky. I'd never tried to actually leverage Maven correctly before, but inspired by the example of the MySQL connector in the Grails 2.0 BuildConfig.groovy, I thought:

"Hey, maybe I don't have to download the JAR and put it in the lib directory. Maybe I can try this whole 'automatic dependency resolution' thing."

So I did the following in my BuildConfig.groovy:

repositories {
    inherits true // Whether to inherit repository definitions from plugins
    grailsPlugins()
    grailsHome()
    grailsCentral()

    // uncomment these to enable remote dependency resolution from public Maven repositories
    mavenCentral()
    //mavenLocal()
    //mavenRepo "http://snapshots.repository.codehaus.org"
    //mavenRepo "http://repository.codehaus.org"
    mavenRepo "http://download.java.net/maven/2/"
    //mavenRepo "http://repository.jboss.com/maven2/"
}
dependencies {
    // specify dependencies here under either 'build', 'compile', 'runtime', 'test' or 'provided' scopes eg.

    // runtime 'mysql:mysql-connector-java:5.1.16'
    runtime 'net.sf.opencsv:opencsv:2.3'
}

My tummy bumbled a little bit as my IDE kept yelling at me that my classes were undefined. As much fun as dynamic languages are, there's a comfort to not having those red squiggles telling you you fail at basic namespace resolution...but I resisted the urge to revert to the old school way. I ran the app...Happily, it did in fact download the dependency at runtime!



With the dependency resolved (hopefully), I turned my attention to loading the data in BootStrap.groovy. Reading the bit on JavaBean binding gave me an idea...but I remained skeptical. Could it be so easy? On first run, the ColumnMappingStrategy didn't work. The Grails error log told me that 2 of the values of my instance class that I expected to be filled were null. But 1 was not.



class Institution {
    String name
    String carnegieClassification
    String schedule
    static constraints = {
        name blank:false, unique:true
        carnegieClassification nullable:true
        schedule inList:(["semester","quarter"])
    }
}

universityName,carnegieClassification,schedule
Brown University,RU/VH: Research Universities (very high research activity),semester


Ah, the carnegieClassification was binding correctly, but the other two fields were not because of a mismatch between the column names in the file and the property names on the class. Would the JavaBean thing fail utterly?



Nope.



One minor edit later...


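import au.com.bytecode.opencsv.CSVReader
import au.com.bytecode.opencsv.bean.CsvToBean
import au.com.bytecode.opencsv.bean.HeaderColumnNameTranslateMappingStrategy

def createInstitutions() {
    // The path is resolved relative to the directory the app is started from
    CSVReader reader = new CSVReader(new FileReader("universities.csv"))

    // Keys are the CSV header names, values are the Institution property names
    HeaderColumnNameTranslateMappingStrategy strat = new HeaderColumnNameTranslateMappingStrategy()
    strat.setType(Institution.class)
    strat.setColumnMapping(["universityName":"name", "carnegieClassification":"carnegieClassification", "schedule":"schedule"])

    // Bind every CSV row to a new, unsaved Institution
    CsvToBean csv = new CsvToBean()
    List list = csv.parse(strat, reader)
    reader.close()

    // Save only the institutions that are not already in the database
    list.each {
        Institution.findByName(it.name) ?: it.save(failOnError:true)
    }
}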
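
For anyone curious what that column mapping is doing, here is a rough plain-Groovy sketch of the idea. To be clear, this is not OpenCSV's actual implementation: bindRows is a made-up helper name, and the rows are hard-coded instead of read from a file. It just shows a header name being looked up in the mapping to find the property name each value belongs to.

```groovy
// Plain-Groovy sketch of header-name translation; NOT OpenCSV's code.
// bindRows is a hypothetical helper name used only for illustration.
List<Map<String, String>> bindRows(List<List<String>> rows, Map<String, String> columnMapping) {
    List<String> header = rows.first()
    rows.drop(1).collect { List<String> row ->
        Map<String, String> props = [:]
        header.eachWithIndex { String column, int i ->
            String property = columnMapping[column]
            // A column with no mapping entry is skipped, so its property stays unset
            if (property != null) {
                props[property] = row[i]
            }
        }
        props
    }
}

// Header row first, then data rows (same sample data as above)
def rows = [
    ['universityName', 'carnegieClassification', 'schedule'],
    ['Brown University', 'RU/VH: Research Universities (very high research activity)', 'semester']
]
def mapping = [universityName: 'name', carnegieClassification: 'carnegieClassification', schedule: 'schedule']

def bound = bindRows(rows, mapping)
println bound[0].name   // Brown University
```

Each resulting map is keyed by property name, so in a Grails app it could presumably be handed straight to a domain class's map constructor, e.g. new Institution(bound[0]).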


Just like that, part of the old data loaded and a roadmap found for the rest of it. No fuss, no muss, in less than 10 minutes. Wow.

Less than 3, Grails. Less than 3.
