Wednesday, January 27, 2010

My Tags

With nearly every content-management and social networking site providing some sort of tagging feature, one ends up with many site-specific tag "vocabularies". I feel that an individual should have a single tag vocabulary that is accessible from each site, so that the tag "suggestion" feature on each site can draw from it. I'm also finding simple tags to be too simplistic: I would like to maintain a tag hierarchy, so that a given tag implies its parent(s), and possibly other metadata about my tags (comments, term disambiguation, composite tags, etc.). A dedicated tag vocabulary system would allow an individual to maintain such structure and metadata for their tags. "Consumer" sites of these tags would not each have to implement their own versions of advanced tagging features. Such a system might also facilitate more interesting cross-site analyses of a given user's content.
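
To make the hierarchy idea concrete, here is a minimal sketch of a vocabulary in which a tag implies its parents. The Tag class, the sample tags, and the implied function are all hypothetical names chosen for illustration, not part of any existing service.

```scala
// Hypothetical model: each tag names its direct parents.
final case class Tag(name: String, parents: Set[String] = Set.empty)

object TagVocabulary {
  // An assumed sample vocabulary, keyed by tag name.
  val vocabulary: Map[String, Tag] = List(
    Tag("programming"),
    Tag("jvm", Set("programming")),
    Tag("scala", Set("jvm")),
    Tag("hibernate", Set("jvm"))
  ).map(t => t.name -> t).toMap

  // A tag implies itself plus all of its ancestors.
  // The `seen` set also guards against accidental cycles in the hierarchy.
  def implied(name: String, seen: Set[String] = Set.empty): Set[String] =
    if (seen.contains(name)) seen
    else {
      val parents = vocabulary.get(name).map(_.parents).getOrElse(Set.empty[String])
      parents.foldLeft(seen + name)((acc, p) => implied(p, acc))
    }

  def main(args: Array[String]): Unit =
    println(implied("scala"))
}
```

Tagging an item with "scala" would then implicitly tag it with "jvm" and "programming" as well, on any site that consults the shared vocabulary.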

Google Reader "sort by magic"

Just ran across the "sort by magic" feature on Google Reader. I always thought a news reader app should help you find the articles you're interested in, and now it seems Google has delivered on that idea. I look forward to seeing how well this works. All it requires, apparently, is that you mark articles as "liked". Arguably, it should also take starred and shared items into account, right? For now, I'm marking my starred items as "liked" (recent ones anyway). Fortunately, you can apply this option to individual feeds, and I only intend to use it on the high-volume, heterogeneous feeds (e.g., ServerSide, InfoQ, etc.).

Friday, January 22, 2010

Hibernate Hawthorne Effect

I really enjoy how debugging Hibernate LazyInitializationExceptions is often subject to the Hawthorne Effect. In a debugger, the very act of inspecting an entity can cause Hibernate to fetch relationships that would otherwise remain uninitialized (provided the inspection happens while a Hibernate session is still active). Because those extra relationships are fetched in response to debugging, the LazyInitializationException can simply disappear.
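
The effect can be imitated without Hibernate at all. The sketch below is only a simulation: FakeSession and LazyRef are made-up stand-ins for a Hibernate session and a lazy proxy, and the assumption is that the debugger renders a variable by calling its toString.

```scala
// Simulation only: stands in for a Hibernate session and a lazy proxy.
final class FakeSession { var open: Boolean = true }

final class LazyRef[A](session: FakeSession, load: () => A) {
  private var value: Option[A] = None

  // Loads on first access; fails if the session is already closed.
  def get: A = value.getOrElse {
    if (!session.open)
      throw new IllegalStateException("no session: cannot initialize lazily")
    val loaded = load()
    value = Some(loaded)
    loaded
  }

  // Assumption: a debugger's variable view calls toString, which forces the load.
  override def toString: String = String.valueOf(get)
}

object HawthorneDemo {
  // Returns true if reading the reference after the session closes succeeds.
  def readAfterClose(inspectFirst: Boolean): Boolean = {
    val session = new FakeSession
    val orders = new LazyRef(session, () => List("order-1", "order-2"))
    if (inspectFirst) orders.toString // what inspecting in a debugger would do
    session.open = false
    try { orders.get; true }
    catch { case _: IllegalStateException => false }
  }

  def main(args: Array[String]): Unit = {
    println(readAfterClose(inspectFirst = false)) // false: the failure is visible
    println(readAfterClose(inspectFirst = true))  // true: observing hid the failure
  }
}
```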

Tuesday, January 19, 2010

Scala string reversal gotcha

Be careful when invoking Scala's RichString methods on a java.lang.String (via implicit conversion):

scala> "8008".reverse == "8008"
res0: Boolean = false
scala> "8008".reverse
res1: scala.runtime.RichString = 8008
scala> "8008"
res2: java.lang.String = 8008
scala> "8008".reverse == "8008".toSeq
res3: Boolean = true

That toSeq call is unfortunate, but necessary.  I suppose this is the price we pay for having Scala String literals be typed as Java Strings.
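
One way to sidestep the implicit conversion entirely is to reverse the string through java.lang.StringBuilder, so that both sides of the comparison are plain Java Strings. This is just a sketch; the reverseString and isPalindrome helpers are names I made up. As far as I know, Scala releases from 2.8 onward return a String from reverse, so the gotcha may not appear there at all.

```scala
object Palindrome {
  // Reverse via java.lang.StringBuilder, so both operands of == are java.lang.String.
  def reverseString(s: String): String =
    new java.lang.StringBuilder(s).reverse().toString

  def isPalindrome(s: String): Boolean = reverseString(s) == s

  def main(args: Array[String]): Unit = {
    println(isPalindrome("8008")) // true
    println(isPalindrome("8009")) // false
  }
}
```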

Friday, January 15, 2010

apgdiff

I encountered a defect yesterday that was caused by our production database schema being out of sync with the domain model. Our schema has undergone numerous migrations over the course of a few years. Each migration involves applying manually written SQL scripts that update the schema while migrating the extant data to the new schema. It was inevitable that some schema change would eventually be omitted from a migration script, and that is in fact what happened. So I decided it was time to perform a full schema diff (actual versus expected) to see if we had any other hidden time bombs. I exported the schema from both a newly created database (schema only) and the production database. But one cannot perform a straightforward diff on these SQL representations, since the ordering of objects (tables, triggers, etc.) and of nested items (table fields) may differ. So I found apgdiff, which performs the necessary "intelligent" diff. Instead of standard diff output, it outputs the SQL needed to bring the old schema up to date with the newer one. After I tracked down an NPE in its source code (caused by a missing create table statement and an associated alter table statement), it worked. It's a nice, straightforward tool that I will integrate into my development process. In particular, it will save me time when the next schema migration script needs to be written.
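
To illustrate why a plain textual diff fails here, the following toy sketch compares two schemas while ignoring ordering. It is not how apgdiff works internally (apgdiff parses the SQL dumps and emits SQL statements); the Schema type and the sample tables are assumptions made purely for illustration.

```scala
object SchemaCompare {
  // Assumed, simplified schema model: table name -> set of column definitions.
  type Schema = Map[String, Set[String]]

  // Lists what `actual` lacks or has in excess relative to `expected`, ignoring order.
  def differences(actual: Schema, expected: Schema): List[String] = {
    val missingTables =
      (expected.keySet -- actual.keySet).toList.sorted.map(t => s"missing table $t")
    val extraTables =
      (actual.keySet -- expected.keySet).toList.sorted.map(t => s"unexpected table $t")
    val columnDiffs = (expected.keySet & actual.keySet).toList.sorted.flatMap { t =>
      (expected(t) -- actual(t)).toList.sorted.map(c => s"$t: missing column $c") ++
        (actual(t) -- expected(t)).toList.sorted.map(c => s"$t: unexpected column $c")
    }
    missingTables ++ extraTables ++ columnDiffs
  }

  def main(args: Array[String]): Unit = {
    // Same tables and columns in a different order, plus one genuinely missing column.
    val production: Schema =
      Map("users" -> Set("id integer", "name text"), "orders" -> Set("id integer"))
    val fresh: Schema =
      Map("orders" -> Set("id integer", "user_id integer"), "users" -> Set("name text", "id integer"))
    differences(production, fresh).foreach(println)
  }
}
```

Only the missing user_id column is reported; the reordered tables and fields produce no noise.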

Just found a reason to use Spring Framework's "Lookup Method Injection" feature. Our system had a parser bean that recently began to maintain state, and so could not be reused without remembering to reset that state. It seemed cleaner and safer to simply instantiate a new parser on each use. Even though the parser was configured as a prototype Spring bean, it would only be instantiated once (for a given web user session), since its parent "loader" bean was a singleton. Lookup method injection allowed Spring to provide the loader with a new instance of the parser for each usage, without introducing any dependency on the Spring API in our own code.
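
Roughly, the loader declares an abstract factory method and lets Spring supply the implementation. In the XML configuration, the loader's bean definition would carry something like <lookup-method name="createParser" bean="parser"/>, with the parser bean declared as a prototype. The sketch below uses hypothetical Parser and Loader classes and a hand-written subclass in place of the one Spring would generate at runtime, so it runs without Spring.

```scala
// Hypothetical stateful parser: it accumulates state, so reusing it is unsafe.
class Parser {
  private var linesSeen = 0
  def parse(input: String): Int = {
    linesSeen += input.split("\n").length
    linesSeen
  }
}

// The singleton loader declares a lookup method instead of holding a Parser field.
abstract class Loader {
  // With lookup method injection, Spring would override this at runtime.
  protected def createParser(): Parser

  def load(input: String): Int = createParser().parse(input)
}

object LookupDemo {
  // Hand-written stand-in for the subclass Spring would generate.
  val loader: Loader = new Loader {
    protected def createParser(): Parser = new Parser
  }

  def main(args: Array[String]): Unit = {
    println(loader.load("a\nb")) // 2
    println(loader.load("c"))    // 1, because a fresh parser is used each time
  }
}
```

Note that Loader has no Spring imports; the only coupling lives in the configuration.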

Wednesday, January 6, 2010

Hibernate Letdowns

As the number of abstraction layers in a software architecture increases, the expressiveness and clarity of the code increase, but often at the cost of lost functionality. Hibernate has now bitten me twice in this way: first, with the Criteria API's inability to join an entity/relation more than once in a query, and more recently with the inability to access the underlying database's full set of aggregate functions.

The multiple-join problem ultimately forced me to write my own HqlBuilder class, so that I could maintain some semblance of order in a code base that dynamically creates queries. (Yes, HQL supports these multiple joins, while Criteria queries do not!)
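
For a sense of what such a builder might look like, here is a minimal, hypothetical sketch. It is not the HqlBuilder described above; the method names and the Person/addresses model are assumptions. The point is only that giving each join its own alias lets the same relation be joined more than once.

```scala
// Hypothetical sketch; not the actual HqlBuilder from our code base.
final class HqlBuilder(entity: String, alias: String) {
  private var joins = Vector.empty[String]
  private var conditions = Vector.empty[String]

  // Each join gets its own alias, so one relation can be joined repeatedly.
  def join(path: String, joinAlias: String): HqlBuilder = {
    joins :+= s"join $path $joinAlias"
    this
  }

  def where(condition: String): HqlBuilder = {
    conditions :+= condition
    this
  }

  // Assembles the query text from the collected joins and conditions.
  def build: String = {
    val whereClause =
      if (conditions.isEmpty) "" else conditions.mkString(" where ", " and ", "")
    (Vector(s"select distinct $alias from $entity $alias") ++ joins).mkString(" ") + whereClause
  }
}

object HqlBuilderDemo {
  def main(args: Array[String]): Unit = {
    val hql = new HqlBuilder("Person", "p")
      .join("p.addresses", "home")
      .join("p.addresses", "work")
      .where("home.kind = 'HOME'")
      .where("work.city = :city")
      .build
    println(hql)
  }
}
```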

The aggregate function problem prevented me from filtering the results of an aggregate query down to those groups that do not contain a given value in an aggregated field. More precisely, I could not use having every(field<>val), since HQL supports only a fixed set of aggregate functions (avg, sum, min, max, and count). I figured I could alternatively use count(field=val)=0, but alas, HQL will not allow such expressions inside the count() function.
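
To pin down the semantics I was after, here is the same filter expressed in memory, outside of any query language. It is only an illustration of the intended result; the Row shape and the groupsWithout name are assumptions.

```scala
object EveryFilter {
  // Assumed row shape: (group key, value of the aggregated field).
  type Row = (String, Int)

  // Keeps the groups in which every row's field differs from `excluded`,
  // which is what having every(field <> val) would express.
  def groupsWithout(rows: Seq[Row], excluded: Int): Set[String] =
    rows.groupBy(_._1).collect {
      case (key, members) if members.forall(_._2 != excluded) => key
    }.toSet

  def main(args: Array[String]): Unit = {
    val rows = Seq(("a", 1), ("a", 2), ("b", 2), ("b", 3))
    println(groupsWithout(rows, excluded = 1)) // only group "b" survives
  }
}
```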

Lesson: choose your technology stack carefully! Or, as a corollary, understand its limitations and how they will affect your ability to meet user requirements.