Monday, September 27, 2010

Configuring Spring-managed Hibernate event listeners with EntityManagerFactory

I'm in the process of migrating my work's web application from a pure Hibernate persistence architecture over to a JPA-based architecture (with Hibernate as the JPA implementation provider). The application uses Spring XML context files for its configuration. Previously, the persistence-related configuration defined a couple of custom Hibernate event listeners, as Spring-defined beans, and these were passed along to Spring's LocalSessionFactoryBean via its eventListeners property in a straightforward manner:
  <bean id="hibernateSessionFactory" class="org.springframework.orm.hibernate3.annotation.LocalSessionFactoryBean">
    ...
    <property name="eventListeners">
    <map>
      <entry key="post-load">
        <list>
          <bean ref="myPostLoadEventListenerBean" />
        </list>
    </map>
    </property>
  </bean>

However, after moving to a JPA-based configuration and replacing the Hibernate-specific LocalSessionFactoryBean with the JPA-specific LocalContainerEntityManagerFactoryBean, the only way to declaratively specify the Hibernate event listeners is via the jpaPropertyMap. With this method, the event listeners can no longer be Spring-instantiated objects; they can only be specified as class names, which Hibernate will "conveniently" use to instantiate the event listener objects on its own. This is hardly convenient, and when one's custom event listener classes rely upon Spring's dependency injection for initialization, Hibernate's instantiation mechanism is downright limiting. What we would like to do, but cannot, is specify our event listener beans within the JPA-compliant jpaPropertyMap, as follows:
  <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
    ...
    <property name="persistenceProvider">
      <bean class="org.hibernate.ejb.ConfigurableListenerBeansHibernatePersistence">
        <property name="postLoadEventListeners">
          <list>
            <bean ref="myPostLoadEventListener" />
          </list>
        </property>
      </bean>
    </property>
    <property name="jpaPropertyMap">
      <map>
         <entry key="hibernate.ejb.event.post-load">
           <list>
             <bean class="org.hibernate.ejb.event.EJB3PostLoadEventListener" />
             <bean ref="myPostLoadEventListener" />
           </list>
         </entry>
      </map>
    </property>
  </bean>
The limitation is ultimately introduced by Hibernate's EventListenerConfigurator, which only knows how to handle event listener class names, not event listener objects. That's fine, I suppose, but Spring should then provide a factory class of its own that allows configuration of JPA provider-specific properties (such as Hibernate's event listeners), getting around this limitation by taking the desired beans and programmatically updating the underlying PersistenceProvider. But Spring provides no such thing: LocalContainerEntityManagerFactoryBean simply passes the jpaPropertyMap along without further consideration. In the end, the Spring and Hibernate classes collude in such a way that there is no way to accomplish what was easily done with the Hibernate-specific Spring configuration.

The solution I've come up with involves replacing Hibernate's HibernatePersistence class with an extended version, ConfigurableListenerBeansHibernatePersistence, that specifically allows event listeners to be specified as objects, thus allowing Spring-instantiated beans to be injected. We then tell Spring's LocalContainerEntityManagerFactoryBean to use our special HibernatePersistence implementation instead of the default HibernatePersistence class. Yes, this sidesteps the JPA-compliant interfaces, but I have found no other way to accomplish the required injection of event listeners. The new Spring XML configuration looks like this:
  <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
    ...
    <property name="persistenceProvider">
      <bean class="org.hibernate.ejb.ConfigurableListenerBeansHibernatePersistence">
        <property name="postLoadEventListeners">
          <list>
            <bean ref="myPostLoadEventListener" />
          </list>
        </property>
      </bean>
    </property>
    <property name="jpaPropertyMap">
      <map>
      </map>
    </property>
  </bean>

Seems like Spring should provide a similar solution out-of-the-box.

Note that ConfigurableListenerBeansHibernatePersistence also takes it upon itself to retain the default event listeners that Hibernate normally registers on its own (the Ejb3*EventListeners and various Default*EventListeners), appending the application-provided listeners to these. Contrast this with Hibernate's default configuration behavior, which requires a developer to specify the Hibernate-provided event listeners in addition to the listeners being added by the application. (Hibernate's approach is the more flexible, but a pain for developers who simply want to add new listeners without altering Hibernate's underlying behavior.)
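For the curious, here is a minimal sketch of how such a HibernatePersistence subclass can work. It assumes the Hibernate EntityManager 3.x APIs (Ejb3Configuration and friends), shows only the container-managed bootstrap path, and is not the actual ConfigurableListenerBeansHibernatePersistence, which differs in its details:

  // A sketch only: assumes Hibernate EntityManager 3.x; the real class
  // handles more than just this.
  package org.hibernate.ejb;

  import java.util.ArrayList;
  import java.util.List;
  import java.util.Map;

  import javax.persistence.EntityManagerFactory;
  import javax.persistence.spi.PersistenceUnitInfo;

  import org.hibernate.event.EventListeners;
  import org.hibernate.event.PostLoadEventListener;

  public class ConfigurableListenerBeansHibernatePersistence extends HibernatePersistence {

    private List<PostLoadEventListener> postLoadEventListeners =
      new ArrayList<PostLoadEventListener>();

    // Called by Spring with fully-initialized listener beans
    public void setPostLoadEventListeners(List<PostLoadEventListener> listeners) {
      this.postLoadEventListeners = listeners;
    }

    @Override
    public EntityManagerFactory createContainerEntityManagerFactory(PersistenceUnitInfo info, Map map) {
      Ejb3Configuration cfg = new Ejb3Configuration();
      Ejb3Configuration configured = cfg.configure(info, map);
      if (configured == null) {
        return null;
      }
      appendPostLoadListeners(configured);
      return configured.buildEntityManagerFactory();
    }

    // Retain Hibernate's default listeners and append the injected beans
    private void appendPostLoadListeners(Ejb3Configuration cfg) {
      EventListeners listeners = cfg.getHibernateConfiguration().getEventListeners();
      List<PostLoadEventListener> merged = new ArrayList<PostLoadEventListener>();
      for (PostLoadEventListener defaultListener : listeners.getPostLoadEventListeners()) {
        merged.add(defaultListener);
      }
      merged.addAll(postLoadEventListeners);
      listeners.setPostLoadEventListeners(
        merged.toArray(new PostLoadEventListener[merged.size()]));
    }
  }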

Finally, note that the current version of ConfigurableListenerBeansHibernatePersistence only supports configuring PostLoadEvent listeners.

Monday, September 13, 2010

Two UNIX tips

Picked up two UNIX tips at http://www.ffmpeg.org/faq.html#SEC27, which I'd like to remember:
  • The {tail,head} commands take a {+,-} modifier on their -n and -c options, which causes the command to skip the {leading,trailing} lines or bytes of the input. For example, tail -n +10 file outputs file starting at line 10 (skipping the first 9 lines), while head -n -10 file outputs everything except the last 10 lines.
  • A bash "compound command" defined with flanking "{" and "}" characters can be used to apply the backgrounding operator to a list of commands, e.g., { cmd1; cmd2; } & runs both commands sequentially as a single background job.

Thursday, August 26, 2010

The act of committing

The act of committing a new revision of one's code to a source control repository seems like a rather coarse operation.  When a developer makes a commit, he is in a way declaring that his code is "ready" for others to view or use.  If he commits to a feature branch or experimental branch, he is declaring that his code is ready at some initial level, but perhaps not ready for integration with a project's trunk or a team-wide release branch.  When a project trunk or release branch is declared ready for use, it is often "committed" to a new branch or tag, declaring its status as high-quality, tested code.

At the other end of the version control spectrum, developers often have access to a "local revision history" (e.g., as provided by Eclipse), where every save of a source file is tracked locally, can be compared to other local revisions, and may be reverted to an earlier revision as required.

With distributed version control systems, we have an intermediate level of committing, whereby commits are made to a local, privately-managed repository that can be optionally shared with others.  Such version control systems seem to encourage more frequent committing, as a means of safely recording one's development progress, without necessarily making one's efforts immediately public.

So I feel like we have discrete levels of versioning functionality, which suggests the possibility of an alternate versioning model based on a more continuous spectrum of commit actions.

What if every change made to source code were tracked by the developer's versioning system?  And what if, at explicit times, the developer could declare his code as being at a particular level of quality?  For example, while developing a new feature, the developer might declare his code as "compiles", later "passes tests", later still "functional", then "beta", "Q/A tested", and finally "released" (perhaps set by a release engineer), etc.  The idea here is that we don't force the developer into a single commit/no-commit decision point.  Instead, the developer can communicate the level of readiness of his code as it evolves.  Different levels of readiness can be configured to be kept private from different audiences.  So, for example, if the developer is working closely on a feature with one other developer, the two can see each other's changes at any point, or may choose to see only the changes that are "compilable", etc.  Different subteams on the same project may only want to be exposed to code that "passes tests", while developers on other projects that depend upon the code in this project may only want "Q/A tested" or "released" versions.

Such a versioning model would allow maximal sharing of code with appropriate audiences, while ensuring that the desired level of code quality/readiness is made available.  Developers wouldn't have to "call over the cubicle wall" to ask their fellow developers whether they have committed a particular change yet.  A developer could choose to see even the most ill-prepared code and keep tabs on it as it progresses, or even jump in and help out with edits on code that has just been entered by a teammate.  This has interesting implications for (non-co-located) team members who want to review each other's work or even engage in remote pair-programming activities.

But the fundamental idea here is to increase the granularity of "commits", such that developers do not keep their code inaccessible to others for too long, while at the same time preventing low-quality code from being exposed to users who would be harmed by it.  This seems like a safer alternative to committing infrequently, as it frees developers from worrying about publicizing work that may not be ready for prime time.  Such a version control system would become a combination of "local revision history", "distributed version control", and "public commits" (to a central server).  It would encourage developers to be more explicit about publicly declaring the state that a particular revision of their code is at.  It would avoid the "ready" versus "not ready" binary proposition forced upon us by the commit model of current version control systems.  And it would maximize visibility and sharing of code that is under active development.

Wednesday, June 30, 2010

Testing JSF backing beans

No matter what level of unit and integration testing is performed on one's code, there still seems to be no substitute for running tests directly from the user interface.  And while I've tried my hand at automating the UI tests for web applications, I find the task quite tedious and the scripts/code difficult to maintain.  So I find myself having to manually perform a battery of UI tests on my web application, enduring the dreaded Tomcat deploy/restart cycle (no snickering from the Ruby on Rails audience please!).

In particular, I find that the use of Hibernate with Spring's declarative transaction management makes it difficult to guarantee that entity objects are always properly managed by the current Hibernate session; that is, to ensure that LazyInitializationExceptions, NonUniqueObjectExceptions, etc., will not be thrown for a given call stack of UI backing bean and service methods.  This is difficult to test for "outside of the container", in a standalone integration test.  The complexity stems from the fact that the backing bean methods (the entry points of UI actions) and the service layer methods they call can each be transactional.  So the developer must guarantee that the entities being passed into a transactional method comply with the method's expectations for the "persistent", "detached", or "transient" state of those entities.  Things get really ugly when the domain model's save/update/persist cascades differ from what is needed to reattach all of the related entities required by the method being called.

In my web application, I use Spring to instantiate all of the JSF backing bean objects, making use of the Spring-provided DelegatingVariableResolver.  (This allows the backing beans to be proxied and endowed with Spring's AOP functionality, for declarative transactions, logging aspects, etc.)  But testing backing beans with this design is made difficult by the fact that they must be instantiated by Spring within the test.  This is solved by using AbstractDependencyInjectionSpringContextTests, which allows us to create a Spring context from which we can retrieve our backing beans.  However, this is still not enough, since in my case the backing beans use "session" scope, and so normally require that they be instantiated within a servlet container, or at least that a FacesContext can be acquired.
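For illustration, such a test can look something like the following (the bean id "myBackingBean" and the file test-context.xml are hypothetical names, and the test context is assumed to register the MockSessionScope described below):

// Hypothetical example; bean and file names are illustrative only.
import org.springframework.test.AbstractDependencyInjectionSpringContextTests;

public class BackingBeanTest extends AbstractDependencyInjectionSpringContextTests {

  @Override
  protected String[] getConfigLocations() {
    return new String[] { "classpath:test-context.xml" };
  }

  public void testBackingBeanRetrieval() {
    // The bean comes back as an AOP proxy, so declarative transactions apply;
    // in practice you would cast to your backing bean type and invoke its
    // action methods here.
    Object bean = applicationContext.getBean("myBackingBean");
    assertNotNull(bean);
  }
}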

To avoid the servlet container/FacesContext requirement, I figured out that I could create a MockSessionScope that emulates a single extant servlet session, without JSF being initialized.  This allows us to instantiate and retrieve our JSF backing bean objects from Spring within our integration tests, and to make calls on the backing bean methods with all AOP behavior enabled.  In this way, we can recreate the full call stack into our application, as if our JSF framework were calling the backing bean directly in response to an HTTP request, and without using a browser client (real or headless).  Most significantly, we are now able to test the full transactional context that exists when our backing bean and service methods are called, allowing us to detect and debug problems merely by running our integration tests.  No more Tomcat deploy/restart/manual-testing cycle!
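In case it helps, here is a minimal sketch of what such a scope implementation might look like: a Spring Scope backed by a plain map, emulating a single "session" without a servlet container. (The actual MockSessionScope class may differ in its details.)

// A minimal sketch, not the actual Screensaver implementation.
package edu.harvard.med.screensaver.ui;

import java.util.HashMap;
import java.util.Map;

import org.springframework.beans.factory.ObjectFactory;
import org.springframework.beans.factory.config.Scope;

public class MockSessionScope implements Scope {

  // Backing store for the single emulated session
  private final Map<String,Object> beans = new HashMap<String,Object>();

  public Object get(String name, ObjectFactory objectFactory) {
    Object bean = beans.get(name);
    if (bean == null) {
      bean = objectFactory.getObject();
      beans.put(name, bean);
    }
    return bean;
  }

  public Object remove(String name) {
    return beans.remove(name);
  }

  public void registerDestructionCallback(String name, Runnable callback) {
    // No real session lifecycle, so destruction callbacks are ignored
  }

  public Object resolveContextualObject(String key) {
    return null; // required by the Scope interface as of Spring 3.0
  }

  public String getConversationId() {
    return "mock-session"; // one fixed, process-wide "session"
  }
}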

To make use of this we simply need to define a CustomScopeConfigurer in our testing spring context configuration that specifies the MockSessionScope for the "session" scope:

<bean
  class="org.springframework.beans.factory.config.CustomScopeConfigurer">
  <property name="scopes">
    <map>
      <entry key="session">
        <bean class="edu.harvard.med.screensaver.ui.MockSessionScope"/>
      </entry>
    </map>
  </property>
</bean>


The one thing that still is not exercised by this testing design is the rendering of the web pages, which can still be a source of problems.  Note that it is also necessary to ensure that the UI layer code being tested does not depend upon a FacesContext or an HttpSession.  For the former, see the various approaches suggested by others.  For the latter, one can use Spring's MockHttpSession, as necessary.  The above design thus has some drawbacks, but we have at least raised the level of our tests one step closer to automated UI testing, and without the pain.

Tuesday, May 18, 2010

Maintaining legacy data is hard

I smiled when I saw that even the folks at Flickr admit to having trouble maintaining their legacy data.

On my work project, the most difficult aspect of maintaining legacy data usually stems from evolving domain model constraints.  In particular, problems arise when you try to create explicit relationships among entities that previously were only inferred, or were not previously required.  The legacy data invariably seems to violate the new relationship constraints, forcing the model to allow for and handle missing relationships.

For those familiar with the Screensaver domain model, this occurred when migrating the PlatesUsed entity to AssayPlate.  PlatesUsed did not have an explicit relationship with Copy, but only maintained the Copy name as a text value.  When the PlatesUsed entities were converted into AssayPlate entities, some Copy name values were missing.  Furthermore, many AssayPlates were not defined at all, even though they had been screened.  This forced us to create AssayPlates for which the Copy is unknown.  In both cases, the domain model had to allow for and handle AssayPlates that did not have an associated Copy.  But since every AssayPlate absolutely needs to know its library plate number, we had to store that number redundantly in AssayPlate, even though it could otherwise be determined via AssayPlate.Copy.Plate.plateNumber.
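As a sketch of the resulting compromise (illustrative code only, not the actual Screensaver entities):

// Illustrative only: shows the shape of the compromise, not the real classes.
class Copy {
  private final String name;
  Copy(String name) { this.name = name; }
  String getName() { return name; }
}

public class AssayPlate {

  private final int plateNumber; // stored redundantly, since copy may be null
  private final Copy copy;       // nullable: legacy data may not identify the Copy

  public AssayPlate(int plateNumber, Copy copy) {
    this.plateNumber = plateNumber; // always required, even when copy is unknown
    this.copy = copy;
  }

  public int getPlateNumber() { return plateNumber; }
  public Copy getCopy() { return copy; }
  public boolean hasKnownCopy() { return copy != null; }
}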

Sunday, March 28, 2010

In case you're thinking of creating yet another web link sharing service...

TSS has the most comprehensive set of "share" links I've ever seen: