Thursday, August 26, 2010

The act of committing

The act of committing a new revision of one's code to a source control repository seems like a rather coarse operation.  When a developer makes a commit, he is in a way declaring that his code is "ready" for others to view or use.  If he commits to a feature branch or experimental branch, he is declaring that his code is ready at some initial level, but perhaps not ready for integration with a project's trunk or a team-wide release branch.  When a project trunk or release branch is declared ready for use, it is often "committed" to a new branch or tag, declaring its status as high-quality, tested code.

At the other end of the version control spectrum, developers often have access to a "local revision history" (e.g., as provided by Eclipse), where every save of a source file is tracked locally, can be compared to other local revisions, and can be reverted to an earlier revision as required.

With distributed version control systems, we have an intermediate level of committing, whereby commits are made to a local, privately-managed repository that can be optionally shared with others.  Such version control systems seem to encourage more frequent committing, as a means of safely recording one's development progress, without necessarily making one's efforts immediately public.

So I feel like we have discrete levels of versioning functionality, which suggests the possibility of an alternate versioning model based on a more continuous spectrum of commit actions.

What if every change made to source code is tracked by the developer's versioning system?  And then, what if, at explicit times, a developer can choose to declare his code as being at a particular level of quality?  For example, while developing a new feature, the developer might be able to declare his code as "compiles", and later, "passes tests", and later still "functional", then "beta", "Q/A tested", and finally "released" (perhaps set by a release engineer).  The idea here is that we don't force the developer into a single commit/no-commit decision point.  Instead, the developer can communicate the level of readiness of his code as it evolves.  Each level of readiness can be configured to be visible to some audiences and hidden from others.  So, for example, if the developer is working closely on a feature with one other developer, the two can see each other's changes at any point, or may choose to see only the changes that are "compilable".  Different subteams on the same project may only want to be exposed to code that "passes tests", while developers on other projects that depend upon the code in this project may only want "Q/A tested" or "released" versions.
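To make the idea a little more concrete, here is a minimal sketch in Python of how such a model might look.  It is only an illustration under my own assumptions, not a real versioning system: the names `Readiness`, `Revision`, `History`, `record`, `declare`, and `latest_visible` are all hypothetical, the level names are taken from the examples above, and I am assuming that a revision declared at a higher level also satisfies any lower level.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, List, Optional


class Readiness(IntEnum):
    """Ordered readiness levels (hypothetical; names follow the post's examples)."""
    UNMARKED = 0       # every tracked change starts here
    COMPILES = 1
    PASSES_TESTS = 2
    FUNCTIONAL = 3
    BETA = 4
    QA_TESTED = 5
    RELEASED = 6


@dataclass
class Revision:
    number: int
    content: str
    level: Readiness = Readiness.UNMARKED


class History:
    """Hypothetical in-memory history that tracks every change."""

    def __init__(self) -> None:
        self.revisions: List[Revision] = []

    def record(self, content: str) -> Revision:
        """Track a change automatically; no commit decision is required."""
        rev = Revision(number=len(self.revisions) + 1, content=content)
        self.revisions.append(rev)
        return rev

    def declare(self, number: int, level: Readiness) -> None:
        """Explicitly declare the readiness of an existing revision."""
        self.revisions[number - 1].level = level

    def latest_visible(self, minimum: Readiness) -> Optional[Revision]:
        """Newest revision at or above the audience's minimum level, if any."""
        for rev in reversed(self.revisions):
            if rev.level >= minimum:
                return rev
        return None


history = History()
history.record("draft of feature")            # revision 1, unmarked
history.record("feature builds")              # revision 2
history.declare(2, Readiness.COMPILES)
history.record("feature with tests")          # revision 3
history.declare(3, Readiness.PASSES_TESTS)
history.record("half-finished refactoring")   # revision 4, unmarked

# Each audience picks the minimum readiness it is willing to see (assumed setup).
audiences: Dict[str, Readiness] = {
    "pair partner": Readiness.UNMARKED,
    "subteam": Readiness.PASSES_TESTS,
    "dependent project": Readiness.QA_TESTED,
}

for name, minimum in audiences.items():
    rev = history.latest_visible(minimum)
    seen = f"revision {rev.number}" if rev else "nothing yet"
    print(f"{name} sees {seen}")
```

In this sketch the pair partner sees the newest change (revision 4), the subteam stays on revision 3, and the dependent project sees nothing until some revision is declared "Q/A tested" or better.  A real system would also have to decide who is allowed to declare each level, which this sketch ignores.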

Such a versioning model would allow maximal sharing of code to appropriate audiences, while ensuring that the desired level of code quality/readiness is made available.  A developer wouldn't have to "call over the cubicle wall" to ask his fellow developers if they have committed a particular change yet.  A developer could choose to see even the most ill-prepared code and keep tabs on it as it progresses, or even jump in and help out with edits on code that has just been entered by a teammate.  This has interesting implications for (non-co-located) team members who want to review each other's work or even engage in remote pair-programming activities.

But the fundamental idea here is to increase the granularity of "commits", such that developers do not keep their code inaccessible to others for too long, while at the same time low-quality code is not exposed to users who would be harmed by it.  This seems like a safer alternative to committing infrequently, as developers no longer need to worry about publicizing work that may not be ready for prime time.  Such a version control system would become a combination of "local revision history", "distributed version control", and "public commits" (to a central server).  It would encourage developers to be more explicit about publicly declaring the state of a particular revision of their code.  It would avoid the "ready" versus "not ready" binary proposition that is forced upon us when using the commit model of current version control systems.  And it would maximize visibility and sharing of code that is under active development.
