November 21, 2003
No Theory of Compatibility?
Your software should be compatible with its previous versions.
Was that a cold chill that just ran down your spine?
The words "compatibility with old versions" strike fear into the hearts of programmers. The story usually begins like this. After months of hard work, a team of programmers builds the next greatest gadget, complete with the latest gizmos and conformant to the newest buzzwords. And then management blesses the shiny contraption as a successful human achievement, as close to perfection as could be possible. And they hurry it - just in the nick of time - into the hands of carefully groomed customers in exchange for a bite of their wallets. Throw a big party, break out the champagne, hurrah.
The Morning After
The next morning, the team gets together in a room and asks, "What next?" Inevitably, a bright young engineer will say, "We really did 'waka' wrong in Version 1. I've been thinking, let's change waka and do it right in Version 2." And then a senior engineer will interject, "But touching waka would break compatibility with our existing installations of Version 1 - we can't just change waka like that." What to do? On one hand, you cannot stand in the way of progress! Waka must go! On the other hand, you can't strand your paying customers - they're the ones with the money after all, and by the way, what a great party that was.
The problem is, it can be devilishly hard to build a new system to be fully compatible with an old system unless the old system has been very rigorously defined with forward-compatibility in mind. And, unfortunately, programmers don't typically seem to think rigorously about forward-compatibility in the same way that they can think rigorously about program correctness, performance, or other aspects of programming. Forward-compatibility is usually not built in. Why?
A Call for a Theory Of Compatible Versions
I poked around a bit, and I found that, despite some prior literature on "theories of versioning" and "theories of compatibility," there is not, as far as I have been able to find, a theory that provides the necessary tools for rigorously describing real-world strategies for achieving compatibility with future versions of a system.
Programming for compatibility is one of the great problems facing real software engineers. Is it really that hard? I imagine that before the theory of regular expressions, grammars, and so on, the problem of designing automated parsers seemed devilishly hard. Today, with yacc, lex, Perl regexes and so on, text parsing is easy and routine. Perhaps the reason version compatibility is perceived as a frightening problem is that nobody has had the patience to sit down and formally analyze how easy it can be!
Here are a couple of references I found that seem to come close to addressing a theory of compatible versions, but neither hits the mark in my opinion:
- Dui and Emmerich do an admirable job of describing XML backward compatibility, but they do not address the problem of forward compatibility.
- Conradi and Westfechtel provide an excellent survey of methodologies for managing many versions of software artifacts, but do not address a framework for achieving compatibility between versions.
My question, dear readers, is this: Are you aware of any clear and formal analysis that answers the question, "how does one achieve compatibility between Version 1 and Version 2?"
If not, in this space, we can write down the theory.
Posted by David at November 21, 2003 12:18 PM
Excellent article! I am currently working on an app that heavily uses XML schemas. Every new version of the app necessitates changes to the schema to reflect new desired features.
Schemas are great as they enforce type correctness, but this static property works against ease of changes. Classical tradeoff between static type safety and dynamic runtime binding.
New elements/attributes can be added without breaking the schema only if they are made optional. This is conceptually correct if these features should truly be optional - if they should be mandatory, making them so in the schema will break current XML instances. Clients would have to run a conversion tool to update their XML instance files. This can truly become a deployment nightmare - much easier to leave features optional.
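A minimal sketch of that tradeoff, in Python. It does not use real XML Schema validation; the "schemas" here are just hypothetical sets of required and optional child element names, and `customer`, `name`, and `email` are made-up names for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for schema versions: which child elements
# are required and which are optional. Not real XML Schema.
V1 = {"required": {"name"}, "optional": set()}
V2_OPTIONAL = {"required": {"name"}, "optional": {"email"}}
V2_MANDATORY = {"required": {"name", "email"}, "optional": set()}

def is_valid(xml_text, schema):
    """Return True if every required child is present and no
    child falls outside the schema's known element names."""
    root = ET.fromstring(xml_text)
    present = {child.tag for child in root}
    allowed = schema["required"] | schema["optional"]
    return schema["required"] <= present and present <= allowed

# An instance written against Version 1, and one written against Version 2.
OLD_INSTANCE = "<customer><name>Ada</name></customer>"
NEW_INSTANCE = (
    "<customer><name>Ada</name><email>ada@example.com</email></customer>"
)

if __name__ == "__main__":
    print(is_valid(OLD_INSTANCE, V2_OPTIONAL))   # old file, new element optional
    print(is_valid(OLD_INSTANCE, V2_MANDATORY))  # old file, new element mandatory
    print(is_valid(NEW_INSTANCE, V1))            # new file, old strict schema
```

Under these assumptions, the old instance passes the Version 2 schema only when the new element is optional. The last check shows the other direction: a strict Version 1 schema rejects the new instance, which is the forward-compatibility side of the problem.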
Deleting a feature will definitely break current client apps. Therefore, if a feature is no longer needed, I chose to simply ignore it. Of course, as new versions are released, and more features deprecated, a mess results. No silver bullets sans forced conversion updates which will make clients unhappy. Classical dilemma between keeping external client contract happy vs. internal code/programmer coherence and happiness.
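The "simply ignore it" approach might look roughly like the following Python sketch. The element names (`name`, `email`, `fax`) and the `read_customer` function are hypothetical, chosen only to illustrate the idea; a real application would decide for itself how to treat elements it does not recognize.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names for illustration only.
KNOWN = {"name", "email"}   # features the current version still uses
DEPRECATED = {"fax"}        # features still accepted but no longer used

def read_customer(xml_text):
    """Parse a customer record, skipping deprecated elements
    instead of rejecting the whole document."""
    root = ET.fromstring(xml_text)
    record, ignored = {}, []
    for child in root:
        if child.tag in KNOWN:
            record[child.tag] = child.text
        elif child.tag in DEPRECATED:
            # Old clients may still send this; accept and drop it.
            ignored.append(child.tag)
        else:
            raise ValueError(f"unknown element: {child.tag}")
    return record, ignored

if __name__ == "__main__":
    old_doc = "<customer><name>Ada</name><fax>555-0100</fax></customer>"
    print(read_customer(old_doc))
```

The cost is visible in the code itself: the `DEPRECATED` set only ever grows, which is the mess described above.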
Versioning and identity have been around for ages - there's nothing really new here. Just look at all the deprecated legacy inconsistent methods and classes in Java - Sun could never remove them without breaking lots of code. Witness current Java jar version problems with EJB classloaders and well-known DLL hell. Ages ago IBM's SOM had a policy of allowing new features to be added, but disallowing deleting features. COM addressed the problem with interfaces being accessed by GUIDs.
See a recent discussion of this theme "Contracts and Interoperability" at Artima by Anders Hejlsberg at: http://www.artima.com/intv/interop.html. Quote:
"Versioning is all about relaxing the rules in the right way and introducing leeway. The absolute answer, the only way guaranteed to not break anything, is to change nothing. It is therefore important to support side-by-side execution of multiple versions of the same functionality, on the same box or even in the same process."
I agree with David that a reasoned exposition of this theme is in order and I look forward to reading the Dui/Emmerich paper. But I don't expect any final resolution to an inherently unsolvable problem.
Andre, thanks for the link to the Hejlsberg article.
I think we can do better here! XML Schema versioning is not yet a well-understood problem, but Schema is a good standard and it doesn't require rocket science to figure out how versioning should work.
There is a very specific observation that makes it all possible, even easy. Watch this space for future articles on the topic.
When Hejlsberg says "The absolute answer, the only way guaranteed to not break anything, is to change nothing", he is being overly pessimistic.
Andre is right that, if there is no contract outside the actual operation of the network, nothing can be changed.
However, if parties on a network agree to conform to a shared contract, it is possible to introduce changes (new versions of the contract) in a way that is guaranteed not to break anything.
I am beginning a series of articles on the topic on this weblog. The first is at:
I think David's article demonstrates that a lack of forward planning for compatibility hurts software, but one must have a model to go by in order to plan. As David points out, we have only come close on that front. I think it can be done. I enjoyed this well-written article.