December 01, 2003
Theory of Compatibility (Part 1)
You are presenting the design of your software's Version 2 to management. This is when the compatibility saga typically begins.
Y: In closing, 'Whizbang' will break V1 compat.
Maintaining full compatibility with an old version is a hard problem that any successful project eventually gets stuck with. But if you plan for it, it can become much easier. This article is the first in a series that will show how.
The Limits of Intuition
As practicing software engineers, we have have some intuitive ideas about compatibility, such as:
As long as you only add new methods to a public class and never remove any old ones, you tend to be OK. However, experience also tells us that our intuition is often wrong. A simple example - suppose we had an interface instead of a class:
Here is version 2:
The problem with "just adding methods" to an interface is that it will break any old classes which implement CustomerContact but which do not implement the new sendSMSMessage method. For full compatibility in the presence of both callers and implementors, you simply cannot change an interface at all. Our rule of thumb that "adding is OK" doesn't apply to interfaces. Is there a better rule that we can use?
This series of articles will apply a little simple set theory to build a rigorous foundation for compatibility of representations. We work with representations (such as XML messages) rather than encapsulations (such Java interfaces), because compatibility of representations turns out to be far easier to understand and far more powerful.
What is version compatibility?
Version compatibility is when we can introduce a new version of either a producer or consumer of messages without changing the other.
Here are two terms that we will not formally define yet.
Backward compatibility: this is what we want when a new version of a message consumer is introduced without changing the message producers. Informally, the consumer may have new features but should still be able to support all the old ones.
Forward compatibility: this is what we want when a new version of a message producer is introduced without changing the message consumers. Informally, the producer may have new features but should not add them in a way that confuses any old consumers.
A Basic Idea That Doesn't Work
How can forward- and backward- compatibility be designed into a network of systems? Let us begin by a simple, naive analysis of what happens when there is not necessarily any known contract between producer and consumer.
In this case, the set of messages that can possibly be produced in the current version's producer is just some set P0 that may not be precisely known to the designers of the consumer, and the set of messages that can correctly be consumed in the current version's consumer (whether or not anybody actually produces those particular messages) is a set C0 that may not be precisely known to the designers of the producer.
The fact that the network functions correctly is the same as saying:
Notice those sets do not need to be the same. Certainly a real-world client may have the capability of handling a message that a real-world producer is never capable of producing. Correct networks are built like this all the time.
Now, suppose that the exact set P0 is unknown to the designer of the message consumer, other than the fact the current network functions correctly, i.e., that P0 is a subset of C0. Then if introducing a new version of the message consumer, the designer knows that the new consumer must be able to consume a set of messages C1 where
Assuming the consumer designer knows about C0 but nothing about the particular P0 on the network, the only way backward compatibility can be ensured, given the fact that all possible P0 are only known to be a subset of C0, is for the designer to guarantee that:
In other words, every message understood by the old version of the consumer must be understood by the new version of the consumer as well. This is intuitively what we would expect of a "backward-compatible" consumer, and the definition that Emmerich and Dui use. Only add, never remove.
This kind of "backward compatibility" does miss the subtle possibility that a new version of a consumer might have a requirement to add a feature (say, a security feature) where a new consumer needs to reject messages that might have previously been acceptable, especially if those kinds of messages were only theoretically accepted by the previous consumer and never produced by real-world producers in the past. However, at the level of detail we have gone into so far, this doesn't seem too bothersome. Handling "all messages that used to be consumable and possibly some new ones" does seem workable as a backward-compatibility policy.
Is there a similar rule for a "forward-compatible" producer? Yes, but unlike the rule for "backward compatibility" it turns out to not be what we would want at all. If a designer of the producer is introducing a new version P1 of a producer without knowing the exact set C0 other than the fact that P0 is a subset of C0, then P1 must satisfy:
That implies that the designer must ensure that:
In other words, a new message producer is not allowed to produce any new messages that were not already produced by the old version of the message producer!
But as a practical matter, new versions of systems need to be able to produce new kinds of messages. What is going on? Have we proved that forward-compatibility is impossible? No. The assumption that has doomed this line of reasoning is that there is no shared knowledge between the producer and the consumer. What we need is an agreement. A shared contract.
Introducing a Contract
What kind of contract is necessary to make forward compatibility possible?
Let us call the 0th version of a contract between a producer and a consumer "S0" (think of "S" as standing for "schema"). At a bare minimum, the contract S0 must define:
consume(S0) - messages required to be consumable
where produce(S0) ⊆ consume(S0)
The last clause is a self-consistency rule: it would not make any sense for a contract to specify that producers can produce any messages that it doesn't require consumers to consume. Later you will see that whenever we introduce proposals for produce(S) and consume(S), we need to be careful to make sure we have established self-consistency.
In set notation, the contract S0 requires that for any real-world producer that produces P0 and real-world consumer that is capable of consuming C0,
consume(S0) ⊆ C0
These two requirements of the contract work together with self-consistency of the contract to guarantee that
In other words, the two rules of the contract ensure that the network functions correctly.
It seems like the contract S0 has done nothing but introduce two additional sets between the "real" P0 and C0 on the network. But as we shall see, introducing the contract gives us a mechanism that can be used to achieve meaningful forward compatibility!
Defining Forward and Backward Compatibility of Contracts
Let us define forward-compatibility and backward-compatibility of contracts as follows:
Definition: a contract S1 is backward-compatible with an (older) contract S0 if
In other words, the new contract is backward-compatible if it requires new consumers to accept any message permitted to producers of the old contract. Notice that following this definition will ensure, in particular, that old P0 is a subset of the new C1. It ensures that a network containing an old producer and a new consumer works correctly.
Definition: a contract S1 is forward-compatible with an (older) contract S0 if
In other words, the new contract is forward-compatible if it requires new producers to produce messages which are in the set that old consumers were required to consume.
Definition: a contract S1 is fully-compatible with an (older) contract S0 if it is both forward- and backward- compatible with S0. In other words, both:
produce(S1) ⊆ consume(S0)
In addition to these definitions, do not forget that we always have the self-consistency rule for a contract, which says that contracts must satisfy
produce(S1) ⊆ consume(S1)
Combining these four rules together gives us the perfectly equivalent statement that full-compatibility is the same as:
In words, contracts S0 and S1 are fully compatible if both kinds of consumers are capable of consuming messages produced by either kind of producer. The symmetry of the rule shows that our definition of full compatibility between two schemas doesn't make a distinction between an old schema and a new schema. If they are fully compatible, it means they can coexist regardless of which contract is newer.
It also suggests that all these definitions have not gotten us very far yet. However, the idea of a contract as a rule that specifies produce(S) and consume(S) is a very powerful abstraction as we are about to see.
Our Simplification Avoids Semantics
It is important to observe that while the contracts we are describing extend promises about messaging restrictions and capabilities, nothing in our simplified contracts requires any particular behavior for producers or consumers other than "correctness."
For example, there is nothing in our simple concept of a contract that requires producers to ever produce any specific message. We just say, if a producer is inspired to produce a message, it must limit itself to produce one that is within the set produce(S).
Similarly, there is nothing in our simple concept of a contract that requires consumers to ever do anything specific when receiving a message. We just require that consumers be able to receive, without crashing, any message within the set consume(S). A consumer may very well choose to ignore parts of the received message (or the whole thing). As long as there is no crash, that is perfectly fine.
Is this lack of semantic modeling a fatal weakness for our simple model? There are two reasons is is OK:
Our simple contract model describes a "representational" strategy for interoperability. Although this may apply less well to encapsulation problems such as compatibility of Java interfaces, this model works well in the world of XML, XML Schema, and web services.
An Extension of a Contract
So our definitions of contracts and compatibility are potentially useful. Yet the definitions so far leave something to be desired. In particular:
The needed concept is something we will call here an "extension" of a contract. As soon as we provide formal definitions for different kinds of extensions, we will be able to easily see that a extended contract provides excellent compatibility properties.
Definition. A producer extension of a contract is a new contract that does not prohibit producers from producing any messages they used to produce, i.e., S1 is a producer extension of S0 if
Notice that a producer extension is a relaxation of the old contract for producers that allows producers more freedom to produce more messages than allowed under the old contract. An extension is not a requirement that producers must be capable of sending new messages; none of our contracts require a producer to send any particular message - they just give permission for producers to send a message if they wish. So a producer extension gives more permission than before. In particular, any producer that used to implement the old contract is also a valid implementation of a producer extension contract.
Definition. A consumer extension of a contract is a new contract that does not require consumers to consume any additional messages beyond those they used to be required to consume, i.e., S1 is a consumer extension of S0 if
This definition is similarly easy for implementors. A consumer extension relaxes requirements on consumers by allowing them to reject messages that used to be in the "must-be-acceptable" set. It does not require consumers to reject new messages, nor does it require consumers to consume new messages; it just allows consumers to reject new messages if they wish to. Again, any consumer that used to implement an old contract will be a valid implemenation of a consumer extension contract.
Definition. A complete extension of a contract is a new contract that is both a producer and consumer extension, i.e., S1 is a complete extension of S0 if both
consume(S1) ⊆ consume(S0)
Since in practice, new versions of contracts typically define new rules for both producers and consumers, we are typically interested in complete extensions that extend both sides of the contract.
From here on out, we will abbreviate produce(S) by writing p(S) and consume(S) by writing c(S).
The Theorem of Compatible Extensions
When is an extension compatible with an original contract? Remarkably, the answer is "always" - producer extensions are always backward compatible, consumer extensions are always forward compatible, and complete extensions are always fully compatible. This is why contracts are so powerful.
Theorem of compatible extensions.
Proof of 1. The definition of a producer extension gives us
But then the self-consistency of contract S1 ensures that
Therefore we have
and in particular
which is to say, S1 is backward compatible with S0.
Proof of 2. Similarly, a consumer extension gives us
In other words, S1 is forward compatible with S0.
Proof of 3. Full compatibility of complete extensions simply follows from the fact that a complete extension provides both forward- and backward- compatibility.
Significance of the Theorem of Compatible Extensions
Perhaps counterintuitively, as long as a new contract permits producers to produce at least as many messages as before and requires consumers to accept no more messages than before, the new contract is fully compatible with the old contract. Simply by "relaxing" the old contract for both producers and consumers we have ensured full compatibility with it.
The theorem of compatible extensions tells us that, if we have a good way of extending a contract, full compatibility is hardly any work at all!
The author thinks this result, though very simple, is remarkable. Defining an extended contract as we have defined it should be a very easy exercise for real-world engineers - they just need to relax an existing contract for both producers and consumers. It is much easier than, for example, trying to run exhaustive tests against an unknown set of real-world producers or consumers. This incredible simplification is the core reason that programming to a contract is so powerful.
The subtlety is that for this theorem to work, all the contracts need to be self-consistent, i.e., they must specify a c(S) that is always a superset of p(S). We need, in other words, a contract language that makes it easy to specify extensions and provide consistent p(S) and c(S). Fortunately for us in the XML world, W3C XML Schema fits the bill, but we will need to lay down some more concepts to more easily discuss exactly how.
produce(s) versus consume(s)
What is the real meaning of the sets p(S) and c(S)?
First of all, we must acknowledge that it may appear unusual to think of a contract as defining two sets p(S) and c(S) rather than just one. Conventional computer science training leads us to think of a contract as a language that defines exactly one "set of valid messages". For example a regular expression or a BNF grammar defines one "set of valid strings". Why isn't a contract just seen as defining a valid set?
In fact, the set p(S) does correspond to the "set of valid messages". As the set all messages that the contract S allows producers to produce, p(S) delineates the the set of messages that are allowed to actually flow on the network. If S represents the contract for the whole network, we should never find any message outside of p(S) on the wire. If we did, it would indicate some producer was not conforming to the contract S - a good way of finding bugs!
In contrast, if S represents a uniform contract for the whole network, there is no physical place where we would ever be interested in finding messages in c(S). That is, the set c(S) is more of an abstract constraint on the capabilities of consumers rather than a physically interesting set of messages. Can we make do without the concept of c(S) in our contracts?
What the theorem of compatible extensions tells us is that c(S) becomes useful when more than one version of the contract coexists on the same network. The reason for the set c(S) is to "open up a gap" within which we can design new compatible contracts.
Contracts Must Open a Gap for Compatibility
Here is an explanatory picture. Suppose we had no c(S), and that contracts only defined a valid set p(S) which defined the valid vocabulary constraining both producers and consumers. Then a correctly engineered producer (producing messages P0) and consumer (capable of consuming messages C0) reads as:
In this picture, there is no room to "change" the valid set to any new p(S1) != p(S0), because if one did so, there would be some possible P0 or C0 that would not be forward or backward compatible implementations of the new contract. A single newly valid message m would break old consumers that were promised by S0 that they did not need to understand m. A single newly invalid message m would invalidate old producers that were promised by S0 that they were safe in producing m.
However, if a contract defines the set c(S) in addition to the set p(S), the picture changes so that we have
Now, as long as c(S0) != p(S0), there is a gap within which we can evolve the contract in the future.
What the theorem of compatible extensions tells us is that as long as we can find a new contract S1 that "wedges between" the p(S0) and c(S0) of the old contract, we can maintain full compatibility.
The path from P0 to c(S1) and from p(S1) to C0 is what ensures forward- and backward- compatibility respectively.
This process can be continued in subsequent versions S1, S2, S3, etc, as long as there is a gap between p(S) and c(S):
By wedging new contracts between the p(S) and c(S) of old contracts, we can maintain compatibility with the old contracts and therefore with all old consumers and producers.
Understanding the Problem with Java Interfaces
Consider the simplest possible Java interfaces, which have only methods that have "void" return values. Then an interface S defines:
c(S) = the set of methods that implementors must handle in S
In Java, the problem with interfaces is that
In other words, implementors of an interface are required to provide implementations only for those methods that callers are allowed to call in the current version of the contract, and no more.
This form of "tight coupling" between callers and implementors leaves no room for introducing a new version of an interface that is fully compatible with an old one. Once there are both legacy implementations and callers constraining compatibility, a Java interface cannot be changed at all.
One solution in Java is to use an abstract base class rather than an interface. By providing a mechanism to provide "default implementations" of new methods, base classes allow the set c(S) to be automatically expanded for new versions of contracts without changing subclasses. In effect, base classes give Java a way of saying, once you have written a subclass, your class is able to receive all future method calls that might possibly be defined in future contracts, and the behavior will be handled for you by the new definition of the contract. You can think of abstract base classes as a way of letting c(S1) of the future be larger than p(S0) today.
Of course, the practical problem with the Java visioning mechanisms is that they require new code for the base class to be deployed. A larger "c(S1) of the future" does you no good if you are stuck with deployments of your old code that are not going to change in the future. What we are working towards is a stronger compatibility framework that allows us make literally no changes to an old producer or consumer of messages, yet still achieve compatibility. This leads us to a more rigorous requirement: for our consumers, our set c(S) must not just "act as if" it is larger than p(S). It must be known and well-defined to be a specific set of messages that is larger than p(S) for forward compatibility to be possible.
Standards such as W3C XML Schema have provided powerful idioms that can be used to define valid sets of messages p(S) for a contract. However, little formal work has been done defining c(S). Yet we have seen that c(S) is needed to define a robust, versionable contract. How can we define c(S) in a useful way?
In the next article in this series, we will discuss one of the common techniques for defining c(S), and we will pick apart the compatibility scheme behind one of the most frequently-used protocols in the industry: HTTP headers.Posted by David at December 1, 2003 08:01 AM
|Copyright 2003 © David Bau. All Rights Reserved.|