davidbau.com Theory of Compatibility (Part 2)

Theory of Compatibility (Part 2)

When the U.S. Bureau of Engraving lanched the new $20 bill (the new rainbow-colored bill, announced with a bit too much fanfare), they had two main messages for the media. The first was that the new colored bills can be used wherever you used the old bills - they are backward compatibile. The second message was you don't need to go to the bank to turn in your old bills, because they are still sound - they are forward compatible.

"Both the new notes and the older-design notes will continue to be legal currency at full face value. The U.S. has never devalued its currency and will not do so now." - U.S. Bureau of Engraving

Why doesn't the U.S. just pass a law requiring everybody to turn in the old greenback to switch to the hard-to-counterfeit rainbow money? Because the promise of both backward and forward compatibility of currency is one of the bedrock foundations of a modern economy. The committment of permanent compatibility is itself a source of great value.

The Value and Challenge of Forward Compatibility

The same is true for software. Both forward and backward compatibility are promises that carry a lot of value. The whole Microsoft software franchise is built on a promise of backward compatibility - Windows XP can still run old DOS programs and print to old DOS printers, and any future versions of Windows can be expected to continue to do so. Microsoft has long recognized that the value of the compatibility promise outweighs other disadvantages such as security holes.

But in software, forward compatibility is a more difficult promise to make than backward compatibility. To build forward compatibility into a system, you need to make sure it will work with future versions.

How on earth can you know what the future will bring? There is no way 1981 MS-DOS could be designed to run 2003 Windows XP programs.

It's tough to make predictions, especially about the future. - Yogi Berra

In the previous article in this series, we showed that it is actually possible to build a solid framework for full compatibility of representations like XML. This is a remarkable result. Of course you can't expect an old system to start behaving in totally new ways when it starts getting new kinds of messages. But if the protocol is designed for forward-compatibility, you can send a new message to an old system, and the old system will continue to work. While the last article described in broad theoretical terms how this can be possible, it didn't propose any concrete designs.

How do you actually build full compatibility in practice? It has been done. The trick is, instead of predicting the future, you must make careful committments today that you will be able to keep tomorrow. One protocol that does this is HTTP, which includes forward-compatibility committments about extensible HTTP headers.

HTTP headers

Like many other internet standards, HTTP headers are based on the old text mail headers in RFC 822. (HTTP itself is defined in RFC 1945 (HTTP/1.0) and RFC 2616 (HTTP/1.1).) Here are the headers at the beginning of an example HTTP response message:

Date: Thu, 15 Aug 1996 12:30:18 GMT Server: Apache/1.1.1 Content-type: text/html Content-length: 2488 Last-modified: Fri, 09 Aug 1996 01:13:39 GMT

The headers are parsed the way you would expect. Each "field" has a name preceding the first colon on the line, so "Date" is the name of the first field here.

The HTTP specifications provide special-case handling for a small fixed set of "general" headers (such as Date) and special "request" and "response" headers (such as Server). Then any remaining headers are defined as "entity" headers and a core set of well-known entity headers is defined. In HTTP/1.0, the well-known entity headers are:

Swk = {Allow, Content-Encoding, Content-Length, Content-Type, Expires, Last-Modified}

The interesting thing about the set of HTTP entity headers is that it is extensible - users of the HTTP spec are free to design more specialized implementations of HTTP that introduce new HTTP headers. And even more interesting, HTTP allows extensibility in a forward-compatible way. A recipient who receives a message with some new messages must continue to work.

The Two Compatibility Rules

How does HTTP guarantee forward compatibility? There are two key ingredients:

The must-ignore rule. "Unrecognized header fields should be ignored by the recipient and forwarded by proxies."
The recognizability-cannot-be-assumed rule. "The extension-header mechanism allows additional Entity-Header fields to be defined without changing the protocol, but these fields cannot be assumed to be recognizable by the recipient."

The general idea is that the first rule promises that recipients don't fool with future headers that they do not understand. They are required to ignore them. This requirement "reserves space" for future headers to safely use without fear of rejection or interference.

The second rule requires that new contracts do not define headers in a way that requires recipients to understand them. This requirement is not a restriction on direct implementors of the basic contract, but a promise extended about future extension headers that are defined. Headers must be defined in a way such that it is safe to ignore them.

Formally Analyzing HTTP Compatibility

There is a formal correctness that underlies the two intuitive rules for HTTP forward compatibility. To see this, we return to our simplified theory of compatibility, using set theory terminology introduced in the last article.

Let us say an HTTP header specification is defined by both a name and a rule for which values are allowed to appear in the given header. For example, the specification of the "Date" header might permit "Thu, 15 Aug 1996 12:30:18 GMT" and yet not permit the string "yesterday" as a value.

Let us denote an HTTP contract by a set S of "understood" header specifications. Swk is the built-in contract defining the well-known entity headers, but users are free to define their own specializations S1, as long as

Swk ⊆ S1

If you are a message producer and you "understand" the set of headers S, then the set of messages you can possibly be allowed to produce (you need not produce all of them of course) is a set p(S) of messages m where:

The headers in the message m are all in the recognized set S.
The header values are also valid according to the header rules in S.

What about extensible headers? Can't we imagine message producers who should be allowed to produce extensible headers that are not in the recognized set, maybe by forwarding them from somewhere? No; in our simplified model, a message producer is the first originator of the message, not a forwarding proxy. It is never valid for an originator to use a header that they do not understand.

Must-Ignore is About Projection

The must-ignore rule is a constraint on message consumers who recognize a set of headers S. It says they must effectively apply the following projection function on every message before processing it:

proj(S,m) = message m with headers outside S removed

This means that recipients must accept a set of messages c(S) which is defined as follows:

[1] c(S) consists of all messages m such that proj(S,m) in p(S)

In other words, the must-ignore rule tells us that recipients must actually be able to consume a lot of messages: recipients must consume any message which, once the headers are removed, appears to be a valid message.

Thus the must-ignore rule is really a way of saying that c(S) is a preimage of p(S) under a projection function.

Our projection function has some interesting properties that will come in handy. For example, it is commutative for different contracts S. In other words, if you ignore unrecognized headers for both S0 and S1, it doesn't matter what order you do it:

[2] proj(S0,proj(S1,m)) = proj(S1,proj(S0,m))

Also, if you have a new S1 which supersets an old S0, the new projection ignores less information.

[3] if S0 ⊆ S1, proj(S1,proj(S0,m)) = proj(S0,m)

That is to say, no headers are removed by the new projection that wouldn't have already been removed by the old projection.

Consistency of Projection

Our projection has another important property, which is that it does not change messages whose headers are drawn from the set S:

if headers in m are all in S, then proj(S,m) = m

The reason this property is particularly interesting is that it guarantees that our contract is consistent. Recall that a contract is consistent if it provides for a c(S) that satisfies:

p(S) ⊆ c(S)

Since we know that any message in our set of valid messages p(S) contains headers drawn only from the set S, we know that

[4] for m in p(S), proj(S,m) = m

This fact is a way of saying that p(S) is contained in its own preimage under projection, so we know that p(S) is contained in c(S).

Since [4] is a basic property of the projection that ensures consistency of the contract, we call the property [4] consistency of the projection function.

Recognizability-Cannot-Be-Assumed Restricts Future Contracts

What is the recognizability-cannot-be-assumed rule? It is not a constraint on producers or consumers of a single contract, but a constraint on the form of future contracts. Here is the precise text from the HTTP specification:

The extension-header mechanism allows additional Entity-Header fields to be defined without changing the protocol, but these fields cannot be assumed to be recognizable by the recipient.

When a new contract S1 specializes an old contract S0, the rule requires that any new valid messages be valid according to the old contract even if the new headers are ignored.

In set theoretic terms, this means:

[5] if S1 specializes S0, m in p(S1) implies that proj(S0,m) in p(S0)

As a concrete example, it would be illegal to introduce a header called "Date-can-be-relative" which, if present, would permit the "Date" header to contains values such as "yesterday". The reason is that using this kind of header would assume recognizability. You cannot ignore the Date-can-be-relative header and hope to understand a Date header in the same message. In contrast, a "Relative-date: tomorrow" header which was defined to be in addition to, and not modify, the meaning of the original Date, would be perfectly fine.

Proving Forward Compatibility of HTTP

What do these facts together give us? They give us forward-compatibility with HTTP extension headers.

Proof of the forward-compatibility of HTTP.

Suppose we have an old set of header definitions S0 and a new set of header definitions S1 which is a superset of S0. Then take any message m required to be consumable in the new contract c(S1).

m in c(S1)

The must-ignore rule [1] tells us that projecting the unrecognized headers out of the message results in a valid message that is in p(S1).

proj(S1,m) is in p(S1)

Then the recognizability-cannot-be-assumed rule [5] tells us that all messages in p(S1) can be projected, using proj(S0,m), into p(S0):

proj(S0,proj(S1,m)) in p(S0)

We can apply commutativity [2] of our projection:

proj(S1,proj(S0,m)) = proj(S0,proj(S1,m)) in p(S0)

As well as the fact that a superset projection ignores less information [3]:

proj(S0,m) = proj(S1,proj(S0,m)) in p(S0)

And finally we apply the must-ignore rule again to conclude that the message used to be consumable in the old contract.

m in c(S0)

In other words, every message that is required to be consumable in the new S1 was required to be consumable in the old S0:

c(S1) ⊆ c(S0)

In the the last article, we gave a name to this property: we said that S1 is said to be a consumer extension of S0, and we proved that if the contracts are consistent (as we know for HTTP from [4]), then all consumer extensions are forward-compatible.

(QED)

So the the must-ignore and recognizability-cannot-be-assumed rules of HTTP work together to strictly guarantee forward compatibility, by guaranteeing that all messages that might be required to be consumable by future extensions are already required to be consumable in the basic HTTP contract.

Defining The Goal: Versioned Contract Languages

We will want to apply the same technique to XML. So let's generalize our formal framework.

Defintion. Define a "versioned contract language" as a language L of contracts, describing subsets of a universe of messages M along with

a permitted message set function "p" (L->2^M)
a projection function "proj" (LxM->M)
and a specialization relation "spec" (LxL->{true,false})

Which satisfy the following assertions whenever the specialization relation spec(S0,S1) holds (i.e., whenever S1 is said to specialize S0):

proj(S0,p(S1)) ⊆ p(S0) and
for all m, proj(S0,proj(S1,m)) = proj(S0,m)
p(S0) ⊆ p(S1)

And where a contract S in L requires producers to produce only messages within the set p(S), and consumers to consume all messages m such that proj(S,m) in p(S) consistenly, i.e.,

c(S) = {m such that proj(S,m) in p(S)}
for all m in p(S), proj(S,m) = m

Notice that the HTTP rules for extensibility headers for a "versioned contract languages" for sets of extensible header definitions.

Theorem of Compatible Specialization

Let's prove that a versioned contract language gives us a way to establish full compatibility between two contracts.

Theorem of Compatible Specialization.

Suppose we have a versioned contract language. Then whenever contract S1 specializes contract S0 (i.e., spec(S0,S1) holds), then S1 is fully compatible with S0.

Proof:

Forward compatibility follows from a proof essentially identical to the proof used to establish forward-compatibility of HTTP.

First, observe that the contracts in a versioned contract language are consistent. This follows from the fact that proj(S,p(S)) ⊆ p(S). So to establish full-compatibility between two contracts, we just need to establish that the new contract is both a consumer extension and a producer extension of the old one.

Suppose we have a new contract S1 that specializes an old contract S0. Let us establish that it is a consumer extension first.

Take any message m required to be consumable (in other words, m in c(S1)) in the new contract S1. The must-ignore definition of consumable messages in a versioned contract language tells us that:

proj(S1,m) is in p(S1)

Then the recognizability-cannot-be-assumed fact tells us that all messages in p(S1) can be projected, using proj(S0,m), into p(S0):

proj(S0,proj(S1,m)) in p(S0)

Finally, we apply the composition properties [2] of our projection to establish that proj(S0,m) is in p(S0):

proj(S0,m) = proj(S1,proj(S0,m)) in p(S0)

So therefore by definition of consumable messages, m must also be consumable in S0, i.e., m in c(S0). This establishes that S1 is a consumer extension of S0, and together with the consistency of both S1 and S0, this establishes that S1 is forward-compatible with S0.

Backward compatibility follows directly from the fact [3] that

p(S0) ⊆ p(S1)

And with both forward- and backward- compatibility, we have full compatibility between S1 and S0.

(QED)

The Power of Versioned Contract Languages

Why do we care about all this?

Because a versioned contract language helps us build a permanent promise about the full future compatibility of a contract. By doing all the hard work once in establishing the necessary conditions for future compatibility, we can extend promises of full compatibility by following simple rules.

A versioned contract language provides us with

A validity function that producers can follow to ensure they produce only "valid' messages for a contract.
A message projection function our consumers can use to "ignore" unrecognized information for a contract.
A specialization relation our future contract designers can use to test new contracts for compatibility.

If we follow the rules given by these three functions, we will be able to keep our compatibility promises.

It would be as if the Federal Reserve had a magic eight-ball that could guarantee that "if monetary policy follows these rules, then you will never need to devalue your currency". Versioned contract languages give us the power of a permanent federal-style guarantee of the future compatibility of a messaging contract, but they do it on the cheap, without a trillion-dollar central banking system (or central information architecture clearinghouse), and without the cost of keeping an Alan Greenspan (or his matching programmer genius) on the payroll to analyze every policy decision by hand.

And what we have shown in this article is that, for messaging contracts, we can do it. All we need to do is to make sure we do all our work within a mathematically proven versioned contract language. And then anybody who follows its validitity, projection, and specialization functions when designing contracts, consumers, and producers of messages will be guaranteed full compatibility.

The only thing is, we need is a good versioned contract language for XML. Fortunately, this can be done, and even more fortunately, it can be done within the scope of the W3C XML Schema specification. This is the topic we will begin to cover in the next article in the series.

Posted by David at December 8, 2003 10:22 AM

Interestingly most European currencies (Franc, Lira, etc) were taken out of circulation completely one day and replaced with the Euro. I was a little surprised they didnt do this with the new dollar because one of the key reason is to combat counterfeit money: if the old still works, the old counterfeit still works.

Posted by: Benjamin Renaud at December 8, 2003 08:41 PM

Yes, the huge effort taken in the euro transition and the refusal to eliminate the green dollar even in the face of counterfeiting are very powerful testaments to the extremely high value of compatibility!

Yet Europe broke compatibility when introducing the Euro. How did they justify the massive cost? Because it was offset by the benefits of introducing a different form of compatibility (compatibility between currencies of different nations).

When it comes to forms of exchange, perhaps the only thing that can be more valuable than compatibility is "better compatibility!"

Posted by: David Bau at December 17, 2003 11:38 AM

Posted by: kostya at August 13, 2007 10:13 AM

davidbau.com

December 08, 2003

Theory of Compatibility (Part 2)