November 14, 2003
The Design of XMLBeans (Part 1)
I am a contributor to a new Apache-project-in-incubation called XMLBeans, a powerful new Java/XML binding tool.
Although folks are starting to use it, and there is some reasonable documentation, nobody has explained the architecture of the tool or the "why" behind the "how."
The Simplified XML Binding Style
The XML Schema/Java binding technology used by Apache XMLBeans is a carefully designed "simplified binding style" that has several desirable properties.
This note describes the underlying principles and the specifics of the simplified XML schema binding style.
The simplified binding style is built on two architectural principles: type correspondence and node correspondence.
These two principles provide a bedrock of invariants that guarantee that some basic programming mechanisms work. For example:
The two basic principles also have the advantage that they provide an easy and intuitive model for programmers to apply and understand. Yet preserving both principles while providing a useful binding model presents a couple of challenges.
Understanding Type Correspondence
The principle of type correspondence provides a Java class for every schema type. In particular, all the built-in Schema types must have corresponding Java classes.
At first blush, one might assume, for example, that the Java class formally corresponding to the schema type xs:string should be java.lang.String. However, since java.lang.String is a final class, that choice would not allow xs:token (or any other schema type which inherits from xs:string) to have a Java class that has the proper inheritance relationship, since no Java class can extend java.lang.String.
On the other hand, any Java programmer would be right to demand the convenience of a java.lang.String for each xs:string, as well as a Java "int" for an xs:int and so on, even though the "instanceof" operator has no hope of working correctly with those types. Faithful type correspondence, while very important for complex types, seems to be different from what you want in practice for simple types. And yet, since schema allows complex types to inherit from simple types (these are called complex types with simple content), if we do not establish type correspondence for simple types, we will not be able to establish full type correspondence for complex types.
The solution provided by the simplified style is to provide not one, but two Java classes for each simple type. There is a "formal" Java class which establishes full type correspondence, and there is a "convenience" Java type that does not need to play in the type correspondence world. The "convenience" Java type does not need to uniquely map to or from a schema type or have any particular inheritance relationship with other Java types, and it will be provided where convenience is important. But the "formal" type will always be available and will represent the "true" data model.
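To make the split concrete, here is a minimal sketch of a "formal" class alongside its "convenience" type. The interfaces below are hypothetical stand-ins that I wrote for illustration, not the real XMLBeans API, and the intermediate type xs:normalizedString is left out to keep things short.

```java
import java.lang.reflect.Modifier;

class FormalVsConvenience {
    // Hypothetical stand-ins for the "formal" classes (not the real XMLBeans API).
    interface XmlObject { }

    // Formal class for xs:string; getStringValue() hands back the "convenience" type.
    interface XmlString extends XmlObject {
        String getStringValue();
    }

    // Formal class for xs:token. It can extend XmlString even though
    // nothing can extend java.lang.String.
    interface XmlToken extends XmlString { }

    // Illustrative implementation: xs:token collapses whitespace.
    static final class SimpleToken implements XmlToken {
        private final String value;

        SimpleToken(String raw) {
            this.value = raw.trim().replaceAll("\\s+", " ");
        }

        @Override
        public String getStringValue() {
            return value;
        }
    }

    public static void main(String[] args) {
        // java.lang.String is final, so it cannot serve as the formal class.
        System.out.println(Modifier.isFinal(String.class.getModifiers())); // true

        XmlString s = new SimpleToken("  hello   world ");
        System.out.println(s instanceof XmlToken);   // true: instanceof works on the formal type
        System.out.println(s.getStringValue());      // hello world
    }
}
```

The formal type carries the schema inheritance, while the plain String is what most calling code would actually want to handle.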
A table of all the built-in schema types together with their "formal" and "convenience" Java types is listed below.
** sometimes - for the non-simple types - the "convenience" type is just the same as the "formal" type.
The formal classes have the same inheritance relationships that the corresponding schema types do. For example, the XmlInt Java class has the following base types:
The fact that the inheritance in Java follows the inheritance in schema has some utility. For example, if XmlDecimal has a method called "getBigDecimalValue()", then you can also call "getBigDecimalValue()" on any XmlInteger, XmlLong, or XmlInt. Even if somebody has substituted a restricted subclass in the XML instance such as an xs:int for an xs:decimal, the programmer can be assured that it is always possible to extract a BigDecimal value in the same way.
Another consequence of the type correspondence is that every Java class that corresponds to a schema type inherits from the class that represents xs:anyType. Here we have called this universal base class "XmlObject".
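The inheritance argument can be sketched in a few lines. The class names follow the ones used in this note, but the interfaces and the IntValue implementation are my own simplified assumptions (the real formal classes have more methods and more intermediate base types, such as the one for xs:anySimpleType).

```java
import java.math.BigDecimal;

class FormalHierarchySketch {
    // Hypothetical miniature of the formal classes, mirroring the schema chain
    // xs:anyType <- xs:decimal <- xs:integer <- xs:long <- xs:int.
    interface XmlObject { }

    interface XmlDecimal extends XmlObject {
        BigDecimal getBigDecimalValue();
    }

    interface XmlInteger extends XmlDecimal { }

    interface XmlLong extends XmlInteger { }

    interface XmlInt extends XmlLong {
        int getIntValue();
    }

    // Illustrative implementation of the formal class for xs:int.
    static final class IntValue implements XmlInt {
        private final int value;

        IntValue(int value) {
            this.value = value;
        }

        @Override
        public int getIntValue() {
            return value;
        }

        @Override
        public BigDecimal getBigDecimalValue() {
            return BigDecimal.valueOf(value);
        }
    }

    // Written against xs:decimal; keeps working when an xs:int is substituted.
    static BigDecimal doubled(XmlDecimal d) {
        return d.getBigDecimalValue().multiply(BigDecimal.valueOf(2));
    }

    public static void main(String[] args) {
        XmlDecimal price = new IntValue(21);
        System.out.println(doubled(price));               // 42
        System.out.println(price instanceof XmlObject);   // true
    }
}
```

Code that only knows about XmlDecimal never has to ask which restricted type actually showed up in the instance document.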
Of course, the principle of type correspondence extends beyond the built-in types above to all user-defined types. Note that some user-defined types in XML Schema are anonymous. In the simplified binding model, nested anonymous schema types also have a corresponding nested Java class.
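As a rough illustration of the nesting, suppose a schema declares an "order" type whose "item" child element has an anonymous complex type. The Order and Item names, and the fields on them, are hypothetical; the only point taken from the binding style is that the anonymous nested type becomes a nested Java class.

```java
import java.util.ArrayList;
import java.util.List;

class NestedTypeSketch {
    // Hypothetical universal base class, standing in for xs:anyType.
    interface XmlObject { }

    // Corresponds to the (assumed) user-defined schema type "order".
    static class Order implements XmlObject {

        // Corresponds to the anonymous type of the nested "item" element.
        static class Item implements XmlObject {
            private final String sku;

            Item(String sku) {
                this.sku = sku;
            }

            String getSku() {
                return sku;
            }
        }

        private final List<Item> items = new ArrayList<>();

        Item addNewItem(String sku) {
            Item item = new Item(sku);
            items.add(item);
            return item;
        }

        int sizeOfItems() {
            return items.size();
        }
    }

    public static void main(String[] args) {
        Order order = new Order();
        Order.Item item = order.addNewItem("A-100");
        System.out.println(item.getSku());                               // A-100
        System.out.println(Order.Item.class.getEnclosingClass() == Order.class); // true
    }
}
```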
I've talked a bit about what "type correspondence" means when doing XML/Java binding. In the next article in this series I will discuss some of the details of "node correspondence" - what it is, and what it is not.
Posted by David at November 14, 2003 04:37 PM
Copyright 2003 © David Bau. All Rights Reserved.