(Summary from the forum, written up as an article. Latest version here.)
Background
When deploying large networks for web services, particularly when many of the installations are out of our control, we need to be able to cope with services that are older than others. We also need to deal with older clients using newer services; this includes older services interacting with newer ones.
Existing work
This is not a new problem.
Common Solution - 'Loosen up'
The general solution seems to be to 'loosely couple' the service interfaces and carry out the versioning transformations behind them. At the most extreme, this consists of a single method doIt(XmlDocument). This rather defeats the point of having a defined, contractual web interface, and we cannot be sure that the service will cope with our input correctly. Indeed, we don't even know what input it requires until we try it out or a human reads some documentation.
Resilient Classes
An MSN-TV video by a chap called Doug Purdy was cited as giving a good view on versioning:
Basically, a set of WSDL-generated classes for a particular object (e.g. Personv1, Personv2) requires an associated 'resilient' superclass (Person) that holds the version and any unknown extra information for classes that it knows nothing about. Decisions about what to do with a WSDL-generated class when it arrives can be based on the version information retrieved from the superclass.
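As a rough illustration of the idea (not Doug Purdy's actual code), the arrangement might be sketched as below. The field names, the map of unknown fields and the PersonHandler class are assumptions made for the example; Personv1 and Personv2 stand in for WSDL-generated classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical 'resilient' superclass: holds the version and any fields
// that this build of the code does not recognise.
class Person {
    private final int version;
    private final Map<String, String> unknownFields = new LinkedHashMap<>();

    Person(int version) {
        this.version = version;
    }

    int getVersion() {
        return version;
    }

    void addUnknownField(String name, String value) {
        unknownFields.put(name, value);
    }

    Map<String, String> getUnknownFields() {
        return unknownFields;
    }
}

// Stand-ins for WSDL-generated classes (assumed shapes).
class Personv1 extends Person {
    String name;

    Personv1(String name) {
        super(1);
        this.name = name;
    }
}

class Personv2 extends Person {
    String name;
    String email;

    Personv2(String name, String email) {
        super(2);
        this.name = name;
        this.email = email;
    }
}

class PersonHandler {
    // Decide what to do based on the version held by the superclass.
    static String describe(Person p) {
        if (p.getVersion() == 1) {
            return "v1 person: " + ((Personv1) p).name;
        }
        if (p.getVersion() == 2) {
            Personv2 v2 = (Personv2) p;
            return "v2 person: " + v2.name + " <" + v2.email + ">";
        }
        // A version we know nothing about: only the untyped extras are left.
        return "unknown version " + p.getVersion()
                + ", extra fields " + p.getUnknownFields().keySet();
    }
}
```

Note how the last branch of describe() can only report untyped name/value pairs.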
However, this makes the resilient class (Person) quite heavyweight, as it needs to be able to interpret any and all WSDL-generated classes. It has to make guesses about WSDL-generated classes that it knows nothing about. This might get particularly difficult in the case of changes rather than additions; in the example given, what happens if Name is split into Forename and Surname? We seem to be getting back to lots of untyped XML elements again.
(An interesting philosophical point was that he considered namespaces inappropriate, on the grounds that a new namespace defines a new type. That seems correct - but then I do consider different versions to be different types, and trying to use one class to represent a variety of versions can make it very cumbersome.)
Some notes on Good Practice, SOAPy Beans.
We have been generating our Java beans from the WSDL (hence 'SOAPy Beans'*) rather than the other way around. This means we can create WSDL that is suitable for the outside world to use in general, rather than Java-oriented WSDL. It also means that our code breaks if the interface changes, rather than the interface changing whenever our code does - and broken code is obvious at generate/compile time rather than much later at runtime when someone actually uses that interface.
WSDL -> Java is also the only direction available when implementing third-party interfaces; our datacenters will need to implement the Sky-Node interface, and this is likely to be defined using WSDL.
However, there is the build/coding overhead of producing the beans and the runtime overhead of marshalling / unmarshalling them, and we have to handle conversions between several versions of beans. Dealing with messages that change in minor ways can be easier using DOM - but also risky, as getElementsByTagName() won't break at compile time if, for example, the tag disappears.
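The DOM risk can be seen in a small sketch using the standard JAXP parser. The Person/Name/Forename/Surname message shapes and the DomLookupDemo class are assumptions made for illustration. The same lookup compiles and runs against both messages; when the Name tag has gone, it simply finds nothing.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class DomLookupDemo {
    // Returns the text of the first <Name> element, or null if there is none.
    static String readName(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // The tag name is just a string: the compiler cannot check it
            // against the current version of the interface.
            NodeList names = doc.getElementsByTagName("Name");
            return names.getLength() == 0 ? null : names.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse message", e);
        }
    }

    public static void main(String[] args) {
        // Old message shape: the lookup finds the element.
        System.out.println(readName("<Person><Name>Ann Lee</Name></Person>"));
        // New message shape: the tag has disappeared, and we only find out at runtime.
        System.out.println(readName(
                "<Person><Forename>Ann</Forename><Surname>Lee</Surname></Person>"));
    }
}
```

With a generated bean, the equivalent change would remove getName() and the calling code would fail to compile.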
Not all WSDL-defined message components have to be blindly represented as SOAPy beans; for example the query to a datacenter is never likely to be manipulated using Java, and so need not be marshalled.
* These are generally only data-holding beans. Thanks to NoelWinstanley for coining the term, and to both him and PaulHarrison for persuading us to use them!
SOAPy Beans and Real Beans
Using SOAPy Beans as real-world beans is often not practical. Real-world classes have enumerations, calculated fields, and methods that carry out tasks. They may have properties that should not be publicly settable; indeed the whole object may be immutable.
Dependencies and versioning problems
Any code that uses a SOAPy bean becomes entirely dependent on that particular web interface - the mechanism we happen to be using to publish the application at that particular time. Of course, this is not important for short-term web services, but will be for large systems with other interfaces (such as grid, or other web services) that have their own messages and therefore their own generated beans.
We can wrap our SOAPy Beans with Real World classes, and/or we can build constructors for our Real World classes that accept a SOAPy Bean to construct from. However, both of these still leave us with code that is dependent on our publishing mechanism, and we have to create a new constructor for each SOAPy Bean as the interface changes.
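A minimal sketch of the constructor approach shows where the dependency creeps in. PersonBeanV1 stands in for a generated bean and RealPerson for a Real World class; both names and their fields are hypothetical.

```java
// Stand-in for a WSDL-generated, data-holding SOAPy Bean (hypothetical).
class PersonBeanV1 {
    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}

// Real World class: immutable, with a calculated property.
final class RealPerson {
    private final String forename;
    private final String surname;

    RealPerson(String forename, String surname) {
        this.forename = forename;
        this.surname = surname;
    }

    // This constructor ties the business class to the v1 web interface;
    // every new interface version would need another one like it.
    RealPerson(PersonBeanV1 bean) {
        String[] parts = bean.getName().trim().split("\\s+", 2);
        this.forename = parts[0];
        this.surname = parts.length > 1 ? parts[1] : "";
    }

    String getForename() {
        return forename;
    }

    String getSurname() {
        return surname;
    }

    // Calculated field: not something a generated bean would carry.
    String getDisplayName() {
        return surname + ", " + forename;
    }
}
```

RealPerson cannot be compiled without PersonBeanV1 on the classpath, which is exactly the coupling described above.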
Proposal
Why Change?
The general assumption is that a web service interface changes; instead I suggest you preserve and deprecate the existing interface and add a new one. This puts the work onto the service components to manage versioning, not the client. An old client can continue to use new services through the service's preserved old interface. New clients can include the old client code to use old services.
Layer your interfaces
Axis generates an interface for you to implement around the binding. If you're doing web services more directly, then you'll presumably have a class to handle the SOAP messaging, separate from the business logic. This is the place to convert between your SOAPy Beans and Real World classes.
For example, the datacenter might have an AxisDataServer-v4 which converts between real-world objects and the generated beans for that particular web interface. Later, we might have a new interface which we can call AxisDataServer-v5, for which we will add conversions from all the new v5 generated beans. We may need to adapt the v4 layer if the business logic has changed (indeed it may be driving the change).
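The layering might look something like the sketch below. Everything in it is an assumption made for illustration: the Contact and ContactService classes, the bean shapes, and the class names AxisDataServerV4 / AxisDataServerV5 (hyphens are not legal in Java identifiers). In a real build each binding layer and its generated beans would live in their own package.

```java
// Real World class used by the business logic; it knows nothing about SOAP.
final class Contact {
    final String forename;
    final String surname;

    Contact(String forename, String surname) {
        this.forename = forename;
        this.surname = surname;
    }
}

// Business logic: no generated beans appear here.
class ContactService {
    Contact lookup(String id) {
        return new Contact("Ann", "Lee"); // canned data for the sketch
    }
}

// Stand-ins for the beans generated from the v4 and v5 WSDL.
class ContactBeanV4 {
    String name;
}

class ContactBeanV5 {
    String forename;
    String surname;
}

// Binding layer for the preserved v4 interface.
class AxisDataServerV4 {
    private final ContactService service;

    AxisDataServerV4(ContactService service) {
        this.service = service;
    }

    ContactBeanV4 getContact(String id) {
        Contact c = service.lookup(id);
        ContactBeanV4 bean = new ContactBeanV4();
        bean.name = c.forename + " " + c.surname; // v4 had a single Name
        return bean;
    }
}

// Binding layer for the new v5 interface, added alongside v4.
class AxisDataServerV5 {
    private final ContactService service;

    AxisDataServerV5(ContactService service) {
        this.service = service;
    }

    ContactBeanV5 getContact(String id) {
        Contact c = service.lookup(id);
        ContactBeanV5 bean = new ContactBeanV5();
        bean.forename = c.forename;
        bean.surname = c.surname;
        return bean;
    }
}
```

Both layers share one ContactService, so an old client calling the v4 access point and a new client calling v5 see the same data.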
If you are providing client libraries, the same applies; as you add new interfaces, leave in the code for the old one and add a converter from a consistent API to the binding layers.
The generated beans are now contained along with the binding and the code that interprets that particular web interface.
This means:
- It is straightforward to add a new interface - a new package is required for the new bindings, with the generated beans and the interpreting code.
- Minor version changes require a whole new interface to maintain (if the old one is to be preserved). Packages can share common code, but beware entangled interfaces.
- It is straightforward to identify which versions are supported by a web service, and the access point for each one. Software clients can check automatically to see what is available and adjust appropriately.
- It is easy to throw away old versions. When you are sure you have no clients using a particular version (or you are no longer willing to support them!) you can just remove the packages for that version. There is no impact on the business code. Long-term maintenance is easier; there is no need to maintain old generated classes deep in the business logic, or spend high-risk effort removing them.
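As one possible sketch of the automatic check mentioned in the list above, a client could compare the versions it understands with the versions a service offers and pick the highest one in common. How a service advertises its versions is not specified here; the VersionNegotiator class and the use of plain integer version numbers are assumptions.

```java
import java.util.Collection;
import java.util.OptionalInt;

class VersionNegotiator {
    // Returns the highest version supported by both sides, or empty if
    // there is no version in common.
    static OptionalInt choose(Collection<Integer> clientSupports,
                              Collection<Integer> serviceOffers) {
        return clientSupports.stream()
                .filter(serviceOffers::contains)
                .mapToInt(Integer::intValue)
                .max();
    }
}
```

A client that gets an empty result knows straight away that it cannot talk to that service, rather than discovering it through a failed call.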
Comments
Please post comments on the wiki at http://forum.astrogrid.org/read.php?TID=712
-- MartinHill - 08 Mar 2004