Last month I attended a presentation by Roger Sessions at the local (Waltham, MA) Microsoft office’s monthly ISV Architect Council meeting.
Roger was basically summarizing the content of his November newsletter, which talks about next-generation Web services specifications. The newsletter is very interesting and Roger’s perspective is definitely worth having. But among many other (more useful) things, he calls transactions a “fatal flaw” in the WS-* specifications stack.
I don’t think this is true, and said so. Whether or not transactions are a good fit for Web services entirely depends upon what you’re trying to do. The classic two-phase commit model, which Roger says doesn’t work, is fine if that’s what you need. It’s just that the rest of the problem isn’t solved yet.
In fact he was very gracious and we had a good discussion about it both during and after his presentation.
I agree with Roger, however (and have often said so), that Web services are better suited for use in applications that benefit from asynchronous communications than for use in applications that need synchronous communications. Because distributed transactions typically rely on synchronous communications, there is definitely a problem. I talked about it with Doug Kaye about 18 months ago, and also gave a presentation on the topic at SIMC in New York just before that.
Since it represents the convergence of my two major technical interests, I have been trying to help come up with a good solution for Web services transaction processing for almost 5 years. The closest I’ve come so far is what Mark Little and I came up with for the WS-CAF Business Process transaction protocol (although WS-CAF has changed a bit since I wrote the article, the major concepts are the same).
The problem starts with the fact that the HTTP transport doesn’t include persistent sessions, which current distributed transaction processing solutions rely upon to share transaction context and reliably transmit protocol primitives (i.e. “prepare,” “commit,” and “rollback”).
Just after SOAP came out, Don Box and I thought we could easily knock off the transactions spec by mapping the Transaction Internet Protocol (TIP) from IETF (which I’d helped write) to SOAP (which Don had helped write). However we quickly discovered that we had a big problem since TIP, like most other distributed 2PC protocols, requires a persistent communication session for sharing the context and transmitting the transaction protocol primitives.
Because the loss of a communication session automatically signals a rollback (yes, even in OTS/JTS, although there was certainly a big argument about it, given CORBA’s ability to transparently retry), and lost sessions are a common occurrence (you could even say part of the design) in HTTP, the result of mapping TIP to SOAP over HTTP wouldn’t be very useful. So one of the first things you need to do is solve the context sharing problem (as in WS-Context).
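To make the session problem concrete, here is a minimal Python sketch of a two-phase commit coordinator that treats a dropped session as a rollback signal. It is only an illustration under stated assumptions: the `Participant` class, the `SessionLost` exception, and the `drop_session` flag are hypothetical names invented for this example, not part of TIP, OTS/JTS, or any real product.

```python
class SessionLost(Exception):
    """Hypothetical signal that the connection to a participant dropped."""


class Participant:
    """Toy participant; 'drop_session' simulates a lost connection."""

    def __init__(self, name, vote=True, drop_session=False):
        self.name = name
        self.vote = vote
        self.drop_session = drop_session
        self.state = "active"

    def prepare(self):
        if self.drop_session:
            raise SessionLost(self.name)
        self.state = "prepared" if self.vote else "aborted"
        return self.vote

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    """Phase 1: collect votes. Phase 2: commit only if every vote was yes."""
    try:
        all_yes = all(p.prepare() for p in participants)
    except SessionLost:
        # A lost session is treated exactly like a "no" vote.
        all_yes = False

    if all_yes:
        for p in participants:
            p.commit()
        return "committed"

    for p in participants:
        p.rollback()
    return "rolled_back"
```

With every session intact and every vote yes, the function returns "committed". A single dropped connection during the prepare phase rolls back all participants, which is why a transport that routinely drops connections makes this style of protocol hard to use as-is.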
Any potential solution is complicated by the fact that it’s impossible to reinvent or replace existing TP systems, some of which have been in production for 30 years or more. OTS/JTS for example represents a significant compromise between the goals of object oriented technology and transparent distribution on the one hand and the capabilities of existing TP infrastructure such as database management systems, transaction managers, and TP monitors on the other. When developing a new transaction protocol you just cannot assume that you are going to rewrite Oracle, CICS, Tandem NonStop, or any other existing system.
The usefulness of WS-AtomicTransaction (and perhaps even more so of the ACID protocol in WS-CAF) is that it helps map Web services to existing transactional systems. Roger Sessions would say that you shouldn’t do this since Web services (like the Web) are designed for stateless applications with potentially long latencies over asynchronous interaction paradigms. But Web services do not actually prevent the synchronous usage pattern, and for some applications it is a good fit.
If you are using Web services for a transactional RPC across J2EE and .NET and you know that your database locks are not going to be held open too long, WS-AT is probably exactly what you need.
The WS-BusinessActivity spec (check IBM’s DeveloperWorks site for good information about the IBM/Microsoft specifications) works using compensations. It defines an “open nested transaction” model that allows participants to commit independently, and then if they need to roll back, execute a compensatory action. This is what Roger means (if you have read his newsletter) by having to create a lot of code – developers have to create the compensation program.
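The compensation idea can be sketched in a few lines of Python. This is a rough illustration of the general pattern, not the WS-BusinessActivity protocol itself: the `run_with_compensation` function and the step names in the usage example are hypothetical. Each step commits on its own, and the developer has to supply the matching compensation, which is the extra code in question.

```python
def run_with_compensation(steps):
    """Run (action, compensation) pairs in order.

    Each action takes effect immediately. If a later action fails,
    the compensations of the already-completed steps run in reverse order.
    """
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return "compensated"
        completed.append(compensate)
    return "completed"


def failing_step():
    """Hypothetical step that always fails, to trigger compensation."""
    raise RuntimeError("payment declined")


log = []
outcome = run_with_compensation([
    (lambda: log.append("reserve seat"), lambda: log.append("release seat")),
    (lambda: log.append("book hotel"), lambda: log.append("cancel hotel")),
    (failing_step, lambda: log.append("refund payment")),
])
```

Here the first two steps take effect, the third fails, and the hotel and seat compensations run in that order. Note that the sketch assumes the compensations themselves never fail.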
However, compensation is no more of a universal solution than two-phase commit. Some operations just can’t be easily compensated: firing a missile, printing a boarding pass, shipping an order, etc. It is easy enough to cancel a credit card purchase if you’ve just made it, but much harder after it’s been processed for a month or longer. And TP systems can easily get completely lost when trying to compensate for a failure in a compensatory action (recursive compensation really isn’t feasible).
The germ of the solution can be found in the role of the coordinator, however. (And Roger does highlight the overall usefulness of this part of the proposal.) The big idea in all of these specifications — WS-AT, WS-BA, WS-C and the WS-CAF family, and the “Additional Structuring Mechanism for OTS” — is that the transaction coordinator can be factored out and used independently of the protocol. So the WS-AT, WS-BA, ACID, LRA, and BP protocols are all defined as plug-ins to a generic coordinator.
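As a rough Python sketch of what factoring out the coordinator might look like: the coordinator only knows how to register protocols and enroll participants, and the completion logic is supplied by a plug-in. All names here (`GenericCoordinator`, `Activity`, `atomic_protocol`, `compensating_protocol`, and the dictionary keys) are hypothetical and do not come from any of the specifications.

```python
class Activity:
    """One unit of work whose outcome is decided by a plug-in protocol."""

    def __init__(self, complete):
        self._complete = complete
        self.participants = []

    def enroll(self, participant):
        self.participants.append(participant)

    def end(self):
        return self._complete(self.participants)


class GenericCoordinator:
    """Knows nothing about 2PC or compensation; protocols are registered."""

    def __init__(self):
        self._protocols = {}  # protocol name -> completion function

    def register_protocol(self, name, complete):
        self._protocols[name] = complete

    def begin(self, protocol_name):
        return Activity(self._protocols[protocol_name])


def atomic_protocol(participants):
    """All-or-nothing plug-in: commit only if every participant votes yes."""
    if all(p["prepare"]() for p in participants):
        for p in participants:
            p["commit"]()
        return "committed"
    for p in participants:
        p["rollback"]()
    return "rolled_back"


def compensating_protocol(participants):
    """Plug-in where each participant completes on its own."""
    done = []
    for p in participants:
        if p["complete"]():
            done.append(p)
        else:
            for q in reversed(done):
                q["compensate"]()
            return "compensated"
    return "closed"


coordinator = GenericCoordinator()
coordinator.register_protocol("atomic", atomic_protocol)
coordinator.register_protocol("compensating", compensating_protocol)
```

The same `begin`, `enroll`, and `end` calls work for either protocol; only the registered completion function differs.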
The BP protocol in WS-CAF extends this idea one logical step further. The idea is derived from the transaction bridge we designed when I joined IONA about five years ago. The bridge uses an interposed coordinator (originally added to the OTS spec to improve network efficiency) to bridge variations of the 2PC protocol. Microsoft/DTC, Oracle/XA, and Mainframe/RRMS all support slightly different variations of the 2PC protocol primitives. As long as you can deploy an interposed coordinator on a given platform to improve network efficiency, why not also make it responsible for resolving the variations in 2PC? Send the standard 2PC primitives defined in the back of the OTS spec over the wire, and map them to the DTC and RRMS primitives on those platforms.
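The mapping role of the interposed coordinator can be sketched roughly as follows in Python. This is an illustration only: the platform names and the local primitive names in `PLATFORM_MAPS` are invented placeholders, not the real DTC, XA, or RRMS primitives, and `InterposedCoordinator` is a hypothetical class rather than the bridge itself.

```python
# The standard primitives sent over the wire.
STANDARD_PRIMITIVES = ("prepare", "commit", "rollback")

# Placeholder names only: these are NOT the real DTC or RRMS primitives.
PLATFORM_MAPS = {
    "platform_a": {
        "prepare": "A_PREPARE",
        "commit": "A_COMMIT",
        "rollback": "A_ABORT",
    },
    "platform_b": {
        "prepare": "B_VOTE_REQUEST",
        "commit": "B_COMMIT",
        "rollback": "B_BACKOUT",
    },
}


class InterposedCoordinator:
    """Sits on one platform and translates wire primitives to local ones."""

    def __init__(self, platform, deliver):
        self.primitive_map = PLATFORM_MAPS[platform]
        self.deliver = deliver  # callable that hands off to the local TP system

    def on_wire_primitive(self, primitive):
        if primitive not in STANDARD_PRIMITIVES:
            raise ValueError(f"unknown primitive: {primitive}")
        return self.deliver(self.primitive_map[primitive])


sent = []
bridge = InterposedCoordinator("platform_b", sent.append)
bridge.on_wire_primitive("prepare")
bridge.on_wire_primitive("rollback")
```

The root coordinator only ever speaks the standard primitives; each interposed coordinator hides its platform's variation behind the same three calls.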
And if an interposed coordinator can do that, why can’t it also bridge multiple types of protocols? Synchronous to asynchronous, for example? Each coordinator takes responsibility for a given transactional “domain,” communicating asynchronously with the other coordinators in the system to drive composite applications to a common outcome. As far as I’ve been able to determine, this should work. There’s a question about performance, but Mark and I (and others) have worked through the model sufficiently to at least validate its theoretical basis.