Gregor Hohpe’s JavaZone presentation on InfoQ

I have a lot of blogging topics to catch up on, but I just watched Gregor Hohpe’s presentation on InfoQ and wanted to write about it while it’s fresh in my mind.
I really thought it hit home on a lot of very important points about SOA, especially things that developers (as well as the industry) need to think about.
I also would like to take him to task on a few items, including the fact that Starbucks does not need two-phase commit (I think this propagates a misunderstanding about transactions rather than illustrating how best to use them), but more about that later.
I really like Gregor’s book, and was in the same room with him once (the 2005 Microsoft MVP summit, since he and I are both Architecture MVPs) but failed to introduce myself, something I still regret. He has a good, clear way of understanding integration issues and recommending how to deal with them. He does a great job with the presentation: it is clear, not too serious, and has a great perspective that acknowledges why things are the way they are even when saying that they are not exactly right.
I would definitely encourage everyone interested in development issues around SOA to sit through this presentation. It’s only about an hour – a little less – but Gregor covers a lot of very good territory, and raises a lot of good issues. I do have to say, though, that Gregor’s focus is on integration, while the main reason for SOA is reuse (not that these are necessarily in conflict; it’s probably more a matter of emphasis).
Ok, back to the Starbucks thing. Of course Starbucks doesn’t use two-phase commit – that would be like hammering a nail with a screwdriver – wrong tool for the job in other words. I completely understand the good advice not to use 2PC for this kind of scenario – what mystifies me is that someone would think it’s a good idea in the first place.
Transactions exist to coordinate multiple operations on data in order to record a real world transaction. No data operations, no transactions. Many argue that a retrieve operation is not a transaction, since the data isn’t changed.
And recording a real world transaction almost always involves multiple operations on data that need to be treated as a unit. Take the Starbucks example – the store has to record the payment and the fact that the drink was delivered. If either the payment isn’t given or the drink isn’t delivered the transaction didn’t happen and the record of it needs to be erased.
Computers being touchy electrical devices (as opposed to, say, pen and paper), they can easily (and predictably, according to Murphy and everyone else) fail between data operations. Transactions are what ensure that the system of record (i.e. the database) matches the real world: if part of the real world transaction doesn’t complete, the database transaction rolls back whatever changes were already made.
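To make that concrete, here is a minimal sketch of a local (single database) transaction, using Python’s built-in sqlite3 module. The table names `payments` and `deliveries`, the function `record_sale`, and the `fail_before_delivery` flag are all made up for illustration; this is not a claim about how any real point-of-sale system works.

```python
import sqlite3

# Hypothetical schema: one row per payment, one row per delivered drink.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (order_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE deliveries (order_id INTEGER)")
conn.commit()


def record_sale(conn, order_id, amount, fail_before_delivery=False):
    """Record payment and delivery as one unit of work.

    Returns True if both rows were committed, False if the work was rolled back.
    """
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if the block raises an exception.
        with conn:
            conn.execute(
                "INSERT INTO payments (order_id, amount) VALUES (?, ?)",
                (order_id, amount),
            )
            if fail_before_delivery:
                # Simulate a failure between the two data operations.
                raise RuntimeError("simulated crash between the two writes")
            conn.execute(
                "INSERT INTO deliveries (order_id) VALUES (?)", (order_id,)
            )
        return True
    except RuntimeError:
        return False


record_sale(conn, 1, 4.50)                              # both rows recorded
record_sale(conn, 2, 3.25, fail_before_delivery=True)   # neither row recorded
```

After the second call, the payment row for order 2 is gone again, so the database never shows a payment without a matching delivery.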
Starbucks no doubt uses some kind of database to record real world transactions. Therefore they no doubt also use software transactions, since it’s very hard to buy a database these days that does not support transactions, and even if you could you wouldn’t want to (personally I remember coding my own locking and doing manual recovery on an old HP Image database circa 1980, pre-transactions, but that’s another story).
Two phase commit (aka distributed transactions) is intended to be used when data operations involving multiple databases (aka transactional resources) need to be coordinated. I know that Gregor knows this, and is trying to illustrate the point that you should not automatically assume that you should use 2PC even if you can, but I for one think he could come up with a better analogy. In particular it would be helpful to illustrate when 2PC is worth the cost, and not just say “never use it.” In talking about most of the other topics he does usually get into a discussion about tradeoffs and when to use what, but in this case the advice is too simplistic for my taste.
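For readers who have not seen the protocol spelled out, here is a toy sketch of the coordinator logic in Python. The `Participant` class and the `two_phase_commit` function are hypothetical in-memory stand-ins, not any real transaction manager’s API, and the sketch leaves out the parts that make real 2PC expensive: durable logging, timeouts, and recovery after a coordinator crash.

```python
class Participant:
    """Hypothetical in-memory stand-in for a transactional resource."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote. A real resource would durably log its pending
        # work here so that it can still commit after a crash.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if every participant committed, False if all rolled back."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]

    # Phase 2: commit only if the vote was unanimous, otherwise roll back.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


orders = Participant("orders_db")
payments = Participant("payments_db", can_commit=False)
outcome = two_phase_commit([orders, payments])  # False: one "no" vote aborts both
```

The point of the extra round trip is that no participant commits until all of them have promised they can, which is exactly the cost you pay for keeping several systems of record consistent.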
Update 5/21 – this seems too long so I’m splitting it here…


Another bone to pick is over WS-CDL. Although it is certainly possible to compare and contrast this with WS-BPEL, and it is true (as he implies) that WS-CDL proponents have a different point of view than WS-BPEL proponents, almost no one implements WS-CDL. It seems more like one of those “good in theory” things.
One final complaint – during the tools discussion Gregor clearly describes the need for SOA-oriented developer tooling, and says “Where is the Eclipse or IntelliJ of SOA?” What about the Eclipse STP project? Why doesn’t he mention that? OK, it’s not done yet, but help is definitely on the way!
The STP project also includes SCA metadata for composing services, something he alludes to in saying that service composition requires another language.
Ok now for some highlights: I liked very much what he said about the need for a “refactoring BPEL” editor to make it easier to change flows around.
I also very much liked the points about transparent distribution, or rather its impossibility. I had just last week sent a requirement to the OSGi enterprise expert group email list (or maybe it is a non-requirement) saying that because transparent distribution is not possible in practice, we should clearly state a requirement that we are not going to attempt it. I guess the way I put it was more like: we should be explicit about the distribution points we define, and ensure that developers are consciously creating distributed services somehow.
(We are still at the requirements definition phase of this activity, but even so there was a lively discussion, as you might imagine, about what this means – but I do not think there was much pushback on the idea that we should not be attempting to meet requirements for transparent distribution.) In particular I liked Gregor’s statement about a “95% solution” being worse than none since it creates an illusion of success that creates even more difficulty when it falls apart.
I am not really sure that service orientation exists simply because it’s a technological advance over distributed objects. I’ve tried to capture my thoughts on this in some blog entries (most recent here), around the fact that we have (more significantly, I think) reached a turning point in the software industry in which we are looking back over what was done during the past 30-40 years and trying to understand what worked and what didn’t, and how to improve what we’ve done, rather than invent a bunch of new fundamental things like general purpose languages, operating systems, distributed object systems, etc. I would also say the trend toward domain specific languages (which I agree with Gregor is very interesting) is another example of this phenomenon.
I also liked what he said about top-down vs bottom up development, and how it’s really not in the spirit of SOA to create an enterprise-wide design entirely from the top down. (BTW check out the great EAI diagram he shows when talking about how coding in pictures doesn’t work.)
I have no doubt that successful SOA projects represent significant challenges and that discovering the state of the system (via things like our active registry/repository) is going to be a big help.
I would just add that one big reason the tools aren’t yet where they need to be is that the tools vendors tend to focus on making life easy for the developers (and after all this is easy to understand since the developers are their customers) but the point of SOA is not to make life easy for the developer of a service – it’s to make life easy for the consumer of the service – which might (and probably should) be someone else.

3 responses to “Gregor Hohpe’s JavaZone presentation on InfoQ”

  1. Eric:
    Hi, just chiming in. The way I always reason about transactions is that they are all about achieving “state alignment” between all participants in the transaction. That is to say, at the end of a particular operation request involving multiple participants, each participant is in a known, predictable state, whether zero, one or more errors occur during the processing of the request.
    There are probably infinitely many ways to achieve state alignment between participants, and building your own way to do that is not entirely stupid, though it might not be cost effective. One must understand that achieving state alignment is not easy at all.
    So no wonder that state alignment is critically important to systems that record state (databases, services, …).
    I don’t mean to teach you anything – I have too much respect for your wisdom and knowledge – but just for the reader, you might want to choose a state alignment scheme that is well suited to the problem you are trying to solve.
    It all depends on the rate at which these resources need to change state and the consequences of the change of state requested. I always find it hilarious when my Fidelity account tells me that I did two operations that may bring my account to a negative balance and that I shall be penalized if I ever do that. What? As Fidelity you can’t have your state aligned with mine? That’s scary. Clearly somewhere along the way, state alignment can’t be achieved and I only have the representation of the most probable state in which all the participants of the transaction (buyer, brokerage firms, exchange, seller(s)) are.
    If the request is to send a missile, you probably want to use a state alignment protocol that is not based on do/compensate. Prepare/confirm might be better,…
    I just wanted to bring back the notion of state alignment, which I feel should be used instead of talking about “transaction” because most people think of a “financial transaction”. Hence Gregor’s note on the Starbucks state alignment protocol, which need not be the same as that of Fidelity or the US Army.

  2. Hi Jean-Jacques,
    Thanks very much for the comment. I think you are right that the word “transaction” isn’t well understood, or anyway it’s used in many different ways.
    State alignment is a good description of the problem. Transactions were definitely developed to help keep the computer record of real world interactions consistent with the real world.
    Businesses like to have a “system of record” on the computer that keeps careful track of the business. A retailer likes to be able to look at the computer to see what’s on the shelf, which only works when that record is accurate. If there are a lot of inconsistencies the store may have to be closed for “inventory”, i.e. manual count and reconciliation.
    Obviously a manual reconciliation between what’s on the computer and what’s in the store (or whatever) is expensive and a business will prefer to avoid this if possible.
    The saying goes that it’s cheaper to prevent a mistake like this than to fix it after the fact. Telephone companies spend a lot of effort ensuring their bills are accurate, as do banks with their monthly statements, since it is very expensive to fix an error in either one.
    So it kind of boils down to whether or not you want to try to have an accurate and consistent picture of the world on the computer, and the extent to which the state of the real world needs to be accurately reflected on the computer, and how difficult it is to achieve this in the first place and maintain it over time.
    A business might have more than one system of record but that introduces other complications. A lot of the debate about transactions really has to do with how much the system of record has to be unique and consistent.
    Software transactions help with that. Sure, you can write your own – I used to have to do that all the time since I started coding before transactions were widely available in file systems and databases. But it is very hard, nearly impossible in fact, to code something bulletproof – to the extent that you can always avoid manual intervention when recovering from failures (which by the way occur far more often due to software bugs and operator error than to hardware or network errors).
    It’s possible that modern systems such as EJB and COM+ have abstracted software transactions too much, made it too easy to use them, and that they have been used when they shouldn’t have. And of course in that case the cost may very rightly seem unreasonable.
    However if used correctly and where necessary, software transactions are still an important part of the toolbox, and something that you really should not be thinking of coding yourself.

  3. Hmmm. WS-CDL is being used in lots of places. It is a shame that there are not more implementations despite vendors being asked by some of their clients for them. I cannot figure out why this is so unless it is grounded in the myth that WS-CDL and WS-BPEL compete. We could argue that Java and WS-BPEL compete and that UML and WS-CDL compete but we don’t. Rather we look for ways in which they solve problems that customers need to be solved.
    It is certainly the case that WS-CDL does solve problems that neither WS-BPEL nor UML solves; this is why ISO, ISDA and TWIST use WS-CDL to describe their processes and why some customers use WS-CDL to describe their SOA architectures in the large. In all cases none give up the right to use WS-BPEL or UML. Rather they want to step out to these technologies (standards) as they see fit to deliver their systems.
    A lot of work has been put into WS-CDL and those scant tools that are available to address this need to step out.
    In all new technology we have those that are slow to move and accept the benefits and those that are fast to move because they have no time to waste. The web itself is a great example of this and Microsoft was late to the party but when it saw the benefits it changed overnight. Maybe the same will happen for WS-CDL because the wave is coming.
    Those that move faster are more likely to reap the rewards at lower cost than those that lag.
