Daily Archives: May 19, 2007

Gregor Hohpe’s JavaZone presentation on InfoQ

I have a lot of blogging topics to catch up on, but I just watched Gregor Hohpe’s presentation on InfoQ and wanted to write about it while it’s fresh in my mind.
I really thought it hit home on a lot of very important points about SOA, especially things that developers (as well as the industry) need to think about.
I would also like to take him to task on a few items, including his example that Starbucks does not need two-phase commit (I think it propagates a misunderstanding about transactions rather than illustrating how best to use them), but more about that later.
I really like Gregor’s book, and was in the same room with him once (the 2005 Microsoft MVP summit, since he and I are both Architecture MVPs) but failed to introduce myself, something I still regret. He has a good, clear way of understanding integration issues and recommending how to deal with them. He does a great job with the presentation: it is clear, not too serious, and has a great perspective that acknowledges why things are the way they are even when saying that they are not exactly right.
I would definitely encourage everyone interested in development issues around SOA to sit through this presentation. It’s only about an hour – a little less – but Gregor covers a lot of very good territory, and raises a lot of good issues. I do have to say, though, that Gregor’s focus is on integration, while the main reason for SOA is reuse (not that these are necessarily in conflict; it’s probably more a matter of emphasis).
Ok, back to the Starbucks thing. Of course Starbucks doesn’t use two-phase commit – that would be like hammering a nail with a screwdriver – wrong tool for the job in other words. I completely understand the good advice not to use 2PC for this kind of scenario – what mystifies me is that someone would think it’s a good idea in the first place.
Transactions exist to coordinate multiple operations on data in order to record a real world transaction. No data operations, no transactions. Many argue that a retrieve operation is not a transaction, since the data isn’t changed.
And recording a real world transaction almost always involves multiple operations on data that need to be treated as a unit. Take the Starbucks example – the store has to record the payment and the fact that the drink was delivered. If either the payment isn’t given or the drink isn’t delivered the transaction didn’t happen and the record of it needs to be erased.
Computers, being touchy electrical devices (as opposed to, say, pen and paper), can easily (and predictably, according to Murphy and everyone else) fail between data operations. Transactions are what ensure that the system of record (i.e. the database) matches the real world: if part of the real world transaction doesn’t complete, the database transaction rolls back the changes already made.
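To make that concrete, here is a minimal sketch of an ordinary single-database transaction, written in Python with the standard sqlite3 module. The table names (payments, deliveries) and the helpers make_store and record_sale are made up for illustration; I have no idea what Starbucks actually runs. The point is only that the payment record and the delivery record either both land or neither does.

```python
import sqlite3


def make_store():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE payments (order_id INTEGER, amount REAL)")
    conn.execute("CREATE TABLE deliveries (order_id INTEGER)")
    conn.commit()
    return conn


def record_sale(conn, order_id, amount, drink_delivered):
    """Record payment and delivery as one unit of work.

    Returns True if both rows were committed, False if the
    transaction was rolled back.
    """
    try:
        # Used as a context manager, the connection commits if the
        # block succeeds and rolls back if it raises an exception.
        with conn:
            conn.execute(
                "INSERT INTO payments (order_id, amount) VALUES (?, ?)",
                (order_id, amount),
            )
            if not drink_delivered:
                # Stand-in for any failure between the two operations.
                raise RuntimeError("drink was not delivered")
            conn.execute(
                "INSERT INTO deliveries (order_id) VALUES (?)",
                (order_id,),
            )
        return True
    except RuntimeError:
        return False
```

Calling record_sale with drink_delivered set to False leaves no payment row behind, which is exactly the "record of it needs to be erased" behavior described above. Note that there is only one database here, so no two-phase commit is involved.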
Starbucks no doubt uses some kind of database to record real world transactions. Therefore they no doubt also use software transactions, since it’s very hard to buy a database these days that does not support transactions, and even if you could you wouldn’t want to (personally I remember coding my own locking and doing manual recovery on an old HP Image database circa 1980, pre-transactions, but that’s another story).
Two-phase commit (aka distributed transactions) is intended to be used when data operations involving multiple databases (aka transactional resources) need to be coordinated. I know that Gregor knows this, and is trying to illustrate the point that you should not automatically assume that you should use 2PC even if you can, but I for one think he could come up with a better analogy. In particular it would be helpful to illustrate when 2PC is worth the cost, and not just say “never use it.” In talking about most of the other topics he does usually get into a discussion about tradeoffs and when to use what, but in this case the advice is too simplistic for my taste.
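For readers who have not seen the protocol spelled out, here is a toy sketch of the two phases in Python. The Participant class and the two_phase_commit function are hypothetical names of my own, not any real transaction manager’s API, and the sketch assumes nothing ever crashes mid-protocol. A real implementation also needs durable logging, timeouts, and recovery, which is where most of the cost of 2PC comes from.

```python
class Participant:
    """A stand-in for one transactional resource, such as a database."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote on whether this resource is able to commit.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        # Phase 2, success path: make the prepared work permanent.
        self.state = "committed"

    def rollback(self):
        # Phase 2, failure path: undo any prepared work.
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit on every participant, or on none of them."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]

    # Phase 2: commit only if the vote was unanimous.
    if all(votes):
        for p in participants:
            p.commit()
        return True

    for p in participants:
        p.rollback()
    return False
```

With two participants that both vote yes, two_phase_commit returns True and both end up committed. If either one votes no, both end up aborted. The extra round of messages and the time each resource spends holding prepared work are the price you pay, which is why it only makes sense when more than one transactional resource really has to change together.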
Update 5/21 – this seems too long so I’m splitting it here…
