Category Archives: Transaction Processing

Do You Need to Develop for the Cloud?

And other notes from last week’s IASA NE cloud computing roundtable (more about the meeting format later – also note the Web site currently does not have up to date information – we are volunteers after all ;-).

We had good attendance – I counted 26 during the session, including folks from Pegasystems, Microsoft, Lattix, Kronos, SAP, The Hartford, Juniper, Tufts, Citizens Bank, Reliable Software, Waters, and Progress Software,  among others (I am sure I missed some), providing an interesting cross-section of views.

The major points were that no one is really thinking about the cloud as a place for accessing hosted functionality, and everyone is wondering whether or not they should be thinking about developing applications specifically for the cloud.

We touched on the landscape of various cloud computing offerings, highlighting the differences among Google, Microsoft, and others. Cloud vendors often seem to have started by trying to sell what they already had – Google has developed an extensive (and proprietary) infrastructure for highly available and scalable computing that it offers as Google App Engine (the idea is that someone developing their own Web app can plug into the Google infrastructure and achieve immediate “web scale”).

Another vendor had developed a complex database and functionality infrastructure for supporting multiple tenants of its hosted application, including its own Java-like language, which it now offers to potential cloud customers.

Microsoft’s Azure offering seems to be aiming for a sort of middle ground – MSDN for years has operated a Web site of comparable size and complexity to any of the others, but Microsoft also supplies one of the industry’s leading application development environments (the .NET Framework). The goal of Azure is to supply the services you need to develop applications that run in the cloud.

However, the people in the room seemed most interested in the idea of being able to set up internal and external clouds of generic compute capacity (like Amazon EC2) that could be related, perhaps using virtual machines, and having the “elasticity” to add and remove capacity as needed. This seemed to be the most attractive aspect of the various approaches to cloud computing out there. VMware was mentioned a few times since some of the attendees were already using VMware for internal provisioning and could easily imagine an “elastic” scenario if VMware were also available in the cloud in a way that would allow application provisioning to be seamless across internal and external hosts.

This brought the discussion back to the programming model, as in what would you have to do (if anything) to your applications to enable this kind of elasticity in deployment?

Cloud sites offering various bits of “functionality” typically also offer a specific programming model for that functionality (again, Google App Engine is an example, as is Microsoft’s Azure). The Microsoft folks in the room said that a future version of Azure would include the entire SQL Server, to help support the goal of reusing existing applications (which a lot of customers apparently have been asking about).

The fact that cloud computing services may constrain what an application can do raises the question of whether we should be thinking about developing applications specifically for the cloud.

The controversy about cloud computing standards was noted, but we did not spend much time on it. The common wisdom comments were made about being too early for standards, and about the various proposals lacking major vendor backing, and we moved on.

We did spend some time talking about security, and service level agreements, and it was suggested that certain types of applications might be better suited to deployment in the cloud than others, especially as these issues get sorted out. For example, company phonebook applications don’t typically have the same availability and security requirements that a stock trading or medical records processing application might have.

Certification would be another significant sign of cloud computing maturity, meaning certification for certain of the service level agreements companies look for in transactional applications.

And who does the data in the cloud belong to? What if the cloud is physically hosted in a different country? Legal requirements may dictate that data belonging to citizens of a given country be physically stored within the geographical boundary of that country.

And what about proximity of data to its processing? Jim Gray’s research was cited to say that it’s always cheaper to compute where the data is than to move the data around in order to process it.

Speaking of sending data around, however, what’s the real difference between moving data between the cloud and a local data center, and moving data between a company’s remote data center and a local one?

And finally, this meeting was my first experience with a fishbowl style session. We used four chairs, and it seemed to work well. This is also sometimes called the “anti-meeting” style of meeting, and seems a little like a “user-generated content” style of meeting.  No formal PPT.  At the end everyone said they had learned a lot and enjoyed the discussion. So apparently it worked!

Stay tuned for news of our next IASA NE meeting.

Second Edition of TP Book out Today

It’s hard to believe, but the second edition of Principles of Transaction Processing is finally available. The simple table of contents is here, and the entire front section, including the dedication, contents, introduction and details of what’s changed, is here. The first chapter is available here as a free sample.

Photo of an early press run copy

It definitely turned out to be a lot more work than we expected when we created the proposal for the second edition almost four years ago.  And of course we originally expected to finish the project much sooner, about two years sooner.

But the benefit of the delay is that we were able to include more new products and technologies, such as EJB3, JPA, Spring, the .NET Framework’s WCF and System.Transactions API, SOA, AJAX, REST/HTTP, and ADO.NET, even though it meant a lot more writing and rewriting.

The first edition was basically organized around the three-tier TP application architecture widely adopted at the time, using TP monitor products for examples of the implementations of the principles and concepts covered in the book. Then as now, we make sure what we describe is based on practical, real-world techniques, although we do mention a few topics more of academic interest.

The value of this book is that it explains how the world’s largest TP applications work – how they use techniques such as caching, remote communications (synchronous as well as asynchronous), replication, partitioning, persistence, queuing, database recovery, ACID transactions, long running transactions, performance and scalability techniques, locking, threading, business process management, and state management to process up to tens of thousands of transactions per second with high levels of reliability and availability. We explain the techniques in detail and show how they are programmed.

These techniques are used in airline reservation systems, stock trading systems, large Web sites, and in operational computing supporting virtually every sector of the economy. We primarily use Java EE-compliant application servers and Microsoft’s .NET Framework for product and code examples, but we also cover popular persistence abstraction mechanisms, Web services and REST/HTTP based SOA, Spring,  integration with legacy TP monitors (the ones still in use), and popular TP standards.

We also took the opportunity to look forward and include a few words about the potential impact on TP applications of current trends toward cloud computing, solid state memory, streams and event processing, and the changing design assumptions in the software systems used to power large Web sites.

Personally this has been a great project to work on, despite its challenges, complexities, and pressures. It could not have been done without the really exceptional assistance from 35 reviewers who so generously contributed their expertise on a wide variety of topics. And it has been really great to work with Phil again.

Finally, the book is dedicated to Jim Gray, who was so helpful to us in the first edition, reviewed the second edition proposal, and still generally serves as an inspiration to all of us who work in the field of transaction processing.

What we Learned Writing the Second Edition of the TP Book

After proofing the index last week, Phil and I are finally done with the second edition of the TP book!

A lot has changed since the first edition came out in 1997.

For one thing, the TP monitor is no longer the main product type used to construct TP applications. Many components formerly only found within TP monitors (such as forms systems for GUIs, databases, system administration tools, and communication protocols) have evolved to become independent products.

And although we can reasonably claim that the .NET Framework and Java EE compliant application servers are the preeminent development and production environments for TP applications, it seems as if the three-tier application architecture around which we were able to structure the first edition has evolved into a multitier architecture.

Another big change is represented by the emergence of “scale out” designs that are replacing “scale up” designs for large Web sites. The scale out designs tend to rely on different mechanisms than the scale up designs for implementing transaction processing features and functions – the scale out designs tend to rely much more on stateless and asynchronous communications protocols, for example.

By mid-2007 I had started to think it would be interesting to center the second edition around these new scale out architectures, like those implemented by large scale Web sites such as eBay, PayPal, SalesForce, BetFair, etc. Phil and I had a great opportunity to learn about what these companies were doing at HPTS in October of ’07. Unfortunately, it was impossible to identify sufficiently common patterns to do so, since each of the large Web sites has created a different way to implement a “scale out” architecture.

(BTW this was a fascinating conference, since the room was full of people who basically created the application servers, TP monitors, and relational databases successfully used in most “scale up” style TP applications. But they had to sit there and hear, over and over again, that these products just didn’t meet the requirements of large Web sites.)

Nonetheless we managed to fit everything in the book – how caching is done, replication, how replicas synchronize, descriptions of compensating and nested transactions, locking, two-phase commit, synchronous and asynchronous communications, RESTful transactions, and so on.

As in the first edition we have kept the focus on what’s  really being used in production environments (with the help of our many generous reviewers). We completely rewrote the product information to focus on current Java and .NET products and TP standards.

And finally, we identify the future trends toward commodity data centers, cloud computing, solid state disk, and multi-core processors, among others, which are going to significantly impact and influence the industry during the next decade or so.

One of the most interesting things I learned about in doing the book was how to design a transaction as a RESTful resource (see RESTful Web Services for more details). But once again, many of the familiar principles and concepts still apply – such as “pseudo-conversations” that have been used in TP applications for a long time to avoid holding locks during interactions with users.
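
To make the pseudo-conversation idea a little more concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is only an illustration under my own assumptions, not code from the book: the `seats` table, its `version` column, and the function names are all hypothetical. The point is that the read and the write are two separate short transactions, and the write succeeds only if nobody changed the row while the user was thinking.

```python
import sqlite3


def read_seat(conn, seat_id):
    """Step 1: a short read. Nothing stays locked while the user decides."""
    return conn.execute(
        "SELECT holder, version FROM seats WHERE id = ?", (seat_id,)
    ).fetchone()


def reserve_seat(conn, seat_id, holder, seen_version):
    """Step 2: a short write that succeeds only if the row is unchanged."""
    with conn:  # commit on success, rollback if an exception escapes
        cur = conn.execute(
            "UPDATE seats SET holder = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (holder, seat_id, seen_version),
        )
    # Zero rows updated means someone else got there first.
    return cur.rowcount == 1


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE seats (id INTEGER PRIMARY KEY, holder TEXT, version INTEGER)"
    )
    conn.execute("INSERT INTO seats VALUES (1, NULL, 0)")
    conn.commit()

    # Two users look at the same seat, then both try to book it.
    _, alice_version = read_seat(conn, 1)
    _, bob_version = read_seat(conn, 1)
    alice_ok = reserve_seat(conn, 1, "alice", alice_version)
    bob_ok = reserve_seat(conn, 1, "bob", bob_version)
    holder, _ = read_seat(conn, 1)
    return alice_ok, bob_ok, holder
```

In this sketch the second booking attempt fails cleanly and the application can show the user fresh data, which is roughly the trade the pseudo-conversation makes: no long-held locks, at the cost of an occasional retry.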

Fascinating to me are the different assumptions people make in the different worlds between the REST/HTTP “scale out” designs and the mainframe-derived “scale up” designs. This is likely to remain an area of continual evolution.

Saturday’s Tribute to Jim Gray

Hearing today from a friend who attended, I find myself wishing I’d gone out for it. Pat had even sent me an early draft of his contribution. I really should have.

I understand the videos will be posted in a couple of days – I will definitely check back for them.

In the meantime, John Markoff has published a nice writeup after the fact, and the LA Times did a good front page piece the day before (I notice Stonehenge also made page 1 that day).

I didn’t know Jim that well, but I knew him well enough to remember his exceptional kindness, friendship, and dedication. And as I have said many times, no one helped us more with our TP book than Jim. The last time I heard from him, in fact, was his (as usual) helpful review of our proposal for a second edition, which we are still trying to finish (hopefully this summer).

The best tribute I can think of is to do the kind of job he’d want me to do. I am just sorry it doesn’t really look like I will be hearing from him directly (and kindly, of course) how and where I didn’t quite measure up…

HPTS 2007 – time to reinvent everything?

It’s been a busy week between HPTS and the most recent OSGi enterprise expert group meeting. This entry is about HPTS. I’ll post another one about the OSGi meeting.

It was pretty clear from the presentations and discussions at last week’s HPTS Workshop that database and middleware software vendors are failing to meet the requirements of Web businesses, both large and small.


My room in the Julia Morgan designed Scripps building at Asilomar Conference Center

During the tribute to Jim Gray Tuesday evening Pat Helland showed a great video interview with Jim in which he mentions the need to scale things out, not up.

Pat Helland introduces the tribute to Jim Gray

The “scale up” thinking has a limit – there is always such a thing as the biggest box you can buy. But “scale out” does not have any limits – you can always add more computers to a cluster. The presentation from Google in particular was a really interesting explanation for some of the techniques they are using to achieve a massive scale out. Pat’s summary of it is something like “people are willing to trade consistency for latency.” And because current products are designed around the “scale up” concept, they don’t tend to work very well in the modern “scale out” environments.

One disappointment was that some scheduled presentations were pulled at the last minute. But the other presentations from Yahoo, Betfair, PayPal, Second Life, Rightnow, Workday, Google, and eBay all made it pretty clear that Web based businesses are developing custom solutions because off the shelf software currently does not cut it for them.

Aaron Brashears of Second Life

Mike Stonebraker’s presentation contained perhaps the most direct challenge to the status quo, arguing that industry changes during the past 30 years have invalidated the relational database and SQL. This provoked some lively discussion since among the attendees were several people who helped define and create the relational database and build DB2 and Oracle. One of his arguments is that the relational database just isn’t fast enough for modern requirements.

Some great discussions broke out between the representatives of the Web companies and the folks working for established software companies. For example, one of the Web guys commented that they did not want to have to buy the whole knife set (i.e. the complete product) when all they really needed was a paring knife, yet the only choice they are offered is to buy the whole set. (And here OSGi has some potential to change that dynamic – but more about that in the next post.)

“Bloatware” is definitely part of the issue. Ironically so are many of the things a lot of us have worked to define, create, and promote during the past two or three decades around guarantees of atomicity and isolation. None of the Web companies use distributed two-phase commit, for example. They use transactions only to dequeue an element and update a database in the same local transaction. So much for all that work on WS-TX! 😉
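
For readers who have not seen the “dequeue and update in the same local transaction” pattern, here is a rough sketch in Python with the standard sqlite3 module. It is my own illustration, not anything presented at the workshop: the `queue` and `stock` tables and the function name are hypothetical, and the queue simply lives in the same database as the data so that one local transaction covers both.

```python
import sqlite3


def process_next(conn):
    """Dequeue one order and apply it, atomically, in one local transaction."""
    with conn:  # commit on success, rollback if an exception escapes
        row = conn.execute(
            "SELECT id, item, qty FROM queue ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # queue is empty
        msg_id, item, qty = row
        conn.execute("DELETE FROM queue WHERE id = ?", (msg_id,))
        cur = conn.execute(
            "UPDATE stock SET on_hand = on_hand - ? "
            "WHERE item = ? AND on_hand >= ?",
            (qty, item, qty),
        )
        if cur.rowcount != 1:
            # Raising here rolls back the DELETE too, so the order stays queued.
            raise RuntimeError("cannot fill order")
        return msg_id


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE queue (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER)"
    )
    conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, on_hand INTEGER)")
    conn.executemany(
        "INSERT INTO queue VALUES (?, ?, ?)",
        [(1, "widget", 3), (2, "widget", 10)],
    )
    conn.execute("INSERT INTO stock VALUES ('widget', 5)")
    conn.commit()

    first = process_next(conn)  # fills order 1
    try:
        process_next(conn)      # order 2 asks for more than is on hand
        second_failed = False
    except RuntimeError:
        second_failed = True
    remaining = conn.execute("SELECT COUNT(*) FROM queue").fetchone()[0]
    on_hand = conn.execute("SELECT on_hand FROM stock").fetchone()[0]
    return first, second_failed, remaining, on_hand
```

Because the queue and the data share one resource manager, no distributed two-phase commit is needed: either the message is consumed and the update lands, or neither happens.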

Anyway, the idea seems much more to be how to replicate data across multiple computers, so that if one of them crashes another one can take its place, to write programs assuming that failures will occur early and often (since they are going to occur, why not assume they will occur often?), and to allow the replacement of disks and computers as needed without taking the application down.

It will be very interesting to observe over the next few years the extent to which the ideas and techniques in the custom built solutions become more widely adopted and incorporated into commercial products. One of the inevitable questions, as raised during the discussions, is how broad the market is for such things as Google’s file system and Bigtable, or Amazon’s S3 and Dynamo.

In the old days we always talked about moving mainframe based systems to distributed environments, but maybe our mistake was in trying to replicate the features and functions of the mainframe. These new “scale out” architectures may finally accomplish it.

Some of the way too many photos 😉

View from outside the dining hall

Me and Ed Lassettre

One of the boardwalks through the dunes between the conference center and the beach

View of Monterey Bay from one of the boardwalks

Down at the beach, view from the end of the dunes

Down at the ocean, the tide coming in around the seaweed

Up on the path someone walks his dog

Through the rocks, the clouds and the surf

The path beside the beach, looking north

Surf and mist, looking southward

The Julia Morgan designed chapel, where the workshop is held

Is an application server needed for transaction processing?

I think I have mentioned before that Phil and I are in the middle of trying to update our 1997 book, Principles of Transaction Processing.

It is slow going. After more than 10 years, quite a bit has changed. When we first started discussing the outline for the new edition a couple of years ago, we thought that we could simply substitute the JEE compliant application server for the legacy TP monitor products, and maintain a lot of the existing book structure. After all, EJBs were designed for compatibility with existing transaction management systems (as are almost all TP related standards). EJBs preserved the classic three-tier architecture for scaling up the access to a shared database on behalf of online users, after all. And the .NET Framework is arguably another example of an application server, at least architecturally.

But the more recent trend toward lightweight containers (epitomized by Spring Beans) and infrastructure designed for SOA (such as ESBs), not to mention Microsoft’s Windows Communication Foundation and Web services, which all include transaction management, indicates that we needed to take a broader view. Developers seem to be turning toward less complex environments for basic TP tasks.

It also seems as if application server based systems have encountered scalability limits at the very high end – they seem to be considered too much overhead for Internet scale TP systems such as online travel or gambling, or to sustain the traffic levels at eBay, Google, and Amazon.

So it was interesting for me to find, while researching Spring transactions, a question right on the front of the tutorial: “Is an application server needed for transaction management?”

I think the answer is “no,” because simple to moderate TP applications can use JDBC/ODBC connection objects for basic transaction management capabilities, such as transaction bracketing and multithreading, and because modern database management systems support high performance.
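
As a small illustration of what “transaction bracketing” on a connection object looks like, here is a sketch in Python using the standard sqlite3 module rather than JDBC or ODBC (that substitution is mine, made only so the example is self-contained). The `accounts` table and the `transfer` function are hypothetical. The database’s own transaction manager does all the work; there is no application server involved.

```python
import sqlite3


def transfer(conn, from_id, to_id, amount):
    """Move funds between two accounts inside one local transaction."""
    try:
        # The connection brackets the work: commit on normal exit,
        # rollback if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, from_id),
            )
            row = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (from_id,)
            ).fetchone()
            if row is None or row[0] < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, to_id),
            )
        return True
    except ValueError:
        return False  # the debit above was rolled back


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
    conn.commit()

    ok = transfer(conn, 1, 2, 70)      # succeeds and commits
    failed = transfer(conn, 1, 2, 70)  # would overdraw, so it rolls back
    balances = dict(conn.execute("SELECT id, balance FROM accounts"))
    return ok, failed, balances
```

The same shape applies with a JDBC connection (turn off auto-commit, do the work, then commit or roll back), though the details there differ from this sketch.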

Of course for many medium to large scale TP applications, an application server may still be the best solution. But then again at the very high end, it seems like the application server becomes a bottleneck. The trend here seems to be toward distributed memory caching added to direct database access via the connection objects.

One problem, at least in JEE, is that developers are faced with the prospect of having to recode their applications when moving from smaller scale systems to systems that require distributed transaction processing. The code that you develop for a single (or “local”) database transaction has to be rewritten if you need to include multiple data resource managers into a distributed transaction (i.e. 2pc or “global” transaction).

The Spring folks highlight this issue as a reason for going with Spring transactions: “The most important point is that with the Spring Framework you can choose when to scale your application up to a full-blown application server. Gone are the days when the only alternative to using EJB CMT or JTA was to write code using local transactions such as those on JDBC connections, and face a hefty rework if you ever needed that code to run within global, container-managed transactions.”

Actually this nasty bit of recoding is due to a flaw in the EJB standard, which is based on JTA, which in turn is (typically) based on JTS, which itself is based on OTS. Turns out some of the vendors working on OTS had existing TP infrastructures (OTS was created in the early 90s, after all) that could not automatically promote a single resource transaction to a multi-resource transaction – i.e. could not go from a one phase commit environment to a two phase commit environment. So the standard only supports a two-phase commit environment, with one phase as an “optimization.”

When I joined IONA back in 1999 I had a hell of a time convincing the guys working on OTS that this so-called optimization was nothing of the sort. It said so in the spec, so it must be true, I guess is what they must have been thinking. But if all you are using is a single database, there’s no reason at all for an independent transaction manager (which is what OTS/JTS is). You use the transaction manager in the database, just like the JDBC/ODBC connection objects do. In that case the independent transaction manager is unnecessary overhead. But because of this compromise in the standard, intended to encourage wider adoption of the standard (i.e. more vendors could implement that design) the only choice for EJB developers has – at least for all practical purposes – been a global transaction. I mean, you have the option of using JTA directly, but this is strongly discouraged.
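
The difference between the one-phase and two-phase paths can be shown with a toy coordinator in plain Python. To be clear about what is assumed: this is not the OTS, JTS, or JTA API, and the `Resource` class and `commit_transaction` function are hypothetical names invented for the illustration. It only shows that a single resource needs no prepare round, while two or more resources need a vote before anyone commits.

```python
class Resource:
    """Toy resource manager; `can_commit` stands in for a real prepare vote."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []  # records which calls the coordinator made

    def prepare(self):
        self.log.append("prepare")
        return self.can_commit

    def commit(self):
        self.log.append("commit")

    def rollback(self):
        self.log.append("rollback")


def commit_transaction(resources):
    """Use one phase for a single resource, two phases otherwise."""
    if len(resources) == 1:
        # One resource: its own transaction manager decides; no prepare round.
        resources[0].commit()
        return "one-phase"
    # Phase 1: ask every resource to vote.
    votes = [r.prepare() for r in resources]
    if all(votes):
        # Phase 2: everyone voted yes, so tell everyone to commit.
        for r in resources:
            r.commit()
        return "two-phase"
    # At least one resource voted no: undo everywhere.
    for r in resources:
        r.rollback()
    return "rolled-back"
```

A real coordinator also has to log its decision durably and handle failures between the phases, which is where most of the overhead and complexity come from; the sketch leaves all of that out.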

So how to resolve the problem? Scale up code that uses database connection objects to manage transactions? And avoid the overhead of EJB global transactions, container managed transactions, etc.? EJB 3.0 is intended to help address some of these concerns, but now developers have a wide choice of technologies to use, including JPA, JDO, Hibernate, SDO, JDBC, caching, etc. What is the best solution? Or maybe the answer is: “it depends.”

In which case I am afraid we have our work cut out for us 😉

Response to Gregor on transactions

Update June 7: I realized after I published this that I forgot to thank Gregor for his kind words about IONA. As long as I’m at it I should also have added that it strikes me that Gregor is a “right tool for the job” kind of guy, too.
Gregor has posted a very interesting reply to part of my entry about his presentation on InfoQ.
I don’t see a place to post a comment on his blog so I will answer here.
I did read through the Starbucks piece before, probably a year or so ago, and it is a nice explanation of why transactions aren’t needed in some situations. I also realize it is just an analogy to illustrate a good point (i.e. don’t automatically think you need transactions, use them appropriately etc.). I am a “right tool for the job” type person so I completely agree with him on that.
But I started thinking about it again after watching the presentation and it struck me that it was a bit like suggesting that you would consider using transactions to run a crane or something. Transactions are designed to capture data about something in the real world, not to interact with the real world directly. So of course you wouldn’t use transactions to serve coffee.
Anyway, I think one of the reasons I reacted to it a bit differently when he brought it up in the presentation (which I still absolutely recommend watching) is that there’s been a lot of negative stuff about transactions lately – and this just as I’ve been finishing up about 5 years of work on the specifications for Web services transactions. I think transactions are being misrepresented as something evil for an SOA environment, which they are not.
I really hope I am not guilty of being too religious about this stuff just because I’ve been working on it for a long time 😉 But I really think that transactions should not be dismissed as inappropriate for SOA without first evaluating whether or not you need them, or would benefit from them. And to do that it’s necessary to understand what transactions were designed for, and how they should be used appropriately.
It is definitely interesting to see some of the alternatives being proposed. Mark Little and I proposed one ourselves in the WS-CAF work – one that supports the federation of different transaction models, and is designed to support long running asynchronous environments like those in loosely coupled SOA and BPM types of deployments. As far as we could tell though no one seemed very interested in it, and we’ve just sort of let the spec sit there.
I find it hard to believe (in following one of the links Gregor posted) that eBay doesn’t use any transactions at all, but I haven’t had time to really read through everything yet. I can certainly believe they don’t use distributed transactions, but database transactions are pretty fast. Anyway, I will finish reading up on those things – thanks for the pointers. This comes at a good time because Phil and I are trying to finish up the second edition of Principles of Transaction Processing.
ps Gregor – the reason I didn’t meet you at this year’s Microsoft MVP Summit is that I could only attend the first day and a half…maybe next time!

Thinking about Jim Gray (again)

Yesterday I wrote “Thinking about Jim Gray” because I had been thinking about him on and off for most of the week. It looked like the search was over, and I wanted to say something.
Today I find myself unable to stop thinking about him. It’s partly because I’ve been working on updating the TP book, and that keeps him in my thoughts because of his relationship to the book, but it’s also because updating the book involves a lot of tedious work, and my mind tends to wander off.
So much is out there about him. I take breaks from the manuscript and search the news and the blogs. It’s unbelievable. Now the search is continuing through private efforts and searching photos on the Web.
I did go to the Amazon site, and I went through some of the photos. It seemed like searching for the proverbial needle in a haystack, but I guess you never know what might help. Can you imagine if one of us finds him that way? It is already becoming a phenomenon.
A good place to find out what’s going on is the Tenacious search blog. It summarizes what everyone is doing — computer analysis, postering, analyzing cell phone records, shipping records, Web cams, reports from the family, etc.
You can also see some of the photos here in a different format from how they’re presented on Amazon.
Some folks suggest Jim might just have kept on sailing to Mexico or somewhere else across the Pacific…
If we knew what happened to him, that would be one thing. For example, I do not want to write in the past tense, not yet. Although the news isn’t good, there is still hope.
I guess what mainly strikes me is the huge amount of interest. Everyone who works with him says what a great guy he is, and it’s amazing how much he’s contributed to computer science. Everyone seems to feel about him the way I do — as a friend, but more — someone to really look up to.
As I work on the book I find myself thinking about him and the example he set.
Wherever you are, Jim, I hope you can sense some of what’s going on – and see how you have truly touched so many lives. So many of us thinking about you, and still hoping you are well.

Thinking of Jim Gray

Update Feb 3. Help find Jim. Also: Blog tracking search progress. (Doesn’t require logging into Amazon.)
He is not only one of the smartest guys I ever met, he’s also one of the nicest.
I was lucky enough to be working at Digital in the late 80s when we started hiring the big brains in transaction processing (including Jim) to help us compete against IBM. Never mind that we should probably have been focused on personal computers, instead. For someone whose first job was converting batch applications into online TP applications, and who loves TP as much as I do, it was a great time.
He visited Digital’s TP headquarters in Littleton, MA (where I worked) often and his presence was always felt throughout the building.
I last heard from him about a year ago, when he sent comments on the proposal for the second edition of the TP book Phil and I wrote (we are working on that second edition right now in fact).
Although it might have been easy for him to consider our effort to be in competition with his book, he actually gave us the most thorough review of the manuscript and some very helpful comments, and agreed to write the foreword.
The tremendous industry reaction, as seen in the many articles and blog entries, is completely understandable. Jim invented, or helped invent, many technologies fundamental to the way the world works every day, including the relational database, high availability and fault tolerance mechanisms, scalability algorithms, transaction processing mechanisms, and many many others. Yet you would never hear him boast about it. He always preferred to think of himself as part of the team.
When he won the Turing Award I sent him a congratulatory email and received back a characteristically very nice and humble message.
I cannot be too sad, because he is still just missing, but that news was a great shock, and the fact that it has been almost a week now is not good.

WS-Transactions update

Ok, so now that I’ve written about what we did after we finished our work, how about I write about the work we did?
After the WS-TX face to face meeting in Redmond, Stefan Tilkov emailed me some questions which ended up in an interview posted on the InfoQ site, which covers the big picture pretty well.
Since WS-TX was chartered about a year ago, we have been working to refine the three submitted V1.0 specs and progress them to achieve a status as the adopted standards of an independent consortium. Ultimately standards are all about adoption – many specs have been written that go nowhere, while other technologies have become standard without ever going through the committee process.
For the WS-Transactions specifications I’d say we are getting what you’d call sufficient critical mass: IBM, Microsoft, Red Hat (JBoss), and IONA all currently provide implementations. In addition we also have regular participation from Sun, Hitachi, Oracle, Nortel, Fujitsu, Adobe, Tibco, Choreology, and individuals (John Harby) – all of whom attended this face to face either in person or via phone. This is pretty good considering the work is nearly completed.
So what do we spend our time doing? Processing issues from the issue list.
The email archives are public, as are the documents and meeting minutes. You can tell we spend a lot of time debating the exact wording.
The standards process is all about opening up the discussion to anyone interested in participating. The goal is consensus and broad adoption and typically the price is speed. It is harder to get a larger group to agree on something than it is to get a smaller group to agree on something.
In the case of Web services specifications, many times the specifications are fairly mature by the time they are submitted to an OASIS or W3C committee.
In general the way the work progresses is that anyone interested can review a specification and submit an issue. The TC members then debate and typically vote to resolve an issue. Often resolving an issue involves a change to the specification, and the cycle starts again. The expectation is that each new cycle brings the specification closer to completion, because the issues become less significant as the spec is refined.
Because we are getting close to the end, we have been spending more time on “fit and finish” issues and polishing up the text than we did when we first started. One of the major issues for the recent face to face was getting the specs consistent with RFC 2119 (yes, there is a standard for using certain words in standards ;-).
On the WS-TX TC home page you can also find links to home pages for each of the three specifications.
WS-C V1.1 and WS-AT V1.1 entered their 60-day public review in mid-September, and so we also had the chance at the F2F to discuss and resolve issues submitted during the public review process, which is basically the final cycle. WS-BA 1.1 will be going into the public review phase soon, again based on the work we did during the F2F.
Once the specifications have completed their public reviews the next step will be to submit them to become OASIS standards. If they are accepted, the work of the TC is essentially completed.