Monthly Archives: September 2007

Open Source for Business – Really

Mike Herrick calls attention to a recent article about PayPal and its use of open source to run its business.

Among the more interesting aspects of the article is that Scott Thompson was working for Visa prior to taking the CTO job at PayPal. Visa apparently would not have considered using an open source platform for processing transactions. Of course, Visa started out a long time ago.

This is as clear an example as you can get of the difference between pre-Web and post-Web IT architectures. Pre-Web transaction processing was all about mainframes, and about middleware systems that tried to duplicate the mainframe environment in a distributed world. Open source seems foreign there since it didn’t really grow up as part of the traditional environment. Post-Web transaction processing is all about scaling up and partitioning out using racks of commodity hardware and open source.

Web sites often started small, using any resource at hand (i.e. free open source software), and then needed to grow and grow to unprecedented volumes while still maintaining reasonable cost control. Traditional solutions designed to support the scale of a single enterprise did not meet the requirements of the Web environment – certainly not at a sustainable cost.

Running a business on open source? Sure, we see it every day. And with our new release of Fuse we are likely to see it more and more. Even customers who traditionally would not consider open source for the “family jewels” are starting to use it in some projects, including mission-critical applications. The Internet businesses are really seen as leading the way here.

The open source trend is not stopping at the operating system, either. That should be clear by now. Open source middleware, databases, service enablement, and routing/mediation engines are continuing to grow in popularity.

The industry has seen hardware go through a commoditization cycle, in which the cheapest solutions are constructed of the best standard parts from various sources. Competition at the component level (CPU, disk, display, etc.) helped drive down prices.

Now we are beginning to see the commoditization of enterprise software, with Linux probably the big breakthrough. Again, in the spirit of competition, it has helped drive down prices. And as the TPC has known for years, it’s not only how many transactions you can process that’s important, it’s also the cost per transaction.

It is natural now to “move up the stack” to open source middleware, database management systems, messaging systems, service enablement, routing & mediation…

And the post-Web IT architectures successfully deployed at PayPal, Amazon.com, Google, eBay, and others are increasingly being looked at as examples of how to drive down cost, improve performance and availability, and sustain constant change.

More on open source and the new Fuse release from Guillaume and James and Hiram.

Annoyances (1): Business Card Etiquette

Every now and then something bugs me, so I am starting a new series of occasional postings (i.e. whenever the mood strikes) called “annoyances.”

This is the first one.

When I exchange business cards with you at a conference or other industry event, that is not an invitation to put my name on your email spam list!

Thanks.

Is an application server needed for transaction processing?

I think I have mentioned before that Phil and I are in the middle of trying to update our 1997 book, Principles of Transaction Processing.

It is slow going. After more than 10 years, quite a bit has changed. When we first started discussing the outline for the new edition a couple of years ago, we thought that we could simply substitute the JEE-compliant application server for the legacy TP monitor products and keep much of the existing book structure. After all, EJBs were designed for compatibility with existing transaction management systems (as are almost all TP-related standards), and they preserved the classic three-tier architecture for scaling up access to a shared database on behalf of online users. And the .NET Framework is arguably another example of an application server, at least architecturally.

But the more recent trend toward lightweight containers (epitomized by Spring Beans) and infrastructure designed for SOA (such as ESBs), not to mention Microsoft’s Windows Communication Foundation and Web services, all of which include transaction management, indicates that we needed to take a broader view. Developers seem to be turning toward less complex environments for basic TP tasks.

It also seems as if application server based systems have encountered scalability limits at the very high end – they seem to be considered too much overhead for Internet-scale TP systems such as online travel or gambling, or for sustaining the traffic levels at eBay, Google, and Amazon.

So it was interesting for me to find, while researching Spring transactions, a question right on the front of the tutorial: “Is an application server needed for transaction management?”

I think the answer is “no,” because simple to moderate TP applications can use JDBC/ODBC connection objects for basic transaction management capabilities, such as transaction bracketing and multithreading, and because modern database management systems support high performance.
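As a rough illustration of transaction bracketing on a connection object, here is a minimal sketch. It uses Python's standard sqlite3 module purely as a stand-in for a JDBC/ODBC connection; the accounts table and the transfer, open_demo_db, and balances helpers are hypothetical names invented for the example, not anything from a real TP product.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory database with a hypothetical accounts table."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # BEGIN / COMMIT / ROLLBACK statements below are the only brackets.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 20)]
    )
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst inside one local transaction.

    Returns True if the transaction committed, False if it rolled back.
    """
    try:
        conn.execute("BEGIN")  # open the transaction bracket
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, src),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, dst),
        )
        conn.execute("COMMIT")  # close the bracket: both updates become durable
        return True
    except Exception:
        # Any failure undoes every statement issued since BEGIN.
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

The database's own transaction manager does all the work here: the application only marks where the transaction starts and ends, and no separate coordinator is involved.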

Of course for many medium to large scale TP applications, an application server may still be the best solution. But then again at the very high end, it seems like the application server becomes a bottleneck. The trend here seems to be toward distributed memory caching added to direct database access via the connection objects.

One problem, at least in JEE, is that developers are faced with the prospect of having to recode their applications when moving from smaller scale systems to systems that require distributed transaction processing. The code that you develop for a single (or “local”) database transaction has to be rewritten if you need to include multiple data resource managers into a distributed transaction (i.e. 2pc or “global” transaction).

The Spring folks highlight this issue as a reason for going with Spring transactions: “The most important point is that with the Spring Framework you can choose when to scale your application up to a full-blown application server. Gone are the days when the only alternative to using EJB CMT or JTA was to write code using local transactions such as those on JDBC connections, and face a hefty rework if you ever needed that code to run within global, container-managed transactions.”

Actually this nasty bit of recoding is due to a flaw in the EJB standard, which is based on JTA, which in turn is (typically) based on JTS, which itself is based on OTS. It turns out that some of the vendors working on OTS had existing TP infrastructures (OTS was created in the early 90s, after all) that could not automatically promote a single-resource transaction to a multi-resource transaction – i.e. could not go from a one-phase commit environment to a two-phase commit environment. So the standard only supports a two-phase commit environment, with one phase as an “optimization.”

When I joined IONA back in 1999 I had a hell of a time convincing the guys working on OTS that this so-called optimization was nothing of the sort. It said so in the spec, so it must be true, I guess is what they must have been thinking. But if all you are using is a single database, there’s no reason at all for an independent transaction manager (which is what OTS/JTS is). You use the transaction manager in the database, just like the JDBC/ODBC connection objects do. In that case the independent transaction manager is unnecessary overhead. But because of this compromise in the standard, intended to encourage wider adoption (i.e. more vendors could implement that design), the only choice for EJB developers has – at least for all practical purposes – been a global transaction. I mean, you have the option of using JTA directly, but this is strongly discouraged.
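To make the one-phase versus two-phase distinction concrete, here is a toy sketch of a commit coordinator. It is not the OTS, JTS, or JTA API; the Resource class and the complete function are hypothetical names made up for illustration, and the model ignores logging, timeouts, and recovery. It only shows how many steps each path takes.

```python
class Resource:
    """A toy resource manager (hypothetical stand-in for a database or queue)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []  # records each call the coordinator makes

    def prepare(self):
        """Phase 1: vote on whether this resource is able to commit."""
        self.log.append("prepare")
        return self.can_commit

    def commit(self):
        """Phase 2: make the prepared work permanent."""
        self.log.append("commit")

    def rollback(self):
        """Undo the work, whether or not it was prepared."""
        self.log.append("rollback")

    def commit_one_phase(self):
        """Let the resource decide the outcome by itself, with no prepare round."""
        if self.can_commit:
            self.log.append("commit")
            return True
        self.log.append("rollback")
        return False


def complete(resources):
    """Finish a transaction and return 'committed' or 'rolled back'."""
    if len(resources) == 1:
        # Single resource: its own transaction manager decides the outcome.
        return "committed" if resources[0].commit_one_phase() else "rolled back"

    # Several resources: collect votes first (stops at the first "no") ...
    if all(r.prepare() for r in resources):
        # ... and commit everywhere only if every vote was "yes".
        for r in resources:
            r.commit()
        return "committed"

    for r in resources:
        r.rollback()
    return "rolled back"
```

With one resource, the log shows a single commit step. With two, every resource is asked to prepare before anything commits, which is the extra round an independent transaction manager adds.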

So how to resolve the problem? Scale up code that uses database connection objects to manage transactions? And avoid the overhead of EJB global transactions, container managed transactions, etc.? EJB 3.0 is intended to help address some of these concerns, but now developers have a wide choice of technologies to use, including JPA, JDO, Hibernate, SDO, JDBC, caching, etc. What is the best solution? Or maybe the answer is: “it depends.”

In which case I am afraid we have our work cut out for us 😉