Do You Need to Develop for the Cloud?

And other notes from last week’s IASA NE cloud computing roundtable (more about the meeting format later – also note the Web site currently does not have up-to-date information – we are volunteers after all ;-).

We had good attendance – I counted 26 during the session, including folks from Pegasystems, Microsoft, Lattix, Kronos, SAP, The Hartford, Juniper, Tufts, Citizens Bank, Reliable Software, Waters, and Progress Software, among others (I am sure I missed some), providing an interesting cross-section of views.

The major points were that no one is really thinking about the cloud as a place for accessing hosted functionality, and that everyone is wondering whether they should be thinking about developing applications specifically for the cloud.

We touched on the landscape of various cloud computing offerings, highlighting the differences among Google, Salesforce.com, Microsoft, and Amazon.com. Cloud vendors often seem to have started with trying to sell what they already had – Google has developed an extensive (and proprietary) infrastructure for highly available and scalable computing that they offer as Google App Engine (the idea is that someone developing their own Web app can plug into the Google infrastructure and achieve immediate “web scale”).

And Salesforce.com has developed a complex database and functionality infrastructure for supporting multiple tenants of their hosted application, including their own Java-like language, which they offer to potential cloud customers as Force.com.

Microsoft’s Azure offering seems to be aiming for a sort of middle ground – MSDN for years has operated a Web site of comparable size and complexity to any of the others, but Microsoft also supplies one of the industry’s leading application development environments (the .NET Framework). The goal of Azure is to supply the services you need to develop applications that run in the cloud.

However, the people in the room seemed most interested in the idea of being able to set up internal and external clouds of generic compute capacity (like Amazon EC2) that could be related, perhaps using virtual machines, and having the “elasticity” to add and remove capacity as needed. This seemed to be the most attractive aspect of the various approaches to cloud computing out there. VMware was mentioned a few times since some of the attendees were already using VMware for internal provisioning and could easily imagine an “elastic” scenario if VMware were also available in the cloud in a way that would allow application provisioning to be seamless across internal and external hosts.

This brought the discussion back to the programming model, as in what would you have to do (if anything) to your applications to enable this kind of elasticity in deployment?

Cloud sites offering various bits of “functionality” typically also offer a specific programming model for that functionality (again Google App Engine and Force.com are examples, as is Microsoft’s Azure). The Microsoft folks in the room said that a future version of Azure would include the entire SQL Server, to help support the goal of reusing existing applications (which a lot of customers apparently have been asking about).

The fact that cloud computing services may constrain what an application can do raises the question of whether we should be thinking about developing applications specifically for the cloud.

The controversy about cloud computing standards was noted, but we did not spend much time on it. The usual comments were made about it being too early for standards and about the various proposals lacking major vendor backing, and we moved on.

We did spend some time talking about security and service level agreements, and it was suggested that certain types of applications might be better suited to deployment in the cloud than others, especially as these issues get sorted out. For example, company phonebook applications don’t typically have the same availability and security requirements that a stock trading or medical records processing application might have.

Certification would be another significant sign of cloud computing maturity, meaning certification against the kinds of service level agreements companies look for in transactional applications.

And who does the data in the cloud belong to? What if the cloud is physically hosted in a different country? Legal issues may dictate that data belonging to citizens of a given country be physically stored within the geographical boundaries of that country.

And what about proximity of data to its processing? Jim Gray’s research was cited to say that it’s always cheaper to compute where the data is than to move the data around in order to process it.
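As a rough back-of-the-envelope illustration of that point, here is a tiny calculation comparing the cost of processing a terabyte in place against shipping it over the WAN first. All of the dollar figures are made-up placeholders, not Gray’s numbers; the point is only that the transfer charge tends to dominate unless the computation is very heavy relative to the data size.

```java
// Back-of-envelope comparison: process 1 TB where it lives vs. ship it first.
// All figures below are illustrative assumptions, not measured or quoted costs.
public class DataGravity {
    public static void main(String[] args) {
        double dataGb = 1024.0;          // 1 TB of data to analyze
        double wanCostPerGb = 0.10;      // assumed WAN transfer cost, $ per GB
        double cpuCostPerHour = 0.10;    // assumed cost of an hour of compute, $
        double hoursToProcess = 10.0;    // assumed compute time for the job

        double computeInPlace = hoursToProcess * cpuCostPerHour;
        double shipThenCompute = dataGb * wanCostPerGb + computeInPlace;

        System.out.printf("Process in place:    $%.2f%n", computeInPlace);
        System.out.printf("Move data, then run: $%.2f%n", shipThenCompute);
        // With these assumptions the transfer charge swamps the compute charge,
        // which is the intuition behind "compute where the data is."
    }
}
```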

Speaking of sending data around, however, what’s the real difference between moving data between the cloud and a local data center, and moving data between a company’s own remote and local data centers?

And finally, this meeting was my first experience with a fishbowl style session. We used four chairs, and it seemed to work well. This is also sometimes called the “anti-meeting” style of meeting, and seems a little like a “user-generated content” style of meeting.  No formal PPT.  At the end everyone said they had learned a lot and enjoyed the discussion. So apparently it worked!

Stay tuned for news of our next IASA NE meeting.

Second Edition of TP Book out Today

It’s hard to believe, but the second edition of Principles of Transaction Processing is finally available. The simple table of contents is here, and the entire front section, including the dedication, contents, introduction and details of what’s changed, is here. The first chapter is available here as a free sample.

Photo of an early press run copy

It definitely turned out to be a lot more work than we expected when we created the proposal for the second edition almost four years ago.  And of course we originally expected to finish the project much sooner, about two years sooner.

But the benefit of the delay is that we were able to include more new products and technologies, such as EJB3, JPA, Spring, the .NET Framework’s WCF and System.Transactions API, SOA, AJAX, REST/HTTP, and ADO.NET, even though it meant a lot more writing and rewriting.

The first edition was basically organized around the three-tier TP application architecture widely adopted at the time, using TP monitor products for examples of the implementations of the principles and concepts covered in the book. Then as now, we make sure what we describe is based on practical, real-world techniques, although we do mention a few topics more of academic interest.

The value of this book is that it explains how the world’s largest TP applications work – how they use techniques such as caching, remote communications (synchronous as well as asynchronous), replication, partitioning, persistence, queuing, database recovery, ACID transactions, long running transactions, performance and scalability techniques, locking, threading, business process management, and state management to process up to tens of thousands of transactions per second with high levels of reliability and availability. We explain the techniques in detail and show how they are programmed.
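To give a flavor of what “how they are programmed” means, here is a minimal sketch of demarcating an ACID transaction with the standard JTA UserTransaction API in a Java EE container. The DebitCreditService interface and its two operations are hypothetical placeholders, not examples taken from the book; the point is simply that the debit and credit either both commit or both roll back.

```java
import javax.annotation.Resource;
import javax.transaction.UserTransaction;

// Hypothetical service backed by transactional resources (e.g. a database).
interface DebitCreditService {
    void debit(String account, long amount);
    void credit(String account, long amount);
}

// Minimal sketch: bean-managed transaction demarcation with JTA.
public class TransferBean {

    @Resource
    private UserTransaction utx;          // injected by the Java EE container

    public void transfer(DebitCreditService svc, String from, String to, long amount)
            throws Exception {
        utx.begin();                      // start the ACID transaction
        try {
            svc.debit(from, amount);      // both operations succeed or fail together
            svc.credit(to, amount);
            utx.commit();
        } catch (Exception e) {
            utx.rollback();               // any failure undoes all the work
            throw e;
        }
    }
}
```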

These techniques are used in airline reservation systems, stock trading systems, large Web sites, and in operational computing supporting virtually every sector of the economy. We primarily use Java EE-compliant application servers and Microsoft’s .NET Framework for product and code examples, but we also cover popular persistence abstraction mechanisms, Web services and REST/HTTP based SOA, Spring, integration with legacy TP monitors (the ones still in use), and popular TP standards.

We also took the opportunity to look forward and include a few words about the potential impact on TP applications of current trends toward cloud computing, solid state memory, streams and event processing, and the changing design assumptions in the software systems used to power large Web sites.

Personally this has been a great project to work on, despite its challenges, complexities, and pressures. It could not have been done without the really exceptional assistance from 35 reviewers who so generously contributed their expertise on a wide variety of topics. And it has been really great to work with Phil again.

Finally, the book is dedicated to Jim Gray, who was so helpful to us in the first edition, reviewed the second edition proposal, and still generally serves as an inspiration to all of us who work in the field of transaction processing.

IASA NE Roundtable June 23 on Cloud Computing

I just wanted to pass along the notice for the June 23 meeting of the IASA NE Chapter, which will be a roundtable on “cloud computing” hosted by Microsoft but chaired by Michael Stiefel of Reliable Software.

Details and registration

(No, you do not need to be a member of IASA, although of course we encourage that. Basic membership is free, and full membership is only $35.)

What should be interesting this time is that everyone seems to be doing something slightly different around cloud computing, whether it’s Amazon, Microsoft, Google, VMware, SalesForce, etc. Cloud computing is definitely an exciting new area, but like many other new and exciting areas it is subject to considerable hype and exaggeration. Good questions include:

  • What exactly can you do in the cloud?
  • What are the restrictions, if any, on the kind of programs and data you can put “in the cloud”?
  • Can you set up a private cloud if you want to?

I think the trend toward cloud computing is closely related to the trend toward commodity data centers, since you kind of need one of those to offer a cloud service in the first place. Like the ones James Hamilton describes in this PowerPoint presentation, which I heard him present at HPTS 2007. (Looks like James has left Microsoft and joined Amazon, BTW.)

I would expect a lot of heated discussion from the folks who usually attend the IASA meetings. Attendance has been steadily increasing since we founded the local chapter about a year ago, so I would hope for and expect a very lively discussion.

As usual, the event includes networking time and food & drinks (not usually beer though – have to work on that I guess 😉).

Please be sure to register in advance so Microsoft knows how much food to buy.

Thanks & hope to see you there!

Eric

What we Learned Writing the Second Edition of the TP Book

After proofing the index last week, Phil and I are finally done with the second edition of the TP book!

A lot has changed since the first edition came out in 1997.

For one thing, the TP monitor is no longer the main product type used to construct TP applications. Many components formerly found only within TP monitors — such as forms systems for GUIs, databases, system administration tools, and communication protocols — have evolved to become independent products.

And although we can reasonably claim that the .NET Framework and Java EE compliant application servers are the preeminent development and production environments for TP applications, it seems as if the three-tier application architecture around which we were able to structure the first edition has evolved into a multitier architecture.

Another big change is represented by the emergence of “scale out” designs that are replacing “scale up” designs for large Web sites. The scale out designs tend to rely on different mechanisms than the scale up designs for implementing transaction processing features and functions – much more on stateless and asynchronous communication protocols, for example.
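As a small, hedged illustration of that difference, here is a sketch of the kind of stateless, fire-and-forget interaction a scale out design might favor, using the standard JMS API to drop work onto a queue instead of making a blocking call. The class, queue, and payload are all made up for illustration.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;

// Illustrative sketch: asynchronous submission via a JMS queue. In practice the
// ConnectionFactory and Queue would come from JNDI or dependency injection.
public class OrderSubmitter {

    private final ConnectionFactory factory;
    private final Queue orderQueue;

    public OrderSubmitter(ConnectionFactory factory, Queue orderQueue) {
        this.factory = factory;
        this.orderQueue = orderQueue;
    }

    public void submit(String orderXml) throws Exception {
        Connection conn = factory.createConnection();
        try {
            Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(orderQueue);
            TextMessage msg = session.createTextMessage(orderXml);
            producer.send(msg);   // caller returns without waiting for processing
        } finally {
            conn.close();         // no conversational state is held between requests
        }
    }
}
```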

By mid-2007 I had started to think it would be interesting to center the second edition around these new scale out architectures, like those implemented by large scale Web sites such as Amazon.com, eBay, PayPal, Salesforce, BetFair, etc. Phil and I had a great opportunity to learn about what these companies were doing at HPTS in October of ’07. Unfortunately, it was impossible to identify sufficiently common patterns to do so, since each of the large Web sites has created a different way to implement a “scale out” architecture.

(BTW this was a fascinating conference, since the room was full of people who basically created the application servers, TP monitors, and relational databases successfully used in most “scale up” style TP applications. But they had to sit there and hear, over and over again, that these products just didn’t meet the requirements of large Web sites.)

Nonetheless we managed to fit everything in the book – how caching is done, replication, how replicas synchronize, descriptions of compensating and nested transactions, locking, two-phase commit, synchronous and asynchronous communications, RESTful transactions, and so on.
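For a taste of one of those topics, here is a stripped-down sketch of a two-phase commit coordinator. The Participant interface is a made-up stand-in for resource managers, and a real implementation would also log for recovery and handle timeouts and heuristic outcomes; the book covers those details properly.

```java
import java.util.List;

// Made-up abstraction standing in for a resource manager (database, queue, etc.).
interface Participant {
    boolean prepare();   // phase 1: vote yes (prepared) or no
    void commit();       // phase 2: make the work durable
    void rollback();     // phase 2: undo the work
}

// Simplified two-phase commit coordinator: no recovery log, timeouts, or
// heuristic handling, just the basic vote-then-decide structure.
class TwoPhaseCommitCoordinator {
    boolean execute(List<Participant> participants) {
        // Phase 1: ask every participant to prepare; any "no" vote aborts.
        for (Participant p : participants) {
            if (!p.prepare()) {
                for (Participant q : participants) {
                    q.rollback();
                }
                return false;    // transaction aborted
            }
        }
        // Phase 2: everyone voted yes, so tell everyone to commit.
        for (Participant p : participants) {
            p.commit();
        }
        return true;             // transaction committed
    }
}
```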

As in the first edition we have kept the focus on what’s really being used in production environments (with the help of our many generous reviewers). We completely rewrote the product information to focus on current Java and .NET products and TP standards.

And finally, we identify the future trends toward commodity data centers, cloud computing, solid state disks, and multi-core processors, among others, which are going to significantly influence the industry during the next decade or so.

One of the most interesting things I learned about in doing the book was how to design a transaction as a RESTful resource (see RESTful Web Services for more details). But once again, many of the familiar principles and concepts still apply – such as “pseudo-conversations” that have been used in TP applications for a long time to avoid holding locks during interactions with users.
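Here is a rough sketch of the idea using JAX-RS annotations: the transaction itself gets a URI when it is created, and committing or aborting it is just another state change on that resource. The paths, the in-memory state table, and the error handling are simplified placeholders, not the treatment in the book or in RESTful Web Services.

```java
import java.net.URI;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import javax.ws.rs.DELETE;
import javax.ws.rs.POST;
import javax.ws.rs.PUT;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.core.Response;

// Illustrative sketch of "a transaction as a RESTful resource."
@Path("/transactions")
public class TransactionResource {

    // Toy in-memory state table; a real implementation would use a durable store.
    private static final ConcurrentHashMap<String, String> state =
            new ConcurrentHashMap<String, String>();

    @POST
    public Response begin() {
        String id = UUID.randomUUID().toString();
        state.put(id, "active");
        // The new transaction is itself a resource the client can address.
        return Response.created(URI.create("/transactions/" + id)).build();
    }

    @PUT
    @Path("{id}/commit")
    public Response commit(@PathParam("id") String id) {
        // Committing is just another state change on the transaction resource.
        state.replace(id, "active", "committed");
        return Response.ok("committed").build();
    }

    @DELETE
    @Path("{id}")
    public Response rollback(@PathParam("id") String id) {
        state.remove(id);   // abandoning the resource aborts the work
        return Response.noContent().build();
    }
}
```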

Fascinating to me are the different assumptions people make in the two different worlds of the REST/HTTP “scale out” designs and the mainframe-derived “scale up” designs. This is likely to remain an area of continual evolution.

IASA NE gaining momentum – April meeting set for 23rd

After nearly a year, it is starting to look like IASA New England is gaining some momentum. For those of you in the area, please register here for the next meeting (April 23), at which we’ll be hearing about Intuit’s SaaS/cloud initiative from their QuickBase architect, Jim Salem.

Personally, I’m looking forward to hearing the details of their active-active load balancing…

I was sorry to have to miss the March meeting due to attending EclipseCon / OSGi DevCon, but I heard it went very well and that Hub did a great job.

At the meeting we also announced that Intuit and IBM were joining in and sponsoring the April and May meetings, respectively. This is excellent news for the NE architect community since it means we’ll have more support and access to additional great speakers for the meetings.

We also announced a panel discussion on cloud computing for June, a social event for July, and a regional event for this fall.  Things are really starting to fall into place!

I’m happy about this since it will be great to have an active community of software architects in the NE area. I personally always learn something new at the meetings and have a great time discussing topics with the other members.  Hope to see you on the 23rd!

IBM/Sun Post: I Forgot About Solaris

When I wrote about IBM’s potential interest in acquiring Sun to gain control of Java, I forgot about the Solaris factor. But this was mentioned in yesterday’s Times article about the acquisition, and I have seen it mentioned in other places as well.

The Solaris factor comes down to Red Hat and Linux. IBM sells a lot of Red Hat Linux. After Red Hat acquired JBoss in 2006 they began competing with IBM’s WebSphere division, which must have put a strain on their partnership around Linux. IBM started hedging its bets with Novell’s SUSE Linux, but owning Solaris would give IBM its own alternative to Red Hat Linux.

Add that to the potential for gaining control of Java and you have two pretty compelling reasons for IBM to acquire Sun.  Of course there are probably any number of other factors, but these strike me as the most strategic.

Starting up Modular IT Service

I wanted to get everything ready for EclipseCon / OSGi Dev Con next week, in case I could get some conversations started about OSGi adoption plans.

Of course I “prioritized” getting the Web site and email going (ok, those of you who just shouted “you mean procrastinated” get 10 points 😉) and didn’t investigate what Register.com meant by including a basic Web site in the domain name registration charge.

It turns out they force you to use their widget and designs, and limit you to a single page. The widget is not bad, but there are some funny things. I can’t figure out how to center the name under my photo, and when I tried to move the “blog” widget under the main text widget, the text disappeared.

One good thing, though: I could include links to my blogs and to my public LinkedIn profile, so I didn’t have to put too much on the page other than the purpose of the new service – which is to help organizations create strategic plans for improving product and application development, with a focus on adopting a more modular approach and using the OSGi Framework.

Hope to see you at the ‘con!

Maybe IBM wants control of Java

The hot topic of debate today is the breaking news story that IBM is in talks to acquire Sun. Dana Gardner doubts this, and a bunch of my FB and Twitter friends ask the obvious question in their status updates: Why the heck would IBM want to do this?

I haven’t seen anyone yet bring up the Java question. As co-chair of the OSGi EEG and a former 9-year employee of a Java vendor, I have seen the battles between Sun and IBM over control of the Java language up close. It has never been a pretty picture.

Recently I was asked about Jonathan Schwartz’s blog entries about Sun’s future direction and corporate strategy. The content of these entries has been subject to the usual praise and criticism, but I haven’t seen anyone talk about what’s so obviously and painfully missing – at least for someone active in the Java community and trying to push the ball forward (e.g. enterprise OSGi). Where is the talk about leading the Java community? Where is the talk about collaboration with IBM, Oracle, Progress, Tibco, and others? Where is the description of how helpful Sun is toward Apache’s Java projects (especially Harmony)?

IBM has ported many products onto the OSGi framework during the past several years, including flagship products such as WebSphere Application Server and Lotus Notes. Never mind the fragmentation in the Java community caused by the disagreement over SCA. What about Sun’s recent announcement that they were going to reinvent Java modularity in the OpenJDK project, all on their own, without input, and without regard to what happens to OSGi? What kind of potential change cost does that represent to IBM and all the other Java vendors who have ported products onto OSGi?
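For readers who have not worked with it, this is roughly what the smallest unit of that porting effort looks like: an OSGi bundle activator that publishes a service into the framework’s registry when the bundle starts and removes it when the bundle stops. The service registered here (a plain Runnable) is just a placeholder; a real bundle would also declare its package dependencies in MANIFEST.MF headers.

```java
import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;
import org.osgi.framework.ServiceRegistration;

// Minimal sketch of an OSGi bundle activator; the registered service is a placeholder.
public class Activator implements BundleActivator {

    private ServiceRegistration registration;

    public void start(BundleContext context) {
        // Publish a service into the framework's service registry.
        registration = context.registerService(
                Runnable.class.getName(),
                new Runnable() {
                    public void run() {
                        System.out.println("Hello from a modular bundle");
                    }
                },
                null);
    }

    public void stop(BundleContext context) {
        registration.unregister();   // clean up when the bundle is stopped
    }
}
```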

The potential acquisition of Sun has been debated so far mostly in terms of the business value Sun has – that is, in the context of where it is still making money, as if that were the main or only reason for an acquisition. But I say again, what about the unrealized potential for collaborative leadership in the Java community? Sun obviously isn’t paying attention to this, but IBM might be.

IASA Meeting Postponed!

Apologies, but the IASA NE meeting scheduled for tomorrow, Feb. 26, has been postponed. March 26 is the most likely new date. A formal notice of the rescheduled meeting will be available soon.

IASA NE Meeting next Thursday, Hub Vandervoort to speak

We’re continuing our monthly meetings for the IASA NE chapter next Thursday, Feb 26, hosted by Progress Software, with Hub Vandervoort speaking on event-driven architectures. Hope to see you there.

Hub’s presentation has been well received at recent industry conferences, and I’m looking forward to it. Also, when I spoke with him about it yesterday, we discussed the interactive format we’ve had at past IASA NE meetings, and he is definitely interested in doing Q&A/whiteboarding after the talk.

Registration, details, etc.

http://iasane.eventbrite.com/