Complexity in REST and SOAP

Ok, so let’s take a stab at the main question Pete raised in his comment to the previous blog entry, and which has generated some further comments. Hopefully this will generate further good and helpful discussion…
The main question, as I understand it, is what the use cases are for Web services as distinguished from the use cases for REST, and why you would use one or the other. A related question is when the added complexity of Web services is justified.
This is probably a bit of an oversimplification, but to me the best summary is that you use REST when you want to display XML data in a browser, and you use SOAP when you want to pass XML data to a program.
The situation regarding the use of Amazon.com’s Web services is that last year, 80% of the developers used the XML over HTTP version and 20% used the SOAP version. This makes sense because most of Amazon.com’s users are interested in displaying the data in a browser in virtual storefronts. However, Amazon.com includes both flavors in its “Web services” since they also have developers interested in passing the data to programs. (For the sake of clarity I’ll adopt Amazon’s current terminology (REST and SOAP Web services).)
One way to look at REST is that it basically means putting Web service requests into the URL.
This is the URL I get when I type “Skateboots” into Google.com’s search window:
http://www.google.com/search?hl=en&ie=UTF-8&q=%22Skateboots%22&btnG=Google+Search
Google basically takes my input to their Web form and appends it as parameters to their URL, with some other stuff about the character set and operation to be performed (i.e. “Search”), which executes their service and returns the results in my browser. Google also asks me if I meant “Skate Boots” (i.e. two words instead of one). This is the URL generated when I clicked on that link:
http://www.google.com/search?hl=en&ie=UTF-8&q=%22Skate+boots%22&spell=1
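To make the “parameters in the URL” idea concrete, here is a minimal Python sketch that takes the first Google URL apart and rebuilds the second one using only the standard library. The variable names are mine, and nothing here contacts Google; it only shows how the form input ends up encoded in the query string.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

# The search URL quoted above.
url = ("http://www.google.com/search"
       "?hl=en&ie=UTF-8&q=%22Skateboots%22&btnG=Google+Search")

parts = urlsplit(url)
# parse_qs decodes percent-escapes and '+' signs; each key maps to a list.
params = parse_qs(parts.query)
query_text = params["q"][0]      # the quoted search phrase
button = params["btnG"][0]       # the operation requested by the form

# Rebuild the "did you mean" URL from plain values.
# urlencode escapes the quotes and turns the space into '+'.
suggested_query = urlencode({
    "hl": "en",
    "ie": "UTF-8",
    "q": '"Skate boots"',
    "spell": "1",
})
suggested_url = f"{parts.scheme}://{parts.netloc}{parts.path}?{suggested_query}"
print(query_text)
print(suggested_url)
```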
The recently released (August 2004) Web services V4 from Amazon.com continues to provide both REST and SOAP requests for its Web services. Now, some people would argue that REST and Web services are different things, while others would argue that you could do everything with REST that you can do with Web services. To me it seems like different formats for essentially the same operations.
The difference seems to boil down to whether or not you want to use a browser to interpret the result data. If you do, then you might as well use the REST requests (i.e. place the parameters in the URL). However, if you are interpreting the data in a program, then you should probably use the SOAP requests since the parameters are carried within the XML message (instead of in the URL).
This is the format Amazon gives for its REST based Web services requests:
http://webservices.amazon.com/onca/xml?Service=AWSECommerceService
&SubscriptionId=[your subscription ID here]
&Operation=ItemSearch
&SearchIndex=Books
&Keywords=dog
&ResponseGroup=Request,Small
After you get a subscription ID (I am sorry but I did not go so far as to obtain one) you can simply type the request parameters into the address box on your Firefox (or other) browser, hit return, execute it, and view the XML results. Nothing could be simpler.
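For anyone who would rather assemble the request in code than type it into the address box, the following Python sketch builds the same URL from its parameters. The function name and the `EXAMPLE-ID` placeholder are hypothetical (I have no real subscription ID), and the sketch only constructs the URL; it does not send the request.

```python
from urllib.parse import urlencode

# Base address taken from the request format shown above.
BASE_URL = "http://webservices.amazon.com/onca/xml"


def build_item_search_url(subscription_id, keywords,
                          search_index="Books",
                          response_group="Request,Small"):
    """Return an ItemSearch request URL (hypothetical helper)."""
    params = [
        ("Service", "AWSECommerceService"),
        ("SubscriptionId", subscription_id),
        ("Operation", "ItemSearch"),
        ("SearchIndex", search_index),
        ("Keywords", keywords),
        ("ResponseGroup", response_group),
    ]
    # safe="," keeps the comma in "Request,Small" unescaped,
    # matching the format shown above.
    return BASE_URL + "?" + urlencode(params, safe=",")


# "EXAMPLE-ID" is a placeholder, not a real subscription ID.
request_url = build_item_search_url("EXAMPLE-ID", "dog")
print(request_url)
```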
But if you wanted to send the XML data to a program, possibly using another transport, the SOAP request format will do the job better since it’s designed for that, and everything the message needs is within the envelope (i.e. no dependence on transport features such as fixed interfaces and URL parameters). With the REST request it’s necessary to write code yourself for everything other than displaying the XML in the browser. And when you add features important to enterprise applications, such as security, reliability, and transactions, you also have to figure out how to write the code for that. The code could get pretty complex pretty quickly.
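As a rough illustration of “everything the message needs is within the envelope”, here is a Python sketch that wraps the same search parameters in a SOAP 1.1 envelope. The envelope namespace is the standard SOAP 1.1 one; the body element names and the `urn:example:ecommerce` namespace are assumptions made for illustration, not Amazon’s actual schema, which would come from their WSDL.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical application namespace; a real service defines its own in WSDL.
APP_NS = "urn:example:ecommerce"


def build_item_search_envelope(subscription_id, keywords, search_index="Books"):
    """Return a SOAP envelope carrying the search parameters in its body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The operation and its parameters travel inside the message,
    # not in the URL, so any transport could carry them.
    operation = ET.SubElement(body, f"{{{APP_NS}}}ItemSearch")
    for name, value in (("SubscriptionId", subscription_id),
                        ("SearchIndex", search_index),
                        ("Keywords", keywords)):
        ET.SubElement(operation, f"{{{APP_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="unicode")


# "EXAMPLE-ID" is a placeholder, not a real subscription ID.
envelope_xml = build_item_search_envelope("EXAMPLE-ID", "dog")
print(envelope_xml)
```

In practice a toolkit would generate this message from the WSDL, which is the point of the tooling discussion that follows.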
Now the next part of the discussion normally revolves around development tools, since they are pretty important in the use of SOAP (and WSDL for that matter). The basic question here is whether tools like XMLSpy, Visual Studio, WebLogic Workshop, or Artix Developer make life easier or harder. According to Jeff Webb, SOAP is harder if you want to change the auto-generated VB classes. If you just want to use the SOAP request unmodified, and take advantage of the generated code, however, then that is probably the simplest and most straightforward way to handle your XML.
And there’s the question of WSDL, which REST requests don’t have. Publishing a WSDL file on the Web ensures that your Web service becomes part of a larger “eco-system” since most Web services toolkits can import WSDL and automatically generate not only the SOAP message, but usually also the proxies and stubs to process it. Many tools are also available to generate WSDLs out of existing application metadata, including CORBA IDL, COBOL Copybooks, C++ and Java classes, EJBs, and COM objects, to name a few.
If you are already developing applications using an IDE such as Visual Studio or Eclipse, Web services support is already built in. And as more of the WS-* specifications gain adoption, they are added to the tools. For example, IONA’s Artix already supports WS-Security and automatically generates the SOAP headers for it. I think the principle of “as simple as possible, but no simpler” definitely applies, and the reverse is true too: as complex as necessary, but no more.
I think as the Web evolves “beyond the browser” it naturally encounters more complexity as it hits enterprise applications. It would be ideal if everything could be accomplished using simple HTTP interfaces and URL parameters. But the IT world beyond the Web needs reliability guarantees and transactional integrity for legal and business reasons, such as ensuring that stock trades are carried out in the order the messages were received and verifying the identity of someone who sends a large purchase order for tons of steel. And the IT world is not going to change – not for a long time. The Web needs to adapt to the IT world, it’s as simple as that, to incorporate some of the complexity already there.
Ok – so let’s have some comments, and a good discussion!

13 responses to “Complexity in REST and SOAP”

  1. “The difference seems to boil down to whether or not you want to use a browser to interpret the result data.”
    You’re a smart man Eric, so it’s particularly disappointing to hear you say things like that in this piece. I can only conclude that you simply haven’t given this issue a close enough look, which is unfortunate, since it’s important to your customers. Heck, even Dave Orchard is onside with me on the value of GET and URIs;
    “Read operations using a generic protocol (ie GET) hits the 90/10 point.”
    http://www.pacificspirit.com/blog/2004/09/28/web_services_needs_transfer_protocols_and_specific_protocols
    So please, let’s try to get past that and onto the more interesting questions which Dave asks (and implies) in that post. FWIW, here’s my response to him;
    http://www.markbaker.ca/2002/09/Blog/2004/09/29#2004-09-specific-vs-generic
    “The Web needs to adapt to the IT world, it’s as simple as that, to incorporate some of the complexity already there.”
    Mmm, ok, sure. But there are different ways to adapt, no? One in which the Web is respected as the largest and most successful distributed system in the history of Planet Earth and is suitably extended, and the other in which it’s used as a simple bit pipe; something that could have been done whether the Web had existed or not in the first place. The RESTful approach to extending the Web does the former, while SOA/WS does the latter.

  2. Remember that the “80% of amazon devs use REST” claim has to be coupled with “most amazon users will be developing sites with PHP”; REST and PHP go together nicely. Indeed, the fact that PHP is so popular yet free is why nobody has made strategic “investments” in SOAP stacks for it, unlike Java and .NET.
    The other nice thing about REST access to Amazon and Google is that a lot of the operations are just idempotent reads, the things GET was designed for.
    Now, if I had a need to write a client that would do a distributed TX, buying from amazon if I had enough money in my paypal account, I would be better off with SOAP, provided the services supported distributed transactions and there was a DTC somewhere for me. But since visa provide that kind of authentication in the background, I don’t need that. Which is good, as the tools are more complex. You need xmlspy or (my favourite) soapscope, you need to understand the quirks of your stack’s O/X binding, you need to work with lots of machine generated classes instead of XML. This may all be wrong, and even for “enterprise class” developers, a waste of time and money.
    The other thing SOAP is meant to provide is all that brokered/UDDI stuff. In many contexts dynamic discovery is appropriate. But Amazon aren’t interested in enabling frictionless brokers to talk to 20 different implementations of a shop service.
    So the question I have is “how much do we need to care about the wants and needs of enterprise apps?”. WS-RF may let you do SAP-over-XML, but unless you want to use SAP-like systems with better interop, why bother?

  3. mike champion

    I was going to warn you that Mark Baker was going to jump on that “whether you use a browser or a machine to interpret the data” bit 🙂 If the REST web service returns the data in an XML format with known semantics (e.g. RSS, or SOAP 1.2 via the HTTP GET binding) or some sort of semantic-enabled format such as RDF or XML with an associated ontology, then a machine can process it as well as it can SOAP. I agree with Mark that this is something we should all get past.
    I do think Eric’s point about the WSDL ecosystem is a very good one. That was the “scales fell from my eyes” moment for me: seeing the ability of off-the-shelf desktop software to import a WSDL description of a SOAP interface to some mainframe thingie, and having it Just Work. Sure, someone with Mark Baker’s skills could make this kind of thing work without SOAP/WSDL, but just about any reasonably script-literate person can do it with modern SOAP-enabled apps.
    So, one reason to use SOAP/WSDL is to enable all those handy-dandy RPC interfaces that are out there in the real world, even though we all know that they’re not great for serious work. But hey, WSDL-generated SOAP over HTTP is no more broken than POSTing a form and writing some script to process a response in a known format: it works fine if the gods smile on you, and horrible things happen if things time out, get repeated, etc.
    The main long-term value of the web services stack isn’t SOAP per se but all the stuff in WS-* that is layered on top of SOAP/WSDL. I’m sure we all agree that this is in the 10% rather than the 90% of Dave Orchard’s 90/10 calculus, BUT THAT IS THE WHOLE POINT. If you don’t need the complexity, i.e. your data is public, nothing bad happens if there is a failure, all operations are atomic rather than multipart transactions, then don’t mess with this stuff. Nobody I know of has publicly asserted that WS-* is the right way to do simple things. But if you are in the 10%, you either have to mess with WS-* or roll your own equivalent. I’ve yet to hear a remotely plausible argument from Mark or any other REST advocate that it is easier to do complex things via roll-your-own+REST than with WS-*. I think that’s what Eric has been saying all along!

  4. You know me too well, Mike. 😎
    But regarding simple vs. complex, I’ve seen others raise that same issue for ages, but I’ve just never understood it. I mean, there’s complexity of problem, and complexity of solution, and those are two very different and largely uncorrelated things. So when you say “If you don’t need the complexity[…]”, I cringe, because you’re talking about solution complexity there, not problem complexity. What kind of masochist would opt for a complex solution when a simple solution existed? 😎

  5. Mark,
    My apologies, you are of course right about the browser bit. I expressed it badly.
    I was thinking about the difference between an application whose purpose was to input and reformat data for display in a browser, and an application whose purpose was to input and reformat data to store in a database or other enterprise application.
    I also intend to post an entry with an XML example that should help frame some of the specific questions.

  6. mike champion

    Mark, I’m saying that there are a significant group of users (who ultimately pay my salary BTW) who have complex problems. The WS-* pile-of-stuff is designed to address those problems, and it is pretty complex too.
    Logically, there could be simple solutions to the complex problems. I haven’t heard anyone in the REST camp do more than assert this possibility: Y’all talk about problem domains such as Bloglines that don’t have the pain that WS-* claims to cure and for which REST is quite adequate and appropriate out of the box. [On the other hand this bit about GETs having the side effect of marking a feed as having been read is interesting … the restful choices are either POST (thus negating the scalability advantages of caching) or GET to retrieve the data then POST to update the status, thus requiring an extra round trip to a very busy server. But I digress … ]
    So, sure you can decompose complex multi-part, multi-protocol, secured transactions into a sequence of minimal interface operations. But should you? As an analogy, consider a huge DocBook-like technical manual. E. F. Codd proved that you *could* normalize it into relations and reconstruct the manual via the minimal set of operations in the relational calculus. Nobody actually does this with real-world RDBMS technology because the pain of those 100-way joins vastly outweighs any gains from the underlying simplicity of the relational approach.
    Or yet another analogy … complexity is like a balloon: you can squeeze it into all sorts of shapes, but the overall volume remains the same. In distributed applications, you can partition the complexity so it goes into the infrastructure so that the application level is simple (WS-*) or you can partition it so that the infrastructure is simple and the application has to pull it all together (REST). Which is best in a specific situation is a design decision.

  7. There are some philosophical questions here; for me, though, it’s more of a practical matter based on the tools I have available.
    The Microsoft Soap Toolkit (COM) doesn’t let me do operations asynchronously in Office Apps. I need to go to .NET to do that and add another layer of complexity (yuck).
    On the other hand, I can use the XML Object library and hook in to the DOMDocument ondataavailable event very easily.
    Plus, since I’m working in Office, I usually want XML anyway: it’s easier to display stuff through an XML Map rather than try to individually read and write object properties to a document.
    I’d be almost as happy with a SOAP interface that was less granular than Amazon’s.
    Looking back at some of the responses here, I can see that the difference might be where and how you plan on consuming the result. Apps such as browsers, Excel, Word, InfoPath, etc. that have built-in XML support may require a lot less code when working with REST interfaces.

  8. Regarding Steve’s comments about anonymous dynamic discovery, I think that’s an idea that never really got off the ground.
    On the interest in Web service enabling SAP etc. I think (at least among our customers) there’s a high level of interest in this. SAP is investing heavily in Web services within Netweaver for that reason. The complexity of implementing reliable messaging is pretty well understood in general, and has been accomplished for a variety of transports historically – again depending on the requirements of what you want to do determining the necessary complexity of the solution.
    B2B servers were deployed as XML transformation and transport engines in front of enterprise applications for the purpose of exchanging XML documents among trading partners. This can also be accomplished using Web services, once Web services support the typical B2B features such as reliable delivery, nonrepudiation, authentication, etc. (possibly also orchestration and addressing, depending on the solution).
    Also there seems to be a growing interest in hosting services for reuse either externally (aka “integration on demand” etc) or internally.
    At the WS-Arch meetings we had frequent statements from a large enterprise user that his company was interested in evaluating the replacement of their EDI system with Web services, basically for cost reasons. One big enabling requirement was reliable messaging (i.e. guaranteed delivery).

  9. Eric, I think you’re still missing the point on REST. It has nothing to do with what you intend to do with the data. The difference is whether you accept the HTTP verb set as your API and use it to operate on URI designated resources – sending and receiving XML – or you create additional APIs via WSDL/SOAP and use HTTP as a bit pipe.
    The turning point for me was when I reread Rob Pike and Dennis Ritchie’s paper on their Styx architecture for distributed systems. Using Styx the Inferno OS treats everything – everything down to the mouse – as a file system and all the Inferno components interact by reading and writing the files that populate the file system namespace. The HTTP verb set isn’t as clean as the Styx file system API but it gets the job done. For example, I’m working on a REST interface to the JMX instrumentation inside of Tomcat. Why? So that sys admins can monitor Tomcat instances via Perl scripts using LWP and XML modules rather than having to learn/use Java to do RMI and/or mess with SOAP::Lite.
    A single web app exposes Tomcat’s JMX implementation as a set of URI’s. For example, to create a new MBean you do a PUT on the MBeanServer URI specifying the MBean’s Java class and the desired ObjectName. To read an MBean’s attribute you do a GET on the MBean’s URI specifying the desired attribute.
    Granted this is not a “nasty enterprise integration” problem. It does, however, illustrate the REST approach in a vital program-to-program communication setting.
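
The PUT-to-create and GET-to-read pattern described in this comment can be sketched roughly as follows. This is a toy in-memory dispatcher written in Python, not the Tomcat/JMX interface itself; the `MBeanStore` class, the `/mbeanserver` path, and the `attribute` query parameter are all hypothetical names chosen for illustration.

```python
from urllib.parse import urlsplit, parse_qs


class MBeanStore:
    """Hypothetical in-memory stand-in for an MBean server."""

    def __init__(self):
        self.beans = {}  # object name -> dict of attributes

    def handle(self, method, url, body=None):
        """Dispatch on the HTTP verb and the resource URI; return (status, result)."""
        parts = urlsplit(url)
        query = parse_qs(parts.query)
        segments = [s for s in parts.path.split("/") if s]

        # PUT on the server URI creates a new bean resource.
        if method == "PUT" and segments == ["mbeanserver"]:
            name = body["objectName"]
            self.beans[name] = {"class": body["class"]}
            return 201, f"/mbeanserver/{name}"

        # GET on a bean URI reads one attribute, named in the query string.
        if method == "GET" and len(segments) == 2 and segments[0] == "mbeanserver":
            bean = self.beans.get(segments[1])
            attribute = query.get("attribute", [None])[0]
            if bean is None or attribute not in bean:
                return 404, None
            return 200, bean[attribute]

        # Any other verb or path is not part of this small interface.
        return 405, None


store = MBeanStore()
created = store.handle("PUT", "/mbeanserver",
                       {"objectName": "demo", "class": "com.example.Demo"})
fetched = store.handle("GET", "/mbeanserver/demo?attribute=class")
print(created, fetched)
```

The point of the sketch is that the client needs no interface beyond the HTTP verbs and the URIs: the same two operations would work from a Perl script, a browser, or any other HTTP client.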

  10. Mike Champion is no doubt correct that if you’re not in the 10% of customers that need all the fancy WS-Splat stuff you shouldn’t use it. It doesn’t necessarily follow that just because you’re not using it there is no cost. Speaking as someone who works for a vendor who builds Web/WS middleware I see the cost of all that complexity everyday. Even though only 10% of our customers use the transactions, reliable messaging, etc. we still have to implement it correctly before we can ship. As a result we have longer product cycles, a larger more complex product, and a steeper customer learning curve.
    You might respond that customers that don’t need all the bells and whistles provided by our fine product should just use Tomcat instead. Unfortunately in most large IT shops that’s a WS-NoSale. The conversation with the CIO goes something like, “we just spent N million dollars on this fine middleware so you’re going to use it!” So all of the IT shop’s projects pay a price because 10% of them deal with financials that require the WS-ComplexFeatures.

  11. Eric, I’m with Ward on this one. Your comments, particularly in reply to Mark, seem to show that you’ve missed the sell on REST. Let me point you to a particular website: http://www.rexx.com/~dkuhlman/quixote_index.html
    That’s the home of Dave Kuhlman’s investigations into how to use a particular Python web framework (Quixote, http://www.mems-exchange.org/software/quixote) in a RESTful way. Particularly, read through documents 9-12, where Dave demonstrates with code how to generate GUI applications whose backend is the Quixote web application server. The app server is completely free to persist its data anywhere you want, Oracle, whatever. Or do anything else you’d want it to do. Great stuff, I personally had my REST epiphany reading those articles.
    On t’other hand, Mike’s point is that WS-* hits the 10 out of the 90/10 split. Which is eminently valid. There are plenty of companies who need that kind of thing, know it, and are willing to spend the money on the infrastructure necessary to support it. I don’t work for one, my employer is a .org, and has a long way to go to get to the point of being able to manage that kind of IT infrastructure. As a hospital, the corporate IT emphasis is mostly on control, while the actual departments and research labs simply struggle to get their jobs done.
    Good tool support seems to be necessary to make the WS stack accessible. Not strictly, of course, it’s always possible to write the required XML docs and programs yourself, given enough time and energy, but I find it interesting that the WS/REST split also seems to coincide with a payware & tools vs. DIY & open-source split. Is that accurate? And if so, is that simply a reflection of the need for tools to tame WS’s complexity?
    Dave’s site, particularly document 12, http://www.rexx.com/~dkuhlman/fsmGenerate_howto.html, presents tools for generating (more-or-less, possibly subject to argument) REST application stubs from a set of XML documents you write. I find it interesting that I’ve seen little other software designed explicitly to enable REST-centric development. Anyone else have interesting pointers?

  12. To survive in this hart competition you must deliver something special. And that is what you definitely do. So go on like this, it’s really great.

  13. The DIY faction stands to gain credibility when infrastructure invested with complexity promotes a provider culture where the nature of the ‘solution’ becomes as remote as the ‘problem’; cost benefits aside, are there real, potential liabilities associated with declarative, attribute decorated approaches backed with turnkey assurances of ‘pay no attention to the man behind the curtain, we’ve got your back’?
    Peter,
    Spelling, not unlike the letter of the law, was never my strong suit; I stand corrected – literally 😉