Monthly Archives: July 2008

Abstraction and control in REST vs RPC

One of the biggest debates in the software industry is about getting the level of abstraction right. By this I mean a level of interaction with computers higher than binary code or machine language – in other words, anything that presents humans with a more natural or intuitive abstraction of a CPU’s instruction set and binary data storage format.

Computers are after all fundamentally stupid electrical systems that have to be told exactly what to do. At the end of the day, everything is just 1s and 0s – the bit is either on or off. But it is really hard for us humans to work with computers at that level, so we keep trying to make it easier for people to tell computers what to do. Getting the abstraction right is key to developer productivity, but it’s a constant struggle. Abstraction layers typically remove flexibility and control from one level in order to simplify things at the next.

Years ago you could hear developers saying they would never use COBOL because it was much too slow and verbose compared to assembler. Yet COBOL remains a very widely used language.

Not too long ago you heard the same complaint about relational databases – they were just too slow compared to hierarchical and network databases. And don’t forget the Web – I remember when we called it the “world wide wait” since it took so long to render a basic page.

In other words, it is sometimes more important to make it easier for people to tell computers what to do than it is to give them complete control. When we introduced high-level “English-like” language abstractions for the Digital business software product set on VMS in the late 80s (which included an RPC in the ACMS product), we endured constant complaints from developers who wanted more flexibility and control. In those days distributed computing programmers were used to the fine-grained control over remote conversations available in the dominant LU6.2 peer-to-peer protocol.

In fact at the time most transaction processing people thought we were nuts to base a TP monitor on RPC. They thought the only way to do distributed computing was to explicitly program it. We thought it was more important to make the developer’s life easier by abstracting the distributed programming steps into what we called the Task Definition Language (TDL). You still knew you were calling a program in another process, but you didn’t have to open the channel, establish the session, format the data, check that every send that needed one had a reply, etc. (ACMS is still in production in some pretty demanding environments.)
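To make the contrast concrete, here is a minimal Java sketch of the two styles. It is not TDL or ACMS code: `FakeChannel`, `ExplicitClient`, `Greeter`, and `GreeterStub` are hypothetical names invented for illustration, and the "remote" side is simulated in memory. The explicit client walks through each step by hand (open the channel, establish the session, format the data, check the reply), while the stub hides the same steps behind an ordinary method call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical in-memory stand-in for a remote endpoint; it records each protocol step. */
class FakeChannel {
    final List<String> log = new ArrayList<>();
    private boolean open;
    private boolean session;

    void open() {
        open = true;
        log.add("open");
    }

    void establishSession() {
        if (!open) throw new IllegalStateException("channel not open");
        session = true;
        log.add("session");
    }

    String send(String payload) {
        if (!session) throw new IllegalStateException("no session");
        log.add("send");
        // The simulated "remote program" simply upper-cases the request.
        return "REPLY:" + payload.toUpperCase(Locale.ROOT);
    }

    void close() {
        open = false;
        session = false;
        log.add("close");
    }
}

/** Explicit style: the caller programs every step of the conversation. */
class ExplicitClient {
    static String call(FakeChannel channel, String name) {
        channel.open();
        try {
            channel.establishSession();
            String request = "greet|" + name;       // format the data
            String reply = channel.send(request);   // send and wait for the reply
            if (reply == null || !reply.startsWith("REPLY:")) {
                throw new IllegalStateException("send did not get a valid reply");
            }
            return reply.substring("REPLY:".length());
        } finally {
            channel.close();
        }
    }
}

/** RPC style: the caller sees only an interface. */
interface Greeter {
    String greet(String name);
}

/** The stub performs the same steps as ExplicitClient, but out of the caller's sight. */
class GreeterStub implements Greeter {
    private final FakeChannel channel;

    GreeterStub(FakeChannel channel) {
        this.channel = channel;
    }

    @Override
    public String greet(String name) {
        return ExplicitClient.call(channel, name);
    }
}
```

A caller that writes `new GreeterStub(channel).greet("ada")` still knows it is talking to another process, because it had to supply a channel, but it gives up any say over how the conversation is conducted. That is the trade I am describing.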

This afternoon I finally caught up on Steve Vinoski’s recent article and blog entries about the “evils” of RPC. If you aren’t already among those who have read them thoroughly, I’d encourage you to. Including the comments, it’s one of the best discussions of the merits and demerits of RPC and REST that I’ve ever seen. The core of his argument is that the RPC abstraction is not helpful – in fact the opposite. Explicit programming is preferable when creating distributed applications.

As someone in the middle of designing another RPC based system (Distributed OSGi), though, I’d like to weigh in with a few thoughts. 😉

As I’ve said before, the distributed OSGi design does not really represent a new distributed computing system. The goal of distributed OSGi is to standardize how existing distributed software systems work with OSGi, extending the OSGi programming model to include remote services (today the standard only describes single JVM service invocations).

Because the design center for OSGi services is the Java interface, RPC or request/response systems are a more natural fit than asynchronous messaging. In fact because we are trying to finish up our work this year 😉 we have postponed tackling the requirement for asynchronous communication systems to the next version.

Anyway, after carefully reading the article and blog entries, I believe Steve is not against RPC per se. He wants people to think before just automatically using it because it’s convenient. He wants to be sure anyone involved in building a highly scalable, highly distributed system considers the superior approaches of REST and Erlang, which were designed specifically for those kinds of applications. Absolutely no argument there. I am a big proponent of using the right tool for the right job, and RPC is not always the right tool.

In the OSGi work, we acknowledged early on in the process that transparent distribution is a myth. We recognize that constraints may be imposed upon an OSGi service when it is changed from a local service to a remote service, including data type fencing, argument passing semantics, latency characteristics, and exception handling. All of this has been the subject of lively debate, but the benefits of introducing distributed computing through configuration changes to an OSGi framework far outweigh the liabilities.
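One of those constraints, argument passing semantics, is easy to show in a small Java sketch. This is not OSGi API and not the distributed OSGi design: `TagService`, `LocalTagService`, and `RemoteLikeTagService` are hypothetical names, and the "remote" call is only simulated by copying the argument through Java serialization, on the assumption that a real distribution provider would marshal arguments in some comparable way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

/** Hypothetical service interface; the same interface is used locally and "remotely". */
interface TagService {
    void tag(ArrayList<String> items);
}

/** Local implementation: it mutates the caller's list, since Java passes the reference. */
class LocalTagService implements TagService {
    @Override
    public void tag(ArrayList<String> items) {
        items.add("tagged");
    }
}

/**
 * Simulated remote proxy: the argument is serialized and deserialized before the
 * call, so the service works on a copy and the caller's list is never touched.
 */
class RemoteLikeTagService implements TagService {
    private final TagService target;

    RemoteLikeTagService(TagService target) {
        this.target = target;
    }

    @Override
    public void tag(ArrayList<String> items) {
        try {
            target.tag(copyBySerialization(items));
        } catch (IOException | ClassNotFoundException e) {
            // A failure mode that a purely local call never has to report.
            throw new IllegalStateException("simulated remote call failed", e);
        }
    }

    @SuppressWarnings("unchecked")
    private static ArrayList<String> copyBySerialization(ArrayList<String> in)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(in);
        }
        try (ObjectInputStream input =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (ArrayList<String>) input.readObject();
        }
    }
}
```

Calling `tag` on a `LocalTagService` leaves the caller's list one element longer. Calling it through `RemoteLikeTagService` leaves the list unchanged, even though the interface and the calling code are identical. A service that relies on mutating its arguments is exactly the kind that needs rethinking before it is made remote.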

In our case, therefore, making it easier for OSGi developers to create and deploy distributed services is more important than the loss of flexibility and control available when using local services only. The biggest cost of software development is still human labor, and providing helpful abstractions, including RPC, continues to be an important goal.

This isn’t to say anyone should blindly use RPC, or use RPC in place of REST where REST is more appropriate. It is simply to say that ease of use – making it easier for humans to do something like distributed computing – can sometimes be more important than technical concerns such as being too verbose or too slow.