Further discussion of REST principles

This page is published under the terms of the licence summarized in the footnote.

For an introduction to SOA, read SOA and the Bezos mandate.

For an introduction to REST, and a comparison with SOAP, read SOAP versus REST.

This paper is an older collection of supplementary notes and observations.

Some elements have been edited and adapted from the first article in an issue of the InfoQ magazine - REST / eMag Issue 12 - April 2014.

Contents

REST - recap

Name each resource using a URI

Link resources using URIs

Use only the operations in a standard web protocol/interface

Flexibility of representation (media/data type)

Communicate statelessly

Conclusions and remarks

REST - recap

The term REST was defined by Roy T. Fielding, formerly a designer of Web protocols including HTTP and URIs.

Roy formalized ideas behind these protocols in his Ph.D. thesis.

REST gives software designers a way to make use of what the Web is good at, especially hyperlinking.

A RESTful approach is different from RPC, Distributed Objects and SOAP.

It is not so much about asking a servant to do something for you as asking a servant to act on a remote resource.

Representational State Transfer means a server represents its state in messages using internet-friendly text data formats, which can include hyperlinks.

A resource is anything that can be named, that is, anything that can be identified by one or more Uniform Resource Identifiers (URIs). Examples include:

· a document or image,

· a temporal service (e.g. "today's weather in Los Angeles")

· a collection of other resources

· a chunk of related information, such as a user profile

· a collection of updates (activities)

· a global user ID (GUID)

· a non-virtual object (e.g. a person), and so on.

A so-called RESTful architecture contains RESTful client components.

Every resource of a software application (Web Service, web site, HTML page, XML document, printer, other physical device, etc.) is named as a distinct web resource.

A client component can only call a server component/resource using the operations available in a standard protocol - usually HTTP.

This decouples distributed components; it means a client component needs minimal information about the server resource.

A so-called REST-compliant architecture contains REST-compliant server components/resources.

A REST-compliant server component/resource can offer only the services named in an internet protocol.

Given there are fewer operations (verbs) per component, there must be more components (nouns).

One may have to divide a large data resource into many smallish elements (many nouns).

Clients must orchestrate or integrate those smallish resources.

The following sections include a discussion of REST principles, in addition to the one in the earlier paper.

Name each resource using a URI

Tim Berners-Lee’s Axiom 0 for Web Design is "All resources on the Web must be uniquely identified with a URI."

URIs are universal identifiers, cheap to create and use.

This globally-unified naming scheme helps both your personal use of the Web (via your browser) and communication between software components.

Using URIs to identify your resources means that each gets a unique identifier in a global namespace.

You don’t have to come up with your own naming scheme.

URIs work on a global scale, and are understood by most people.

Human-readable URIs are not a pre-requisite for a RESTful design – but they help.

Here are some URIs which identify individual items of interest:

· http://example.com/customers/1234

· http://example.com/orders/2007/10/776654

· http://example.com/products/4554

· http://example.com/processes/salary-increase-234.

Resources that merit a URI are often aggregate objects (containing more than a row in a database table).

For example, an order includes several order lines, an address, and many other attributes.

You can name persistent collections of interest as well as individual items.

· http://example.com/orders

· http://example.com/products.

You can name enquiry results as well as persistent lists.

· http://example.com/orders/2007/11 (identifies all orders submitted in November 2007)

· http://example.com/products?color=green (identifies the set of green products).
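As a rough illustration, the three kinds of URI above (individual item, persistent collection, enquiry result) differ only in their path and query parts.
The Python sketch below splits them using the standard urllib.parse module; the describe helper is a hypothetical name, not part of any framework.

```python
from urllib.parse import urlsplit, parse_qs

def describe(uri):
    """Split a URI into host, path segments and query parameters."""
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]
    query = parse_qs(parts.query)
    return parts.netloc, segments, query

# An individual item, a persistent collection, and an enquiry result.
item = describe("http://example.com/customers/1234")
collection = describe("http://example.com/orders")
enquiry = describe("http://example.com/products?color=green")

print(item)
print(collection)
print(enquiry)
```

All three live in the same global namespace; only the server decides whether a given URI names a stored item or a computed result.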

The idea is to uniquely identify everything of interest in a software application.

REST tends to create components you usually don’t see in a typical application design: a process, a process step, a sale, a negotiation, a request for a quote.

This, in turn, can lead to the creation of more persistent entities than in a non-RESTful design.

All the resources that your application uses are given URIs, be they individual items or collections, virtual or physical objects, or computation results.

Observations

Graham: A component, or resource, could be a Web Service, a web site, an HTML page, an XML document, a physical device.

The idea is that each of the many resources is identified by a URI.

So we don’t need any special infrastructure platform to allocate object identifiers; the DNS does that.

Ugo: Further, you don't have to know about all those resources ahead of time.

You need to know just one single resource to get the application started.

The rest you will discover along the way, given URIs returned by the first component.

Graham: So, a client doesn’t have to know anything about the overall structure in which components cooperate.

The name space for a RESTful application is (potentially) the whole internet.

Link resources using URIs

We humans navigate around web pages by following hyperlinks, without having to remember the URIs.

So, RESTful software navigates around web resources by following hyperlinks.

Roy Fielding promoted “Hypermedia as the engine of application state” (HATEOAS).

This fancy name means that an application can retrieve a resource containing hyperlinks, and use those links to access related resources.

Suppose a server returns the data structure defined in this XML fragment:

</order>

An application can follow the product and customer links in this document to retrieve more information.

The beauty of using URIs is that links can point to resources provided by a different application, server, or company on another continent.

In this way, all the resources that make up the Web can be linked to each other.
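A minimal sketch of how a client might pull the links out of an order representation follows.
The document shape is an assumption made for illustration: the element names, the ref attribute and the amount value are hypothetical, and the URIs are borrowed from the examples earlier in this paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical order representation; element and attribute names are assumed.
ORDER_XML = """
<order>
  <amount>23</amount>
  <product ref="http://example.com/products/4554"/>
  <customer ref="http://example.com/customers/1234"/>
</order>
"""

def extract_links(xml_text):
    """Return {element name: URI} for each child element that has a 'ref' attribute."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: child.attrib["ref"] for child in root if "ref" in child.attrib}

links = extract_links(ORDER_XML)
for name, uri in links.items():
    # A real client would now issue a GET on each URI; this sketch only prints them.
    print(name, "->", uri)
```

The client needs to know only the link convention of the media type; it does not need to know which server or company hosts the linked resources.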

A client doesn’t have to know anything about the overall structure of the system it is a component in.

It invokes operations on remote components/resources without knowing what machines they are hosted on.

You could say the name space for an application designed to REST principles is the whole internet.

Observations

Graham: I have a background in enterprise applications that process structured data.

We expect to use transaction management middleware for request-reply invocations, and to use reliable message delivery middleware for fire-and-forget invocations.

Many advocates of REST are more interested in web applications that process documents or media files.

Ugo: I think of a REST application developing according to the HATEOAS mechanism.

I cannot really tell (either in advance or after the fact) how many services are involved, what their contracts are and who is responsible for them.

Let alone find them in a “design-time service catalogue under the governance of one or more architects”.

Pretty much all I have for a REST application is the root entry point plus a bunch of media types that are possibly going to be used at runtime (and the hope that no additional media types are going to get involved).

That is pretty much the opposite of the SOA "command and control" approach of having well defined service contracts and well defined business processes known at design time.

Interestingly enough, I have started seeing REST advocates say that the concept of "service" is actually harmful to REST (see http://tech.groups.yahoo.com/group/rest-discuss/message/15134 and following conversations).

Use only the operations in a standard web protocol/interface

When you enter a URI into your browser’s address field and hit return — how does your browser know what to do with the URI?

It knows because every server-side resource is accessed via a standard web protocol/interface – such as HTTP.

HTTP verbs include GET, POST, PUT, DELETE, HEAD, and OPTIONS.

The meaning of these operations is defined in the HTTP standard, along with some guarantees about their behavior.

Idempotence

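HTTP is a request/response protocol, but it runs over networks that can lose messages.

There is no promise that a response will be returned to a request, so a client may need to repeat a request; this means you want the operations to be idempotent.

Idempotence means that repeating a request has the same effect as sending it once, so a repeat does no harm. The HTTP standard defines this property for these operations: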

· If you issue a GET request and get no reply, you can harmlessly issue the request again.

· If you issue a PUT request, it will create a resource at the given URI with the given data or, if the resource is there already, update it with the same data.

· If you issue a DELETE then repeat it, trying to delete something already deleted is harmless.

Since POST can mean “create a new resource” or invoke other arbitrary processing, it is neither safe nor idempotent.
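The difference can be sketched with a toy in-memory store.
This is an illustration under stated assumptions, not a real HTTP server: the ResourceStore class and its method names are hypothetical, and each method only mimics the effect the corresponding HTTP operation is meant to have on server state.

```python
import itertools

class ResourceStore:
    """Toy in-memory server state, keyed by URI path (illustrative only)."""

    def __init__(self):
        self.resources = {}
        self._ids = itertools.count(1)

    def get(self, path):
        # Safe: reads state and never changes it.
        return self.resources.get(path)

    def put(self, path, data):
        # Idempotent: create or overwrite; a repeat leaves the same state.
        self.resources[path] = data

    def delete(self, path):
        # Idempotent: deleting a missing resource changes nothing.
        self.resources.pop(path, None)

    def post(self, collection, data):
        # Not idempotent: every call creates another resource.
        path = f"{collection}/{next(self._ids)}"
        self.resources[path] = data
        return path

store = ResourceStore()
store.put("/users/42", {"name": "Ann"})
store.put("/users/42", {"name": "Ann"})   # retry after a lost reply
after_put = len(store.resources)          # still one resource

store.post("/users", {"name": "Bob"})
store.post("/users", {"name": "Bob"})     # retry creates a duplicate
after_post = len(store.resources)

store.delete("/users/42")
store.delete("/users/42")                 # second delete is harmless
after_delete = len(store.resources)

print(after_put, after_post, after_delete)
```

A client that gets no reply to a PUT or DELETE can simply try again; a client that gets no reply to a POST cannot tell whether a retry will duplicate the work.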

So, a client can always retrieve a representation of a resource (a rendering of it) using GET.

And because GET’s semantics are defined in the standard as “safe”, a client has no obligations when using it.

And since GET supports efficient and sophisticated caching, in many cases, a request does not have to reach the ultimate server.

If you’re used to a different design approach, these principles and restrictions may seem problematic, but consider this example.

Old style: many operations, and few component instances

Consider the following interface:

Asset Manager Component

listUsers()

addUser()

getUser()

updateUser()

removeUser()

findUser()

listLocations()

getLocation()

findLocation()

addLocation()

removeLocation()

updateLocation()

addAsset()

assignAsset()

If a client wants to consume these services, it needs to be coded against this particular interface.

The interface defines the component’s application-specific protocol.

REST style: few operations, but many components upon which to invoke those operations.

In a RESTful style, you have to get by with the generic interface that makes up the HTTP application protocol.

This means creating a whole universe of new resources thus:

>> example to be inserted

Notice that the specific operations of the old component have been mapped to the standard HTTP operations on several new components.

A GET on a URI that identifies a User resource is just as meaningful as a getUser operation on the old component.
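One possible mapping is sketched below in Python; it is offered as an illustration, not as the author's intended example.
The URI templates (/users, /users/{id}, /assets/{id}/assignee and so on) and the to_request helper are assumptions; other resource designs would serve equally well.

```python
# Hypothetical mapping from the old operations to (HTTP method, URI template).
ROUTES = {
    "listUsers":      ("GET",    "/users"),
    "addUser":        ("POST",   "/users"),
    "getUser":        ("GET",    "/users/{id}"),
    "updateUser":     ("PUT",    "/users/{id}"),
    "removeUser":     ("DELETE", "/users/{id}"),
    "findUser":       ("GET",    "/users?name={name}"),
    "listLocations":  ("GET",    "/locations"),
    "addLocation":    ("POST",   "/locations"),
    "getLocation":    ("GET",    "/locations/{id}"),
    "updateLocation": ("PUT",    "/locations/{id}"),
    "removeLocation": ("DELETE", "/locations/{id}"),
    "findLocation":   ("GET",    "/locations?name={name}"),
    "addAsset":       ("POST",   "/assets"),
    "assignAsset":    ("PUT",    "/assets/{id}/assignee"),
}

def to_request(operation, **params):
    """Translate an old-style operation call into an HTTP request line."""
    method, template = ROUTES[operation]
    return f"{method} {template.format(**params)}"

print(to_request("getUser", id=1234))
print(to_request("findLocation", name="London"))

# Fourteen specific operations collapse into four verbs on a handful of resource types.
methods = {method for method, _ in ROUTES.values()}
resources = {template.split("?")[0] for _, template in ROUTES.values()}
print(sorted(methods), len(resources))
```

The verbs stay fixed while the nouns multiply: each user, each location and each asset assignment becomes a resource with its own URI.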

Observations

Graham: I gather that in REST, clients and servers are loosely-coupled in a very particular sense.

They use only the universally-recognised operation names of the standard protocol via which they communicate.

A client of a web resource/component can use only the few operations named in an internet-friendly protocol.

Ugo: Once you choose a protocol and its operations, that's it; you cannot keep changing that.

REST is not limited to HTTP and its four main operations (GET, POST, PUT and DELETE). But in practice, that is what is used today.

Graham: REST not only restricts a client component to using nothing but the four HTTP operation names to invoke a server resource.

In a properly REST-compliant design, it also means a server resource can offer only four services (using the four HTTP verbs).

And this means we need many small components (many nouns).

Ugo: Yes, the restriction on the number of possible operations is compensated by a proliferation in the resources.

You might or might not like the large number of resources involved. But if you use proper HATEOAS (see below) that is probably not an issue.

Graham: But suppose somebody wants to design a server resource that offers more than four operation types?

They can add an operation type attribute to the invocation parameters, and so force clients to name their required operation.

Pretty soon, they have an anti-RESTful design, featuring a few big components (nouns) offering lots of services (verbs).

Ugo: Unfortunately that is what is happening in practice because so many people have no idea of what REST really is.

A client should use HTTP operations only in the way they were designed.

Many pseudo-REST users have narrowed down to using just one operation: say POST for everything, even where that subverts the HTTP operation semantics.

That is not REST, and is not SOAP either, so you get the worst of both worlds.

Graham: So why are they using REST? Is it simply to avoid the platform infrastructure and overheads required for SOAP, CORBA or whatever?

Ugo: Yes, in a lot of cases, plus the myth that REST is easier to do than SOAP or CORBA.

In reality, I think that REST done right can be more complex and less intuitive to regular programmers than SOAP.

But those are not the only reasons.

For more meaningful reasons for doing pseudo-REST (not true REST), take a look at HTTP://nordsc.com/ext/classification_of_HTTP_based_apis.html .

Flexibility of representation (media/data type)

How does a client know how to deal with the data it retrieves as a result of a GET or POST request?

The response contains metadata about the type of data in the HTTP Content-Type header.

Thus, HTTP allows a separation of concerns between invoking operations and handling the data returned.

So, a client that knows how to handle a particular data format can interact with all resources that can provide a representation in this format.

Using HTTP content negotiation, a client can ask for a representation in a particular format.

The result might be some company-specific XML format that represents customer information.

GET /customers/1234 HTTP/1.1

Host: example.com

Accept: application/vnd.mycompany.customer+xml

Say the client sends a different request, for a customer address in vCard format.

GET /customers/1234 HTTP/1.1

Host: example.com

Accept: text/x-vcard
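On the server side, content negotiation amounts to matching the Accept header against the formats a resource can render.
The sketch below is a simplified illustration: it honours q-values but ignores wildcards such as */*, and the negotiate and parse_accept helpers, the AVAILABLE table and the two renderers are all hypothetical.

```python
def parse_accept(header):
    """Parse an Accept header into (media type, q) pairs, best first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type, q = fields[0], 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                q = float(field[2:])
        entries.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(entries, key=lambda e: e[1], reverse=True)

def negotiate(accept_header, available):
    """Pick the first acceptable media type the resource can render, else None."""
    for media_type, q in parse_accept(accept_header):
        if q > 0 and media_type in available:
            return media_type
    return None  # a real server would answer 406 Not Acceptable

# Hypothetical renderers for the customer resource (output formats are illustrative).
AVAILABLE = {
    "application/vnd.mycompany.customer+xml":
        lambda c: f"<customer><name>{c['name']}</name></customer>",
    "text/x-vcard":
        lambda c: f"BEGIN:VCARD\nVERSION:3.0\nFN:{c['name']}\nEND:VCARD",
}

customer = {"name": "Ann Example"}
chosen = negotiate("text/x-vcard, application/xml;q=0.5", AVAILABLE)
print(chosen)
print(AVAILABLE[chosen](customer))
```

The same URI, /customers/1234, serves both requests; only the representation changes.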

By the way, API design is often driven by the idea that everything that is doable via a human UI should also be doable via the API.

And aligning Web UI and Web API can help to produce a better Web interface for both humans and other applications.

Observations

Ugo: Clients (unless they are human) do have to know something in advance.

They have to know the syntax and semantics of the media types returned by the RESTful servers.

Graham: Where you say media types, I think data types.

Ugo: In the case of HTTP, "media type" is exactly as specified in the HTTP 1.1 specification.

Good examples of media types are the ones associated with ATOM - see the standard at en.wikipedia.org/wiki.

These include the HTTP media type "application/atom+xml".

The client receives a resource representation from the REST server with a particular declared media type.

The client then can decide, based on that representation, based on its knowledge of that media type and based on its own application goals, what to do next.

Typically, the client will parse the representation document guided by the media type syntax.

Inside the document it will find the hyperlinks that, in its judgment, are relevant to its application goal, and it will then issue one of the 4 operations against one of those hyperlinks.

It will get another resource representation in a particular media type as a response.

And so on and so forth until the whole application goal of the client is satisfied.

Please note this could be a major obstacle in practical implementations.

The set of media types used could be open ended (in which case you are almost sure the interaction will fail in cases of unknown media types).

The client complexity will also be very high.

Think of a machine client trying to emulate a human browsing and interacting with the Web.

That is only possible if the client can rely on predefined media types, but even then, it is not a walk in the park.

Communicate statelessly

REST mandates that:

· A server holds no client context between requests

· Session state is held in the client, or in a database.

· The client sends a request when it is ready to transition to a new state.

· While requests are outstanding, the client is considered to be in transition.

· The representation returned contains links the client may use to initiate a new state-transition.

A server component should not have to hold state for any of the clients it communicates with - beyond a single request.

The reasons are:

· scalability: it is difficult to scale out server components that maintain client state.

· loose-coupling: the client is not dependent on talking to the same server component in two consecutive requests
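The points above can be sketched with two interchangeable server instances and a client that carries its own position in a paged list.
This is a toy model under stated assumptions: the Server class, the ORDERS list and the shape of the next link are hypothetical, and no real HTTP traffic is involved.

```python
ORDERS = [f"/orders/{n}" for n in range(1, 8)]  # shared data store (hypothetical)

class Server:
    """A server instance that keeps no per-client memory between requests."""

    def __init__(self, name):
        self.name = name

    def handle(self, path, offset=0, limit=3):
        # Everything needed to answer arrives in the request itself.
        page = ORDERS[offset:offset + limit]
        body = {"items": page, "served_by": self.name}
        if offset + limit < len(ORDERS):
            # The link carries the next application state for the client.
            body["next"] = {"path": path, "offset": offset + limit, "limit": limit}
        return body

servers = [Server("a"), Server("b")]

# The client holds the session state (the link to follow next) and may be
# routed to a different server instance on every request.
request = {"path": "/orders", "offset": 0, "limit": 3}
seen = []
turn = 0
while request is not None:
    response = servers[turn % len(servers)].handle(**request)
    seen.extend(response["items"])
    request = response.get("next")
    turn += 1

print(seen == ORDERS, turn)
```

Because neither instance remembers the client, requests can alternate between them, and adding a third instance would require no change to the client.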

Observations

Chris Britton: A key point in REST is that the server is stateless like a web browser.

We can create a seemingly stateful application by hiding state in the web page or in a cookie.

It needs fast network technology to make this a realistic way of building applications.

When you invoke a service in a fire-and-forget style, you need integration between the message passing middleware and transactions to ensure that one message and only one message is sent and received.

If the sending server aborts the transaction, in most circumstances you want the message to be aborted as well.

Conclusions and remarks

A RESTful HTTP approach to exposing operations is different from RPC, distributed objects and Web services; it takes some mind shift to really understand this difference.

Ugo Corda: I don't have any problem with people defining their own architectural principles.

What I object to is people hijacking the term REST for other purposes.

Graham: We could say the same about other terms:

“EA” is abused as a label for every conceivable design and/or management consulting activity.

“OO” and “SOA” often mean little beyond encapsulation of components behind interfaces.

The meanings of "loose-coupling", "asynchronous", even "component", "service" and "interface" are all disputable.

All these terms can be dangerous in a conversation between architects until they have spent 30 minutes discussing what they don't mean.

Ugo: How true! I have been fighting the SOA battle too.

But in the case of REST there is one advantage: it was defined by a single person, Roy Fielding.

There is a well-defined description (Roy's thesis). Roy’s many blog posts clarify the intent of his thesis.

And you can always count on Roy to say "shut up, you don't know what you are talking about" as soon as you stray from the true REST path.

Graham: For me, REST is more building regulations than architecture.

Dion Hinchcliffe says that WOA (he mostly means REST) enables systems to be built from the bottom-up, which it does of course.

Some beautiful systems may emerge from the bottom-up – by evolutionary development.

But you can’t call them architected if there is no architect, no architectural blueprint.

Some align REST with SOA.

They suggest a SOA be built using REST as the building regulations, instead of using enterprise middleware and/or Web Services and SOAP.

I recently spent several hours talking to a team leader on a huge (400 person) software project.

They are using a mix of technologies, including middleware in some places and REST in other places.

She didn't talk about technology choices (aside from saying Web Methods is not really an ESB).

Her concern is getting the modularity right, using Demeter's law among other ideas.

She reported they are finding that small components, each with four CRUD operations, seem about right.

She says they are implementing SOA using REST in some places.

Footnote: Creative Commons Attribution-No Derivative Works Licence 2.0 19/01/2019 16:47

Attribution: You may copy, distribute and display this copyrighted work only if you clearly credit “Avancier Limited: http://avancier.co.uk” before the start and include this footnote at the end.

No Derivative Works: You may copy, distribute, display only complete and verbatim copies of this page, not derivative works based upon it.

For more information about the licence, see http://creativecommons.org