Ted Dziuba Archive

The S in REST

18 August 2014

REST is a vast improvement over complex things like SOAP and CORBA, but I think we still have a way to go before we’ve reached simple. REST is an acronym for REpresentational State Transfer, and I think the “state” part of that acronym gives rise to a lot of incidental complexity as systems grow.

You can think of state as a combination of value and time, and in the RESTful case, the time dimension is almost always “now”. The trouble then comes from the absence of a coordinated notion of time.

Almost every program I write today depends on at least one RESTful service, and my program is just one component in an ensemble. As we develop systems that call systems that call systems, what are the odds that everybody participating in the ensemble has the same notion of time? What happens when systems have different notions of time? When we are talking about services in the small, it’s not much of a concern, but for networked systems in the large, conflating value and time into state makes our systems increasingly difficult to reason about.

System Consistency

Here’s a concrete example: let’s suppose we are developing an e-commerce system to process customer orders. We have a RESTful endpoint that looks something like this:

POST http://api.example.com/orders

{
  "product_id": 12345,
  "customer": {"first_name": "Ted",
               "last_name": "Dziuba",
               "shipping_address": "123 Main St",
               "credit_card": {"type": "visa",
                               "number": "0123456789012345"}}
}

The obvious problem with this is that we’re given a reference to the product, and not the product itself. Therefore, we need to de-reference product_id to know how much to charge the customer in our order processor:

GET http://api.example.com/product/12345

{
  "id": 12345,
  "title": "Apple iPad Air",
  "price_usd": 599.99
}

There’s a consistency problem: the price may have changed between the time that we received the order and the time we looked up the product. To patch it up, we attach the product price to the order:

POST http://api.example.com/orders

{
  "product": {"id": 12345,
              "price_usd": 599.99},
  "customer": "..."
}

Yet, this is not a robust solution. We must attach any state that we need post-transaction. If our analytics department needs to know the product category for their reporting, or if our accounting department needs to know cost-of-goods-sold for revenue calculations, we must attach these as well, as they may change over time. As these requirements grow, you end up sending more and more of the object graph attached to the order, just to maintain consistency for everything that happens after the fact.
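The patching described above amounts to snapshotting every field a downstream consumer might need at order time. Here is a minimal sketch of that, assuming an in-memory dictionary in place of the product service. The `catalog` and `place_order` names, the category, the cost, and the later price are all made up for illustration.

```python
import copy

# Hypothetical in-memory stand-in for GET /product/<id>. The category,
# cost and the later price change are made-up illustration values.
catalog = {
    12345: {"id": 12345, "title": "Apple iPad Air", "price_usd": 599.99,
            "category": "tablets", "cost_usd": 450.00},
}

def place_order(product_id: int, needed_fields: list[str]) -> dict:
    """Copy every field a downstream consumer needs into the order itself."""
    product = catalog[product_id]
    snapshot = {field: copy.deepcopy(product[field]) for field in needed_fields}
    return {"product": {"id": product_id, **snapshot}}

# Each new downstream requirement adds another field to this list.
order = place_order(12345, ["price_usd", "category", "cost_usd"])

# The product changes after the order was placed.
catalog[12345]["price_usd"] = 649.99

# Looking the product up "now" disagrees with what the order recorded.
price_by_reference = catalog[order["product"]["id"]]["price_usd"]
price_by_snapshot = order["product"]["price_usd"]
print(price_by_reference, price_by_snapshot)
```

The snapshot stays consistent, but only for the fields somebody remembered to list, which is why the list keeps growing.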

While this kind of solution may work in the small, consistency-patching complexity will actually grow faster than the system itself. Every new participant in the system must account for consistency semantics for every other service it depends on. If your system has N components, each of which depends on at least 1, and as many as N-1 other components, the lower bound of this complexity is linear, and the upper bound is quadratic.
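To make those bounds concrete, here is a rough count. It assumes each directed dependency is one consistency relationship somebody has to reason about; the function name is hypothetical.

```python
def dependency_bounds(n: int) -> tuple[int, int]:
    """Return (fewest, most) directed dependencies among n components.

    Assumes every component depends on at least 1 and at most n - 1 others.
    """
    if n < 2:
        raise ValueError("need at least two components")
    fewest = n * 1        # every component depends on exactly one other
    most = n * (n - 1)    # every component depends on all the others
    return fewest, most

for n in (2, 10, 100):
    low, high = dependency_bounds(n)
    print(f"{n} components: between {low} and {high} dependencies")
```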

REST Conflates Reference and Value

The root cause of this complexity is the “state” part of REST. If somebody hands me a resource URL, like http://api.example.com/product/12345, what can I safely do with it?

  • Can I save it to a database?
  • Can I send it to another service?
  • How do I know if the value changes?

Very rarely will the response for a RESTful URL carry an Expires header, and even when it does, we still cannot pass that URL downstream, since we do not know when downstream processors will fetch the value.

We could string something together using If-Modified-Since and timestamp passing, but this also is not a robust solution.

Representational Value Transfer (REVAT)

What we need is a RESTful way to disambiguate state and value, so that we can reason about resource URLs. I propose that we use existing REST semantics for indicating state and adopt a new conventional standard for indicating values:

GET http://api.example.com/values/5690ba7f-f308-4c32-b67c-56f654bbfd83

{
  "id": 12345,
  "title": "Apple iPad Air",
  "price_usd": 599.99
}

The salient points of a REVAT URL are:

  • REVAT values are immutable.
  • Values are identified by random UUIDs
    • No coordination is required to uniquely generate them
    • A non human-friendly identifier hints that this thing cannot be changed
  • A client may store new values in two ways:
    • POST http://api.example.com/values/ in which case, the server generates a new UUID to identify the value.
    • PUT http://api.example.com/values/<uuid> when the identifier is supplied by the client. Any attempt to PUT or POST to an existing value must fail with 409 Conflict.
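The storage rules above can be sketched as a write-once store. This is an in-memory illustration only, not a real HTTP server: the `ValueStore` class and its method names are hypothetical, and the `Conflict` exception stands in for a 409 Conflict response.

```python
import uuid

class Conflict(Exception):
    """Stands in for an HTTP 409 Conflict response."""

class ValueStore:
    """Write-once store: a value can be created but never replaced."""

    def __init__(self):
        self._values = {}

    def post(self, body: bytes) -> str:
        """POST /values/: the server picks a fresh random UUID."""
        return self.put(str(uuid.uuid4()), body)

    def put(self, value_id: str, body: bytes) -> str:
        """PUT /values/<uuid>: the client supplies the identifier."""
        if value_id in self._values:
            raise Conflict(value_id)  # existing values are never overwritten
        self._values[value_id] = bytes(body)
        return value_id

    def get(self, value_id: str) -> bytes:
        """GET /values/<uuid>: always returns the bytes originally stored."""
        return self._values[value_id]

store = ValueStore()
value_id = store.post(b'{"id": 12345, "price_usd": 599.99}')
try:
    store.put(value_id, b'{"id": 12345, "price_usd": 649.99}')
    overwritten = True
except Conflict:
    overwritten = False
print(value_id, overwritten)
```

Because the identifiers are random version 4 UUIDs, a client can generate one locally and PUT without asking the server for an identifier first.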

Unfortunately, there is no formal way to specify with HTTP headers that a resource is immutable, and therefore never-expiring. According to RFC 2616:

To mark a response as “never expires,” an origin server sends an Expires date approximately one year from the time the response is sent. HTTP/1.1 servers SHOULD NOT send Expires dates more than one year in the future.

This, however, is not a robust solution, so REVAT resource responses, in addition to setting an Expires header one year in the future, must also specify the header Immutable. This header has no associated value; its presence indicates that the body of the response is immutable and may be cached indefinitely.
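A server might build those headers roughly as follows. This is a sketch using only the Python standard library. The function name is hypothetical, "one year" is taken to mean 365 days, and Immutable is the header proposed here, not one defined by RFC 2616.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

def revat_headers(now: datetime) -> dict[str, str]:
    """Build caching headers for a REVAT response sent at `now` (UTC)."""
    expires = now + timedelta(days=365)  # assumption: one year = 365 days
    return {
        "Date": format_datetime(now, usegmt=True),
        "Expires": format_datetime(expires, usegmt=True),
        "Immutable": "",  # proposed header; presence alone carries the meaning
    }

headers = revat_headers(datetime(2014, 8, 19, 16, 17, 24, tzinfo=timezone.utc))
for name, value in headers.items():
    print(f"{name}: {value}".rstrip())
```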

Since the value is immutable, when you receive a REVAT URL, you may do anything you please with it, without any worry over consistency semantics:

  • You may store it indefinitely
  • You may pass it to other services
  • Everybody that sends and receives a REVAT URL is guaranteed to be talking about the same thing.

Think of a REVAT resource URL as a named value, like 𝛑. Nobody can change it and it’s impractical to pass around the fully-realized value, but we have a concise way to refer to it.

Interoperability with REST

REVAT is a complement to REST. We want to effect change in the world, so we need some notion of value-over-time, which we call state, and REST’s domain is that of state. However, as systems grow, we can significantly reduce complexity by reducing the “surface area” of the state.

HTTP has the semantics necessary to square REST with REVAT. A GET on a RESTful resource URL can simply respond with a 302 Found redirect to a REVAT URL, which represents the value of that resource at the time. It may be helpful to think of a RESTful GET as a pointer dereference.

1. A request for a RESTful resource:

GET /product/12345 HTTP/1.1
Host: api.example.com
Accept: application/json

2. Returns a pointer to a REVAT value:

HTTP/1.1 302 Found
Location: http://api.example.com/values/5690ba7f-f308-4c32-b67c-56f654bbfd83
Cache-Control: no-cache

3. Which we can then fetch at our leisure:

GET /values/5690ba7f-f308-4c32-b67c-56f654bbfd83 HTTP/1.1
Host: api.example.com
Accept: application/json

4. And the server always returns the same value:

HTTP/1.1 200 OK
Date: Tue, 19 Aug 2014 16:17:24 GMT
Expires: Wed, 19 Aug 2015 16:17:24 GMT
Immutable:
Content-Type: application/json

{
  "id": 12345,
  "title": "Apple iPad Air",
  "price_usd": 599.99
}

If we’re given a REVAT URL, we can safely pass it to downstream services without fetching the value.
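The four steps above can be simulated without a network. The sketch below uses a hypothetical in-memory `Server` class in place of api.example.com, returns `(status, location, body)` tuples in place of HTTP responses, and uses a made-up later price.

```python
import uuid

class Server:
    """Hypothetical in-memory stand-in for api.example.com."""

    def __init__(self):
        self.values = {}    # /values/<uuid> -> immutable body
        self.pointers = {}  # /product/<id>  -> current /values/<uuid>

    def set_state(self, path: str, body: dict) -> None:
        """Store a new immutable value and move the stateful pointer to it."""
        value_path = f"/values/{uuid.uuid4()}"
        self.values[value_path] = dict(body)
        self.pointers[path] = value_path

    def get(self, path: str):
        """Return (status, location, body) for a GET request."""
        if path in self.pointers:
            return 302, self.pointers[path], None
        if path in self.values:
            return 200, None, dict(self.values[path])
        return 404, None, None

server = Server()
server.set_state("/product/12345", {"id": 12345, "price_usd": 599.99})

# Steps 1 and 2: dereference the stateful URL once, keep only the value URL.
_, value_url, _ = server.get("/product/12345")

# The product changes later (649.99 is a made-up price).
server.set_state("/product/12345", {"id": 12345, "price_usd": 649.99})

# Steps 3 and 4: a downstream service fetching value_url sees what we saw.
status, _, body = server.get(value_url)
print(status, body["price_usd"])
```

Only the pointer under /product/12345 moved. The value URL handed downstream still resolves to the original body.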

This may seem scary at first, as we might ask “How do I know I’m looking at the most current value of the resource?” This is a legitimate question, and the answer is you don’t. However, we do know that if we receive a REVAT URL, the value we fetch from the server is the same value that upstream sent. While it may not be the most current value, every service in the processing chain is dealing with the same thing.

This makes our ensemble of components incredibly robust. For the vast majority of applications, we don’t actually need that much state. It’s reasonable to confine state to one small area of the system, and I’ve found that doing so makes systems far easier to reason about. When a system is easy to reason about, it is easy to change.

Immutability in the small facilitates mutability in the large.


Much of this is built on ideas from the Clojure community. I would highly recommend the following resources: