REST is a protocol-agnostic architectural style proposed by Roy Fielding in his dissertation (chapter 5 presents REST). It generalizes the proven concept of web browsers as clients in order to decouple clients from servers in a distributed system.
In order for a service or API to be RESTful, it must adhere to given constraints like:
Besides the constraints mentioned in Fielding's dissertation, Fielding clarified in his blog post "REST APIs must be hypertext-driven" that just invoking a service via HTTP does not make it RESTful. A service should therefore also respect further rules, which are summarized as follows:
The API should adhere to and not violate the underlying protocol. Although REST is used via HTTP most of the time, it is not restricted to this protocol.
Strong focus on resources and their presentation via media-types.
Clients should not have initial knowledge or assumptions on the available resources or their returned state ("typed" resource) in an API but learn them on the fly via issued requests and analyzed responses. This gives the server the opportunity to move around or rename resources easily without breaking a client implementation.
The Richardson Maturity Model is a way to apply REST constraints over HTTP in order to obtain RESTful web services.
Leonard Richardson divided applications into these 4 layers:
As the focus is on the representation of the state of a resource, supporting multiple representations for the same resource is encouraged. One representation could therefore present an overview of the resource state while another returns the full details of the same resource.
Note also that, given Fielding's constraints, an API is effectively RESTful only once the 3rd level of the RMM is implemented.
An HTTP request is:
An HTTP response is:
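HTTP verbs characteristics:

- Verbs whose requests carry a body: POST, PUT, PATCH
- Safe verbs: GET
- Idempotent verbs: GET (nullipotent), PUT, DELETE

| Verb   | Body | Safe | Idempotent |
|--------|------|------|------------|
| GET    | ✗    | ✔    | ✔          |
| POST   | ✔    | ✗    | ✗          |
| PUT    | ✔    | ✗    | ✔          |
| DELETE | ✗    | ✗    | ✔          |
| PATCH  | ✔    | ✗    | ✗          |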
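
The characteristics above can be encoded as plain data, for instance to decide whether a client may blindly repeat a request after a timeout. This is only a minimal sketch: `VerbTraits`, `VERBS` and `can_retry_blindly` are illustrative names, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbTraits:
    has_body: bool
    safe: bool
    idempotent: bool


# Values copied from the characteristics listed above.
VERBS = {
    "GET": VerbTraits(has_body=False, safe=True, idempotent=True),
    "POST": VerbTraits(has_body=True, safe=False, idempotent=False),
    "PUT": VerbTraits(has_body=True, safe=False, idempotent=True),
    "DELETE": VerbTraits(has_body=False, safe=False, idempotent=True),
    "PATCH": VerbTraits(has_body=True, safe=False, idempotent=False),
}


def can_retry_blindly(verb: str) -> bool:
    """Repeating an idempotent request leaves the server in the same state."""
    return VERBS[verb.upper()].idempotent
```

A client could call `can_retry_blindly("PUT")` before resending a request whose response was lost, and fall back to asking the user for non-idempotent verbs.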
Consequently, HTTP verbs can be compared to the CRUD functions:
Note that a PUT request requires clients to send the entire resource with the updated values. To partially update a resource, the PATCH verb can be used (see How to partially update a resource?).
Nothing stops you from adding a body to error responses to make the rejection clearer for clients. For example, a 422 (UNPROCESSABLE ENTITY) on its own is a bit vague: the response body should state why the entity could not be processed.
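
As an illustration, an error response with an explanatory body could be built as follows. The shape of the body (`message` and `errors` keys) and the function name are assumptions made for this sketch; no standard mandates them.

```python
import json
from typing import Dict, Tuple


def unprocessable_entity(errors: Dict[str, str]) -> Tuple[int, Dict[str, str], str]:
    """Build a (status, headers, body) triple explaining why validation failed."""
    # Hypothetical body layout: one human-readable reason per rejected field.
    body = json.dumps({"message": "Validation failed", "errors": errors})
    return 422, {"Content-Type": "application/json"}, body


status, headers, body = unprocessable_entity({"email": "must be a valid address"})
```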
Each resource must provide hypermedia to the resources it is linked to. A link is composed of at least:

- rel (for relation, aka name): describes the relation between the main resource and the linked one(s)
- href: the URL targeting the linked resource(s)

Additional attributes can be used as well to help with deprecation, content negotiation, etc.
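
A client can then navigate by relation name instead of hard-coding URLs. The sketch below assumes a HAL-like layout where links live under a `_links` member keyed by `rel`; the `blog` relation, the `title` field and the `find_link` helper are hypothetical.

```python
def find_link(resource: dict, rel: str) -> str:
    """Return the href of the link named `rel` (HAL-like `_links` member)."""
    try:
        return resource["_links"][rel]["href"]
    except KeyError:
        raise LookupError(f"resource exposes no '{rel}' link") from None


article = {
    "title": "Hello",
    "_links": {
        "self": {"href": "https://example.com/api/v1.2/blogs/123/articles/789"},
        "blog": {"href": "https://example.com/api/v1.2/blogs/123"},
    },
}

# The client only knows the relation name; the server stays free to move the URL.
blog_url = find_link(article, "blog")
```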
Cormac Mulhall explains that the client should decide which HTTP verb to use based on what it is trying to do. When in doubt, the API documentation should help you understand the available interactions with all hypermedia.
Media types help make messages self-descriptive. They act as the contract between clients and servers, so that they can exchange resources and hypermedia.
Although application/json and application/xml are quite popular media types, they do not carry much semantics. They just describe the overall syntax used in the document. More specialized media types that support the HATEOAS requirements should be used (or extended through vendor media types), such as:
A client tells a server which media types it understands by adding the Accept header to its request, for example:

    Accept: application/hal+json

If the server isn't able to produce the requested resource in such a representation, it returns a 406 (NOT ACCEPTABLE). Otherwise, it adds the media type in the Content-Type header of the response holding the represented resource, for example:

    Content-Type: application/hal+json
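
Server-side, this negotiation could be sketched as below. It is a simplification: only exact media types and the `*/*` wildcard are handled (not `application/*`), and `SUPPORTED`, `parse_accept` and `negotiate` are names made up for the example.

```python
SUPPORTED = ["application/hal+json", "application/json"]


def parse_accept(header: str) -> list:
    """Return the media types of an Accept header, most preferred first."""
    weighted = []
    for part in header.split(","):
        media_type, *params = [p.strip() for p in part.split(";")]
        if not media_type:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        weighted.append((q, media_type.lower()))
    # sorted() is stable, so equal weights keep the order of the header.
    ranked = sorted(weighted, key=lambda item: -item[0])
    return [media_type for q, media_type in ranked if q > 0]


def negotiate(accept_header: str, supported=SUPPORTED):
    """Return (status, headers); 406 when no requested representation exists."""
    for media_type in parse_accept(accept_header):
        if media_type in supported:
            return 200, {"Content-Type": media_type}
        if media_type == "*/*":
            return 200, {"Content-Type": supported[0]}
    return 406, {}
```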
A stateful server implies that client sessions are stored in storage that is local to a server instance (almost always in web server sessions). This becomes an issue when trying to scale horizontally. Suppose several server instances sit behind a load balancer: if a client is dispatched to instance #1 when signing in, but to instance #2 when later fetching a protected resource, the second instance will handle the request as an anonymous one, because the client session was stored locally in instance #1.
Solutions exist to tackle this issue (e.g. configuring session replication and/or sticky sessions), but the REST architecture proposes another approach: don't make your server stateful, make it stateless. According to Fielding:
Each request from client to server must contain all of the information necessary to understand the request, and cannot take advantage of any stored context on the server. Session state is therefore kept entirely on the client.
In other words, a request must be handled exactly the same way, regardless of whether it is dispatched to instance #1 or instance #2. This is why stateless applications are considered easier to scale.
A common approach is token-based authentication, especially with the popular JSON Web Tokens. Note that JWTs still have some issues though, particularly concerning invalidation and automatic extension of expiration (i.e. the remember me feature).
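
To make the idea concrete, here is a minimal sketch of a signed, self-contained token in the JWT compact format (HS256), written with the standard library only. It is an illustration, not a hardened implementation: a maintained JWT library should be preferred in practice. `SECRET`, `issue_token` and `verify_token` are hypothetical names, and the sketch assumes every server instance shares the same secret.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"change-me"  # assumption: the same secret is deployed on every instance


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str) -> str:
    mac = hmac.new(SECRET, signing_input.encode("ascii"), hashlib.sha256)
    return _b64(mac.digest())


def issue_token(subject: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Create a token whose claims (subject, expiry) travel with the client."""
    now = time.time() if now is None else now
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": subject, "exp": int(now + ttl_seconds)}
    payload = _b64(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}"
    return f"{signing_input}.{_sign(signing_input)}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if signature and expiry check out, else None."""
    now = time.time() if now is None else now
    try:
        header, payload, signature = token.split(".")
        if not hmac.compare_digest(signature, _sign(f"{header}.{payload}")):
            return None
        claims = json.loads(_unb64(payload))
    except (TypeError, ValueError):
        return None
    return claims if claims.get("exp", 0) > now else None
```

Because verification needs nothing but the token and the shared secret, any instance behind the load balancer can authenticate the request without a session lookup. The flip side is the invalidation issue mentioned above: a token stays valid until it expires.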
Using cookies or headers (or anything else) has nothing to do with whether the server is stateful or stateless: they are merely the means used to transport tokens (a session identifier for stateful servers, a JWT, etc.), nothing more.
When a RESTful API is only used by browsers, (HttpOnly and secure) cookies can be quite convenient as browsers will automatically attach them to outgoing requests. If you opt for cookies, be aware of CSRF: a nice way of preventing it is to have the clients generate and send the same unique secret value in both a cookie and a custom HTTP header.
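
The server-side check of this "same value in a cookie and in a header" technique could look like the sketch below. The cookie name `XSRF-TOKEN` and the header name `X-XSRF-TOKEN` are assumptions chosen for the example, as are the function names.

```python
import hmac
import secrets


def new_csrf_token() -> str:
    """Random value the client stores in a cookie and echoes in a custom header."""
    return secrets.token_urlsafe(32)


def is_csrf_safe(cookies: dict, headers: dict) -> bool:
    """Accept the request only if cookie and header carry the same secret."""
    cookie_value = cookies.get("XSRF-TOKEN", "")
    header_value = headers.get("X-XSRF-TOKEN", "")
    if not cookie_value or not header_value:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(cookie_value.encode(), header_value.encode())
```

A forged cross-site request will carry the cookie automatically, but the attacking page cannot read it and therefore cannot copy its value into the custom header.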
Last-Modified header

The server can provide a Last-Modified date header to the responses holding resources that are cacheable. Clients should then store this date together with the resource.
Now, each time clients request the API to read the resource, they can add to their requests an If-Modified-Since header containing the latest Last-Modified date they received and stored. The server then has to compare the request's header and the actual last modified date of the resource. If they are equal, the server returns a 304 (NOT MODIFIED) with an empty body: the requesting client should use the currently cached resource it has.
Also, when clients request the API to update the resource (i.e. with an unsafe verb), they can add an If-Unmodified-Since header. This helps deal with race conditions: if the header and the actual last modified date are different, the server returns a 412 (PRECONDITION FAILED). The client should then read the new state of the resource before retrying to modify it.
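
The two checks can be sketched as follows. The function names are hypothetical, `last_modified` is assumed to be a timezone-aware datetime, and the comparison uses "not newer than" rather than strict equality, which also covers a client sending a later date.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def http_date(moment: datetime) -> str:
    """Format a timezone-aware datetime as an HTTP date (always GMT)."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def _parse(value):
    if not value:
        return None
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # an unparsable date is treated as an absent header
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def conditional_read(request_headers: dict, last_modified: datetime):
    """Return (status, headers): 304 if the client's copy is still current."""
    since = _parse(request_headers.get("If-Modified-Since"))
    current = last_modified.replace(microsecond=0)  # HTTP dates have 1 s resolution
    status = 304 if since is not None and current <= since else 200
    return status, {"Last-Modified": http_date(current)}


def conditional_update(request_headers: dict, last_modified: datetime) -> int:
    """Return 412 if the resource changed after the date the client last saw."""
    since = _parse(request_headers.get("If-Unmodified-Since"))
    current = last_modified.replace(microsecond=0)
    if since is not None and current > since:
        return 412
    return 204  # the caller may now apply the update
```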
ETag header

An ETag (entity tag) is an identifier for a specific state of a resource. It can be an MD5 hash of the resource for strong validation, or a domain-specific identifier for weak validation.
Basically, the process is the same as with the Last-Modified header: the server provides an ETag header to the responses holding resources that are cacheable, and clients should then store this identifier together with the resource.
Then, clients provide an If-None-Match header when they want to read the resource, containing the latest ETag they received and stored. The server can now return a 304 (NOT MODIFIED) if the header matches the actual ETag of the resource.
Again, clients can provide an If-Match header when they want to modify the resource, and the server has to return a 412 (PRECONDITION FAILED) if the provided ETag doesn't match the actual one.
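
A possible shape for these two checks is sketched below. The function names are made up for the example, and header parsing is simplified: entity tags are split on commas, which would break for tags that themselves contain a comma.

```python
import hashlib


def strong_etag(body: bytes) -> str:
    """Quoted entity tag derived from the exact bytes of the representation."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def _candidates(header_value: str) -> list:
    return [tag.strip() for tag in header_value.split(",") if tag.strip()]


def _strip_weak(tag: str) -> str:
    return tag[2:] if tag.startswith("W/") else tag


def read_status(request_headers: dict, current_etag: str) -> int:
    """304 when If-None-Match lists the current ETag (weak comparison)."""
    header = request_headers.get("If-None-Match")
    if header is not None:
        tags = _candidates(header)
        if "*" in tags or _strip_weak(current_etag) in map(_strip_weak, tags):
            return 304
    return 200


def update_status(request_headers: dict, current_etag: str) -> int:
    """412 when If-Match is present but does not list the current ETag."""
    header = request_headers.get("If-Match")
    if header is not None:
        tags = _candidates(header)
        # If-Match uses strong comparison: weak tags never match.
        strong_match = not current_etag.startswith("W/") and current_etag in tags
        if "*" not in tags and not strong_match:
            return 412
    return 204  # the caller may now apply the update
```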
If clients provide both date and ETag in their requests, the date must be ignored. From RFC 7232 (sections 3.3 and 3.4):
A recipient MUST ignore If-Modified-Since/If-Unmodified-Since if the request contains an If-None-Match/If-Match header field; the condition in If-None-Match/If-Match is considered to be a more accurate replacement for the condition in If-Modified-Since/If-Unmodified-Since, and the two are only combined for the sake of interoperating with older intermediaries that might not implement If-None-Match/If-Match.
Also, while it's quite obvious that the last modified dates are persisted along with the resources server-side, several approaches are available with ETag.
A usual approach is to implement shallow ETags: the server processes the request as if no conditional headers were given, but at the very end only, it generates the ETag of the response it is about to return (e.g. by hashing it), and compares it with the provided one. This is relatively easy to implement as only an HTTP interceptor is needed (and many implementations already exist depending on the server). That being said, it's worth mentioning that this approach will save bandwidth but not server performance:
A deeper implementation of the ETag mechanism could potentially provide much greater benefits – such as serving some requests from the cache and not having to perform the computation at all – but the implementation would most definitely not be as simple, nor as pluggable as the shallow approach described here.
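
The shallow approach can be illustrated with a small wrapper around a request handler. This is a framework-free sketch: the handler signature (request headers in, `(status, headers, body)` out), `with_shallow_etag` and `get_article` are all assumptions made for the example.

```python
import hashlib
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str], bytes]
Handler = Callable[[Dict[str, str]], Response]


def with_shallow_etag(handler: Handler) -> Handler:
    """Wrap a GET handler: always run it, then hash the body it produced."""

    def wrapped(request_headers: Dict[str, str]) -> Response:
        # The full processing cost is paid on every request.
        status, headers, body = handler(request_headers)
        if status != 200:
            return status, headers, body
        etag = '"' + hashlib.md5(body).hexdigest() + '"'
        headers = {**headers, "ETag": etag}
        if request_headers.get("If-None-Match") == etag:
            return 304, headers, b""  # bandwidth saved, computation not
        return 200, headers, body

    return wrapped


@with_shallow_etag
def get_article(request_headers: Dict[str, str]) -> Response:
    # Stand-in for real work such as database queries and serialization.
    return 200, {"Content-Type": "application/json"}, b'{"title": "Hello"}'
```

Calling `get_article({})` returns a 200 with an `ETag` header; replaying the request with that value in `If-None-Match` returns a 304 with an empty body, even though the wrapped handler ran both times.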
HTTP is not RPC: what makes HTTP significantly different from RPC is that requests are directed to resources. After all, URL stands for Uniform Resource Locator, and a URL is a URI: a Uniform Resource Identifier. The URL targets the resource you want to deal with, and the HTTP method indicates what you want to do with it. HTTP methods are also known as verbs, so verbs in URLs make no sense. Note that HATEOAS relations shouldn't contain verbs either, as links target resources as well.
As PUT requests require clients to send the entire resource with the updated values, PUT /users/123 cannot be used to simply update a user's email, for example. As explained by William Durand in Please. Don't Patch Like An Idiot., several REST-compliant solutions are available:
- Use a PUT method to send an updated value, as the PUT specification states that partial content updates are possible by targeting a separately identified resource with state that overlaps a portion of the larger resource:

      PUT https://example.com/api/v1.2/users/123/email
      body:
        [email protected]

- Use a PATCH request that contains a set of instructions describing how the resource must be modified (e.g. following JSON Patch):

      PATCH https://example.com/api/v1.2/users/123
      body:
        [
          { "op": "replace", "path": "/email", "value": "[email protected]" }
        ]

- Use a PATCH request containing a partial representation of the resource, as proposed in Matt Chapman's comment:

      PATCH https://example.com/api/v1.2/users/123
      body:
        {
          "email": "[email protected]"
        }
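
On the server, the two PATCH flavours above could be applied roughly as follows. This sketch only supports JSON Patch `replace` operations on top-level members (no nested JSON Pointer paths, no other operations), and the function names and sample user data are hypothetical.

```python
import copy


def apply_replace_ops(resource: dict, operations: list) -> dict:
    """Apply JSON Patch `replace` operations on top-level members only."""
    patched = copy.deepcopy(resource)
    for op in operations:
        if op["op"] != "replace":
            raise ValueError(f"unsupported operation: {op['op']}")
        key = op["path"].lstrip("/")
        if key not in patched:
            # `replace` requires the target member to exist already.
            raise KeyError(f"cannot replace missing member: {op['path']}")
        patched[key] = op["value"]
    return patched


def apply_partial_representation(resource: dict, partial: dict) -> dict:
    """Shallow merge: members present in `partial` overwrite the stored ones."""
    return {**resource, **partial}


user = {"id": 123, "name": "Jane", "email": "old@example.com"}
via_json_patch = apply_replace_ops(
    user, [{"op": "replace", "path": "/email", "value": "new@example.com"}]
)
via_partial = apply_partial_representation(user, {"email": "new@example.com"})
```

Both calls produce the same updated user and leave the original dictionary untouched. The instruction-based variant is more verbose but makes the intent of each change explicit.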
Quoting Vinay Sahni in Best Practices for Designing a Pragmatic RESTful API:
This is where things can get fuzzy. There are a number of approaches:

- Restructure the action to appear like a field of a resource. This works if the action doesn't take parameters. For example an activate action could be mapped to a boolean activated field and updated via a PATCH to the resource.
- Treat it like a sub-resource with RESTful principles. For example, GitHub's API lets you star a gist with PUT /gists/:id/star and unstar with DELETE /gists/:id/star.
- Sometimes you really have no way to map the action to a sensible RESTful structure. For example, a multi-resource search doesn't really make sense to be applied to a specific resource's endpoint. In this case, /search would make the most sense even though it isn't a resource. This is OK - just do what's right from the perspective of the API consumer and make sure it's documented clearly to avoid confusion.
API is documented. Tools are available to help you build your documentation, e.g. Swagger or Spring REST Docs.
API is versioned, either via headers or through the URL:
https://example.com/api/v1.2/blogs/123/articles
                        ^^^^

https://example.com/api/v1.2/blogs/123/articles
                             ^^^^^     ^^^^^^^^

https://example.com/api/v1.2/quotation-requests
                             ^^^^^^^^^^^^^^^^^^

{
  ...,
  _links: {
    ...,
    self: { href: "https://example.com/api/v1.2/blogs/123/articles/789" }
    ^^^^
  }
}

{
  ...,
  _links: {
    ...,
    firstPage: { "href": "https://example.com/api/v1.2/blogs/123/articles?pageIndex=1&pageSize=25" }
    ^^^^^^^^^
  }
}