Why HTTP Isn't A Transport Protocol

Last modified 06:36, 31 Oct 2008

One approach that violates the REST style is to treat HTTP as a transport layer.

The arguments in favour of using HTTP as a transport layer vary from culprit to culprit.

Some are well aware that they are using an application layer as a transport layer but argue that this does what they want it to do. Often they see agnosticism to the layer below as an advantage. Agnosticism is a valid goal: it offers a useful layer of abstraction, and HTTP itself is agnostic to its transport layer for this very reason. But agnosticism is only useful if what you are being agnostic towards tolerates this and doesn't expect you to interact with it. Transport layers are built to offer this tolerance; application layers aren't. Replace the word "agnostic" with "ignorant" and see how good it sounds. Immediately the question arises: "Am I being ignorant of something I shouldn't care about, or something I should care about?"

On the other hand, some people think that HTTP is a transport layer. It's sometimes even expanded as "Hypertext Transport Protocol" rather than the correct "Hypertext Transfer Protocol". At first glance this is reasonable: we've something over there we want to get over here, or vice versa, and therefore we transport it. However, we do more than that when we transfer something.

In the vernacular, there is very little difference between "transfer" and "transport", though the former conveys a greater degree of attention to what is being dealt with than the latter. You might transport a colleague by giving her a lift into town. If you were transporting young children into town you would probably also want to make sure they were accompanied by you until such time as they were safely transferred to the care of another responsible adult. Another case where we often use the word "transfer" is in banking. I don't want my bank to merely transport my money to the correct branch; I want them to make sure it goes into the correct account, that the deposit is recorded and so on.

Similarly, HTTP doesn't just transport. Generally we consider that the job of TCP, which HTTP runs on top of, or of another underlying protocol. The most common example is SSL, which provides services beyond transport by encrypting the data, but which actually *is* intended to be used as a transport layer.

HTTP transfers rather than transports because it pays attention to what it is dealing with. Almost all the features of HTTP would be different if it were a transport protocol:

Verbs: If HTTP were a transport protocol it would probably have only GET and PUT (redefined to mean "transport that here" and "transport this there"). Alternatively it would have only POST (redefined to mean "transport this [possibly null] thing there and then transport that [possibly null] thing here").

Status codes: HTTP would really only need 3 codes:

200: IT WORKED

500: SORRY MY BAD

400: YOUR BAD

Resource Metadata: As a transport protocol HTTP wouldn't know anything about what something was, what it did or what it was for. Therefore there'd be no resource headers (indeed, the very concept of a resource in HTTP can't exist with just a transport protocol).

Control Metadata: As a transport protocol HTTP wouldn't have enough knowledge about something to say how long it can be cached, where it came from, whether there are other versions or anything else.

Representation Metadata: The protocol might still know, e.g., when a file was last modified, so we might still have these.

This is a very simple protocol. It can be, since it's a transport protocol built on a transport protocol (TCP, etc.) that does all the real work.

So what have we lost in turning HTTP into a transport protocol?

First, we lost some of our verbs. Maybe this is a good thing: REST wants a small set of verbs and we've just made it smaller. However, let's look at the nuances of each of HTTP's verbs:

GET: I want to look at that thing.
I'm not going to touch it, just look at it. You can rest assured I won't do anything with it.
If it's the same as it was the last time I looked at it, that's all I need to know.
If it's the same as it was the last time my friend looked at it, just tell him to tell me about it; he's closer, so that's less hassle for all of us.
I don't care if you just told me about it and I'm asking again. It's not like looking twice can do any harm, is it?
OPTIONS: What can I do with that thing?
POST: Take this thing and apply it to that thing.
This may do all manner of stuff to that thing or to other things; I promise nothing.
HEAD: What would it be like if I tried to look at that thing?
PUT: Replace that thing with this thing. You're free to do this in the manner that makes sense to you. I've told my friend here delivering this request that he shouldn't expect that thing to be the same as it used to be.
DELETE: Get rid of that thing.
TRACE: Testing, testing, 1, 2.
CONNECT: Let's do something else instead of HTTP.

There's quite a lot to this.

An important point is that all of these have well-defined guarantees and promises. GETting has certain guarantees about safety (a client is assured that it can't be doing anything harmful if it GETs, because it's only "looking").

If a client PUTs something then anyone along the route taken knows that what it used to know about the resource has changed. How it's changed is another matter (the server may have been "clever" in how it dealt with the PUT) but all past assumptions can be discarded.

POST can do anything and has to be treated as something that can do anything. Doing anything means it can be used to cover the job of GET, PUT and DELETE, but it doesn't have the guarantees and implicit information they have. It's our Swiss Army knife, but it's not as good as the other tools we have for the jobs they can do.
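
The guarantees described above can be written down as data that a client or intermediary acts on. The following is a minimal sketch in Python: the grouping of methods into "safe" and "idempotent" follows the HTTP specification, but the two helper functions and their names are hypothetical and exist only for illustration.

```python
# Which promises each verb makes. "Safe" means the client is only looking;
# "idempotent" means sending the request twice is meant to have the same
# effect as sending it once.
SAFE = {"GET", "HEAD", "OPTIONS", "TRACE"}
IDEMPOTENT = SAFE | {"PUT", "DELETE"}


def may_retry_blindly(method: str) -> bool:
    """Hypothetical helper: may this request be resent after a dropped
    connection without asking the user first?"""
    return method.upper() in IDEMPOTENT


def invalidates_cached_copy(method: str) -> bool:
    """Hypothetical helper: should anyone along the route discard what
    they knew about the resource once this request has succeeded?"""
    return method.upper() not in SAFE
```

POST fails the first check and passes the second, which is exactly the "I promise nothing" position: it can do the job of the other verbs, but nobody along the route can assume anything about it.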

Status Codes

Status codes break down roughly into "FYI", "That worked", "Do this instead", "Problem your end" and "Problem my end".

In finer detail they offer a lot. Instead of a single status code for "That worked, take a look at this" we have "That worked, take a look at this", "Okay, I made that for you", "Okay, I'll work on that later", "That worked", "That worked, here's the bit you care about".

That tells us a lot more than mere success.
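
As a rough illustration, the five groups correspond to the first digit of the status code, and the finer-grained success messages correspond to individual codes in the 2xx range. The sketch below uses Python's standard `http.HTTPStatus` enumeration; the `CLASSES` table and the `status_class` function are hypothetical names, and the pairing of this article's informal labels with first digits is an interpretation rather than anything the specification states in these words.

```python
from http import HTTPStatus

# This article's informal labels, keyed by the first digit of the code.
CLASSES = {
    1: "FYI",
    2: "That worked",
    3: "Do this instead",
    4: "Problem your end",
    5: "Problem my end",
}


def status_class(code: int) -> str:
    """Return the informal label for a status code in the range 100-599."""
    return CLASSES[code // 100]


# The success codes all say "That worked", but each adds its own nuance.
for status in (
    HTTPStatus.OK,
    HTTPStatus.CREATED,
    HTTPStatus.ACCEPTED,
    HTTPStatus.NO_CONTENT,
    HTTPStatus.PARTIAL_CONTENT,
):
    print(int(status), status.phrase, "->", status_class(status))
```

A client that only understands the first digit still behaves sensibly; one that understands the full code can tell "I made that for you" from "I'll work on that later".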

The differences between the redirect codes are often of even greater importance. In particular they tell us whether we should use the same verb or a different one (a server can do something based on a POST and then tell the client to GET something to see the results), whether we can always just look at the second URI or should keep checking the first one, and possibly how long we can use the second one in place of the first.
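
A sketch of how a client might act on those differences follows. It is a simplification under stated assumptions: the function name `follow_redirect` is hypothetical, only the five most common redirect codes are handled, and for 301 and 302 it adopts the widespread (permitted but not required) behaviour of turning a POST into a GET.

```python
from http import HTTPStatus
from typing import Tuple


def follow_redirect(method: str, status: int) -> Tuple[str, bool]:
    """Return (verb for the follow-up request, whether the new URI may
    permanently replace the old one in the client's records)."""
    method = method.upper()
    permanent = status in (
        HTTPStatus.MOVED_PERMANENTLY,   # 301
        HTTPStatus.PERMANENT_REDIRECT,  # 308
    )
    if status == HTTPStatus.SEE_OTHER:  # 303
        # "I dealt with that; go and look over there for the result."
        next_method = "HEAD" if method == "HEAD" else "GET"
    elif status in (HTTPStatus.TEMPORARY_REDIRECT, HTTPStatus.PERMANENT_REDIRECT):
        # 307 and 308: repeat the very same request at the new URI.
        next_method = method
    elif status in (HTTPStatus.MOVED_PERMANENTLY, HTTPStatus.FOUND):
        # 301 and 302: assumption, follow common practice and downgrade POST.
        next_method = "GET" if method == "POST" else method
    else:
        raise ValueError(f"not a redirect this sketch handles: {status}")
    return next_method, permanent
```

The POST-then-GET pattern mentioned above is the 303 branch: the server has acted on the POST and points the client at a resource it can safely look at.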

The "Problem your end" codes cover a lot of different types of problem, from "never heard of it" to "who are you to be asking?" Many are useful in the same way that good error messages are useful, but even more are useful in telling you what to do about something (e.g. that the item in question isn't there any more, or that you have to provide authentication details).

Finally, the "Problem my end" status may help you solve a problem, or know when things are likely to be fixed.

Control Metadata

Because HTTP isn't "dumb" about what it's transferring, it can tell you a lot about how to handle something ("it'll still be the same in 5 minutes' time"), whether you actually need it ("it's the same as it was last time you asked") or even "here's the version of it you want".
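
The first two of those messages map onto real headers: `Cache-Control: max-age=300` says "it'll still be the same in 5 minutes", and an `ETag` compared against the client's `If-None-Match` lets the server answer 304 for "it's the same as it was last time you asked". The following is a minimal server-side sketch; the function `respond_to_get` and its signature are hypothetical, and the comparison is deliberately simplified (a real `If-None-Match` header may carry a list of tags or `*`).

```python
def respond_to_get(request_headers: dict, current_etag: str, body: bytes):
    """Hypothetical handler returning (status, headers, body) for a GET."""
    headers = {
        "ETag": current_etag,
        # "It'll still be the same in 5 minutes' time."
        "Cache-Control": "max-age=300",
    }
    if request_headers.get("If-None-Match") == current_etag:
        # "It's the same as it was last time you asked": send no body.
        return 304, headers, b""
    return 200, headers, body


# First visit: the client has nothing, so it gets the full representation.
status, headers, body = respond_to_get({}, '"v1"', b"<p>foo</p>")

# Later visit: the client quotes the tag it was given and gets a 304.
status_again, _, body_again = respond_to_get(
    {"If-None-Match": headers["ETag"]}, '"v1"', b"<p>foo</p>"
)
```

Nothing here is needed to move bytes from A to B; all of it exists so that bytes which don't need to move can stay where they are.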

Representation Metadata

Tied to this, HTTP can tell you a lot about what sort of thing you've got. This is actually pretty vital: for commonly browsed pages, browsers need to know whether they've got a PNG, an HTML page or a type of file that they don't know what to do with and can only save. File extensions don't scale to a world full of different machines running different OSs with different set-ups. But it can also be useful beyond the bare requirements (e.g. when it was last saved, so you can do your part in the "I got it 5 minutes ago" exchange).
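
As a sketch of what "knowing what you've got" means in practice, a client can decide what to do from the `Content-Type` header alone, never looking at the URI's file extension. The example borrows the header parser from Python's standard `email` package, since HTTP headers share that syntax; the functions `media_type` and `choose_action` and the three action names are hypothetical.

```python
from email.message import Message


def media_type(content_type_header: str) -> str:
    """Extract the lower-cased 'type/subtype' part of a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type()


def choose_action(content_type_header: str) -> str:
    """Hypothetical browser-like decision based on the media type only."""
    kind = media_type(content_type_header)
    if kind == "text/html":
        return "render"
    if kind.startswith("image/"):
        return "display"
    # Something we don't know what to do with: all we can do is save it.
    return "save"
```

For instance, `choose_action("text/html; charset=utf-8")` picks "render" whether the URI ends in `.html`, `.php` or nothing at all.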

Resource Metadata

Here's a big one. What you get back from any given request may not be the only possible result. The web isn't a file system. What's at http://example.net/foo isn't a foo file, it's a foo resource. Now this may well be implemented by a simple file that always gets sent back, or a simple script that always says the same thing back, but if two people have different ways they want to look at foos they can ask for different ways of representing a foo – one gets a webpage in English and the other gets a sound file in French.

Of course the server needs to know what sort of things the clients like to deal with, and the clients can benefit from knowing that there are alternatives (intermediaries even more so, as this tells them whether what they've already seen will fit the bill or not).
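
In HTTP this exchange is carried by request headers such as `Accept-Language` (what the client likes) and the response header `Vary` (which tells intermediaries that the answer depended on that preference). Below is a simplified sketch of language negotiation: `parse_accept` and `negotiate` are hypothetical names, wildcards and language ranges such as `en-GB` matching `en` are ignored, and falling back to the first available representation is an assumption rather than required behaviour.

```python
def parse_accept(header: str) -> dict:
    """Parse e.g. 'fr;q=0.9, en;q=0.5' into {'fr': 0.9, 'en': 0.5}."""
    prefs = {}
    for item in header.split(","):
        parts = item.strip().split(";")
        value = parts[0].strip().lower()
        if not value:
            continue
        q = 1.0  # a preference with no explicit weight counts as 1.0
        for param in parts[1:]:
            name, _, raw = param.strip().partition("=")
            if name.strip().lower() == "q":
                q = float(raw)
        prefs[value] = q
    return prefs


def negotiate(accept_language: str, available: dict):
    """Pick the representation of a resource the client likes best.

    `available` maps a language tag to a representation. On a tie, or when
    nothing matches, the first entry in `available` wins.
    """
    prefs = parse_accept(accept_language)
    best = max(available, key=lambda lang: prefs.get(lang, 0.0))
    headers = {
        "Content-Language": best,
        # Tell caches that a different Accept-Language may get a different answer.
        "Vary": "Accept-Language",
    }
    return headers, available[best]


# One foo resource, two ways of representing it.
foo = {"en": "a webpage in English", "fr": "a sound file in French"}
headers, representation = negotiate("fr;q=0.9, en;q=0.5", foo)
```

The `Vary` header is what lets an intermediary decide whether what it has already seen will fit the bill for the next client.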

All of this is a lot more than simple transport.

It's got meaning, it's got control, it's got safety, it's got efficiency, and it's got checks that ensure that efficiency isn't due to cutting corners that shouldn't be cut.

We don't just get something from A to B. We get something from A to B in the right way, for the right thing, as quickly as possible, and with the degree of care that is required.
