HTTP is not simple

This is a repost promoting content originally published elsewhere.

HTTP/1 may appear simple because of several reasons: it is readable text, the most simple use case is not overly complicated and existing tools like curl and browsers help making HTTP easy to play with.

The HTTP idea and concept can perhaps still be considered simple and even somewhat ingenious, but the actual machinery is not.

[goes on to describe several specific characteristics of HTTP that make it un-simple, under the headings:

  • newlines
  • whitespace
  • end of body
  • parsing numbers
  • folding headers
  • never-implemented
  • so many headers
  • not all methods are alike
  • not all headers are alike
  • spineless browsers
  • size of the specs

]

I discovered this post late, while catching up on posts in the comp.infosystems.gemini newsgroup, but I’m glad I did because it’s excellent. Daniel Stenberg is, of course, the creator of cURL and so probably knows more about the intricacies of HTTP than virtually anybody (including, most likely, some of the earliest contributors to its standards), and in this post he does a fantastic job of dissecting the oft-made argument that HTTP/1 is a “simple” protocol, a claim usually based on the reasoning that “if a human can speak it over telnet/netcat/etc., it’s simple”.

This argument, of course, glosses over the facts that (a) humans are not simple, and the things that we find “easy”… like reading a string of ASCII representations of digits and converting it into a representation of a number… are not necessarily easy for computers, and (b) the ways in which a human might use HTTP/0.9 through HTTP/1.1 are rarely representative of the complexities inherent in more realistic “real world” use.
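To make point (a) a little more concrete, here is a minimal Python sketch of the "reading digits" problem, using the Content-Length header as the example. The function names and the `max_value` cap are my own inventions for illustration, not anything from Daniel's post or from cURL; the only outside fact I'm leaning on is that the HTTP specification defines a Content-Length value as one or more ASCII digits and nothing else. The lenient version shows how a general-purpose integer parser happily accepts things a human would never think to type.

```python
import re

# ASCII digits only. Note that \d in a str pattern would also match
# non-ASCII decimal digits, which is exactly the trap being illustrated.
_ASCII_DIGITS = re.compile(r"[0-9]+")


def parse_content_length_lenient(value: str) -> int:
    """Hypothetical 'obvious' parser: hand the string to int()."""
    return int(value)


def parse_content_length_strict(value: str, max_value: int = 2**63 - 1) -> int:
    """Hypothetical strict parser: one or more ASCII digits, nothing else.

    max_value is an assumed sanity cap chosen for this sketch.
    """
    if not _ASCII_DIGITS.fullmatch(value):
        raise ValueError(f"not a valid Content-Length: {value!r}")
    number = int(value)
    if number > max_value:
        raise ValueError(f"Content-Length too large: {value!r}")
    return number


if __name__ == "__main__":
    # All of these are accepted by the lenient parser...
    for odd in ["+42", " 42 ", "1_000", "\u0664\u0662"]:
        print(repr(odd), "->", parse_content_length_lenient(odd))

    # ...and all of them are rejected by the strict one.
    for odd in ["+42", " 42 ", "1_000", "\u0664\u0662", ""]:
        try:
            parse_content_length_strict(odd)
        except ValueError as error:
            print("rejected:", error)
```

Two implementations that disagree about which of those inputs are valid can end up disagreeing about where a message body ends, which is why something this small is worth getting exactly right.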

Obviously Daniel’s written about Gemini, too, and I agree with some of his points there (especially the fact that the specification intermingles the transfer protocol and the recommended markup language; ick!). There’s a reasonable rebuttal here (although it has its faults too, like how it conflates the volume of data involved in the encryption handshake with the processing overhead of repeated handshakes). But now we’re going way down the rabbit hole and you didn’t come here to listen to me dissect arguments and counter-arguments about the complexities of Internet specifications that you might never use, right? (Although maybe you should: you could have been reading this blog post via Gemini, for instance…)

But if you’ve ever telnetted into an HTTP server and been surprised at how “simple” it was, or just have an interest in the HTTP specifications, Daniel’s post is worth a read.