HTTP/1 may appear simple for several reasons: it is readable text, the simplest use case is not overly complicated, and existing tools like curl and browsers help make HTTP easy to play with.
The HTTP idea and concept can perhaps still be considered simple and even somewhat ingenious, but the actual machinery is not.
[goes on to describe several specific characteristics of HTTP that make it un-simple, under the headings:
newlines
whitespace
end of body
parsing numbers
folding headers
never-implemented
so many headers
not all methods are alike
not all headers are alike
spineless browsers
size of the specs
]
I discovered this post late, while catching up on posts in the comp.infosystems.gemini newsgroup,
but I’m glad I did because it’s excellent. Daniel Stenberg is, of course, the creator of cURL and so probably knows more about the intricacies of HTTP
than virtually anybody (including, most-likely, some of the earliest contributors to its standards), and in this post he does a fantastic job of dissecting the oft-made claim that HTTP/1 is a “simple” protocol, usually based on the argument that “if a human can speak it over telnet/netcat/etc., it’s simple”.
This argument, of course, glosses over the facts that (a) humans are not simple, and the things that we find “easy”… like reading a string of ASCII representations of digits and
converting it into a representation of a number… are not necessarily easy for computers, and (b) the ways in which a human might use HTTP 0.9 through 1.1 are rarely representative of
the complexities inherent in more-realistic “real world” use.
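To pick on just one of those headings: “parsing numbers” is exactly the sort of thing that’s trivial for a human and perilous for a machine. Here’s a minimal Python sketch of my own (not from Daniel’s post) showing how a naive parser and a spec-strict parser can disagree about a Content-Length header – and two parsers disagreeing about a length is precisely the kind of thing request-smuggling attacks exploit:

```python
import re

def strict_content_length(value: str) -> int:
    # RFC 9110 defines Content-Length as 1*DIGIT: ASCII digits only.
    # No sign, no surrounding whitespace, no underscores, no hex.
    if not re.fullmatch(r"[0-9]+", value):
        raise ValueError(f"invalid Content-Length: {value!r}")
    return int(value)

# Python's int() is far more forgiving than the header grammar allows:
for candidate in ["42", "+42", " 42 ", "4_2"]:
    lenient = int(candidate)  # accepts all four of these
    try:
        print(f"{candidate!r}: OK ({strict_content_length(candidate)})")
    except ValueError:
        print(f"{candidate!r}: rejected by strict parser (int() happily said {lenient})")
```

Only the first of those four values is legal on the wire; a front-end proxy and a back-end server that disagree about the other three are a security incident waiting to happen.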
Obviously Daniel’s written about Gemini, too, and I agree with some of his points there (especially the fact that the
specification intermingles the transfer protocol and the recommended markup language; ick!). There’s a reasonable rebuttal here (although it has its faults too, like how it conflates the volume of data involved in the
encryption handshake with the processing overhead of repeated handshakes). But now we’re going way down the rabbit hole and you didn’t come here to listen to me dissect
arguments and counter-arguments about the complexities of Internet specifications that you might never use, right? (Although maybe you should: you could have been reading this blog post via Gemini, for instance…)
But if you’ve ever telnet’ted into an HTTP server and been surprised at how “simple” it was, or just have an interest in the HTTP specifications, Daniel’s post is worth a read.
I’m writing up this information because Dreamwidth’s legal advocacy work has been largely underrecognized. I have not seen a scrap of mainstream news coverage out there that delves
into the unique role that Dreamwidth has played in the NetChoice lawsuits, and in a tech news landscape that inspires so much resignation and despair, I think people deserve to know
about how Dreamwidth is putting up a fight. Not only does Dreamwidth refuse to engage in intrusive tracking, it’s proactively participating in lawsuits against state governments
that try to force its hand. For all that many lawmakers are trying to make the web worse, Dreamwidth is leveraging itself as proof that a better web is possible.
…
If your mental model of Dreamwidth is “it’s like LiveJournal, but…” then you owe it to yourself to read Coyote’s excellent explanation of how
Dreamwidth is so much more: a beacon of privacy-centric and censorship-resistant blog hosting in a world that increasingly seems at-best uninterested in and at-worst actively hostile to such things.
That it’s not for me personally (I’m more a selfhost type) doesn’t mean it’s not a great choice for you: it’s got solid free and reasonably-priced premium tiers and all the kinds of
features you’d expect from a service like LiveJournal, or Tumblr, or Medium… but without all of the antifeatures that come with each of those.
And yeah, they’re on the side of the good guys:
I think there’s mileage in stealing (sorry: repurposing) this iconic line…
Coyote also wrote the excellent You Can Make A Website, if you’re looking for further reading from
the same author.
A couple of weeks ago I blogged about setting up a PO Box and adding postal mail to the ways you can
contact me. I went for a “pay as you go” PO Box because I didn’t know if anybody would actually use it, but I’ve already received two delightful postcards and I couldn’t be more
thrilled.
The PO Box worked very well: I’m using UK Postbox principally because of their “pay as you go” rate (with a free tier in case you don’t receive
any mail at all, which I figured was a risk) but I was later pleased to discover they’re a nice company in other ways, too. They scan the outside/one side of my mail as it arrives, and I can optionally pay to have the whole thing scanned and/or bundled and forwarded on to me3.
I’ve started a new page to collect all the cards, including a (hopefully pretty-accessible) CSS-powered interactive “flipper” so you can turn them over, and
I’m hopeful that I might attract a few more as time goes on. Getting physical mail from “Internet friends” helps make the digital world feel a little bit smaller, and I
love it.
1 Florence’s RSS feed was missing a <![CDATA[ ... ]]> block around some
embedded HTML, which was causing the HTML to be parsed “as if” it were XML (which, not being XHTML, it failed to do).
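For the curious, here’s a tiny Python illustration of that failure mode, using hypothetical feed content (obviously not Florence’s actual markup): an unclosed HTML tag like <br> is fatal to an XML parser unless it’s wrapped in CDATA:

```python
import xml.etree.ElementTree as ET

html = "<p>Some post content<br>with embedded HTML</p>"  # <br> is fine in HTML, fatal in XML

broken = f"<description>{html}</description>"
fixed = f"<description><![CDATA[{html}]]></description>"

try:
    ET.fromstring(broken)
except ET.ParseError as e:
    print(f"without CDATA: {e}")   # the unclosed <br> kills the XML parser

print(ET.fromstring(fixed).text)   # with CDATA, the HTML survives as opaque text
```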
2 My suggestion was a variation of Derek Dingle’s Too Many Cards that I’ve been performing all over the place: it’s an immensely satisfying trick to perform, requiring a challenging but achievable set of sleights and
suitable to do without preparation and using a borrowed deck, which is pretty much the gold standard in card magic.
3 I’ve opted to have it forwarded: I’m wondering if I can combine all the postcards I get
into a single poster frame or something: maybe a double-sided one so the whole thing can be flipped to show the text, not just the fronts?
Subject: “Re-Design and Promotion Strategy for Dead.Garden”
Subject: “About your Dead.Garden”
Subject: “Errors in your Dead.Garden”
Dear Dead,
your website is not good enough, in fact, it is actively bad.
Don’t you know that you need Search Engine Optimization?
What are you, some kind of idiot?
Your site is currently ranked on page 1,000,000 of Google,
and if we know anything (in fact, we know everything),
this means that you are wasting not only your time,
but much more importantly
money.
We’ve had a quick look at your site
and noticed a few areas that could be improved.
We’ve discovered that your website’s UI is,
frankly,
complete ass.
Your mobile experience is bad, your CTAs should be shinier and rounder;
Maybe put a gradient here and there.
How are you ever going to get someone to buy your product
without manipulating their behaviour?
You’re not selling anything?
Well then, what ARE you doing?
…
A fantastic poem that feels exactly like the subtext of every one of these emails I ever receive.
My blog is for me, first and foremost; I suspect Jo feels a similar way about their digital garden. I’m not interested in making money
with it, and I’m perfectly comfortable with the fact that it costs me money. These things are all fine. I don’t need an SEO merchant to tell me how they can improve it.
It makes me sad to see the gradual disappearance of the contact form from personal websites. They generally feel more convenient than email addresses, although this is
perhaps part of the reason that they come under attack from spammers in the first place! But also, they provide the potential for a new and different medium: the comments
area (and its outdated-but-beautiful cousin the guestbook).
Comments are, of course, an even more-obvious target for spammers because they can result in immediate feedback and additional readers for your message. Plus – if they’re allowed to
contain hyperlinks – a way of leeching some of the reputability off a legitimate site and redirecting it to the spammers’, in the eyes of search engines. Boo!
Well this was painful to write.
But I’ve got to admit: there have been many times that I’ve read an interesting article and not interacted with it simply because the bar to interaction (what… I have
to open my email client!?) was too high. I’d prefer to write a response on my blog and hope that webmention/pingback/trackback do their thing, but will they? I don’t know in
advance, unless the other party says so openly or I take a dive into their source code to check.
Your Experience May Vary
I’ve had both contact/comment forms and exposed email addresses on my website for many years… and I feel like I get approximately the same amount
of spam on both, after filtering. The vast majority of it gets “caught”. Here’s what works for me:
My contact/comments forms use one of a variety of unobtrusive “honeypot”-style traps (see the first sketch after this list). These “reverse CAPTCHAs” attempt to trick bots into interacting with them in some particular way while not inconveniencing humans.
Antispam Bee provides the first line of defence, but I’ve got a few tweaks of my own to help counteract the efforts of
determined spammers.
Once you’ve fallen into a honeypot it becomes much easier to block subsequent contacts with the same/similar content, address, (short-term) IP, or the poisoned cookie you’re given.
Keyword filtering provides a further line of defence. E.g. for contact forms that post directly back to the Web (i.e. comment forms, and perhaps a future guestbook form), content
with links goes into a moderation queue unless it shares a sender email with a previously-approved sender. For contact forms that result in an email, I’ve just got a few “scorer” rules
relating to geo IP, keywords, number and density of links, etc. that catch the most-insidious spam that somehow slips through.
I also publish email addresses all over the place, but they’re content-specific. Like Kev, I anticipated spam and so use unique email addresses on
different pieces of content: if you want to reply-by-email to this post, for example, you’re encouraged to use the address
b27404@danq.me. But this approach has actually provided secondary benefits that are more-valuable:
The “scrapers” that spam me by email would routinely send email to multiple different @danq.me addresses at the same time. Humans don’t send identical messages to multiple different addresses published on my site, from different senders, so my spam filter picks up on this right away (see the second sketch below).
As a fringe benefit, this helps me determine the topic of an email where it’s unclear. E.g. I’ve had humans email me to say “I tried to follow the guide on your page but it didn’t work for me”, and I wouldn’t have had a clue which page had they not reached out via a page-specific email alias.
I enjoy the potential offered by rotating the email address generation mechanism and later treating all previously-exposed addresses as email honeypots.
They’ve all got different “sender” addresses, but the fact that this series of emails was identical except for the different recipient aliases meant that catching them was very easy for my spam filters.
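Here’s the first sketch: a minimal, hypothetical honeypot check in Python. It illustrates the general technique rather than my actual implementation (the field name and threshold are invented):

```python
import time

# The form includes a field that's hidden from humans via CSS; bots that
# blindly fill in every field give themselves away. A render timestamp
# catches bots that submit faster than any human could possibly type.

def looks_like_a_bot(form: dict, rendered_at: float, now: float | None = None) -> bool:
    now = time.time() if now is None else now
    if form.get("website", ""):   # hidden "website" field: humans leave it blank
        return True
    if now - rendered_at < 3:     # submitted within 3 seconds of the page loading
        return True
    return False

# Usage: render the form with the hidden field and a server-supplied
# timestamp; on submission, quarantine anything that gets flagged.
assert looks_like_a_bot({"website": "https://spam.example"}, rendered_at=0, now=100)
assert not looks_like_a_bot({"website": ""}, rendered_at=0, now=100)
```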
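And the second sketch: a hypothetical Python version of the cross-alias correlation trick. Any alias here other than b27404@danq.me is invented for illustration:

```python
from collections import defaultdict
from hashlib import sha256

# If the same message body arrives addressed to several different
# per-page aliases, no human sent it: flag the lot as spam.

def find_cross_alias_spam(messages: list[dict]) -> set[str]:
    recipients_by_body = defaultdict(set)
    for msg in messages:
        body_hash = sha256(msg["body"].strip().lower().encode()).hexdigest()
        recipients_by_body[body_hash].add(msg["to"])
    # Any body sent to two or more distinct aliases gets flagged.
    return {h for h, rcpts in recipients_by_body.items() if len(rcpts) >= 2}

inbox = [
    {"to": "b27404@danq.me", "body": "Great SEO deal!!"},
    {"to": "a91122@danq.me", "body": "Great SEO deal!!"},   # invented alias
    {"to": "b27404@danq.me", "body": "Loved this post about spam filtering."},
]
print(find_cross_alias_spam(inbox))  # flags only the duplicated sales pitch
```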
Works For Me!
This strategy works for me: I get virtually no comment/contact form spam (though I do occasionally get a false positive and a human gets blocked as-if they were a robot), and very
little email spam (after my regular email filters have done their job, although again I sometimes get false positives, often where humans choose their subject lines poorly).
It might sound like my approach is complicated, but it’s really not. Adding a contact form honeypot is not significantly more-difficult than exposing automatically-rotating email
aliases, and for me it’s worth it: I love the convenience and ease-of-use of a good contact/comments form, and want to make that available to my visitors too!
(I also allow one-click reactions with emoji: did you see? Scroll down and send me a bumblebee! Nobody seems to have found a way to spam me with these, yet: it’s not a very expressive
medium, I guess!)
This week, I spent two days on a shoestring Internet connection, and it was pretty shit.
I’m not saying that these telecomms engineers, who were doing something in some of the nearby utility cabinets at the very moment our Internet connection dropped, were
responsible… but it’d make an amusing irony of their company name – Zero Loss – if they were.
As you might anticipate, we run a complicated network at our house, and so when my connection dropped a quarter of an hour into the beginning of three and a half hours of scheduled
meetings on a busy afternoon, my first thought was to check that everything was working locally. Internal traffic all seemed to be going the right way, so then I checked the primary
router and discovered that the problem was further upstream. I checked our fibre terminator, and sure enough: it said it wasn’t getting a signal.
I checked the status page for our ISP – no reported problems. So I called them up. I was pleased that (after I relayed what tests I’d done so far) they treated me like a network
specialist rather than somebody who needed hand-holding and we skipped over the usual “have you tried turning it off and on again” and got straight to some diagnosis and scheduling an
engineer for the next day. That’d do.
Our village has pretty weak cellular reception, and what little there is struggles to penetrate our walls, some of which are made of stone. And so for a little while, “leaning out of
the window” was the only way to get Internet access while (mostly) dodging the rain.
The end of a workday being ruined was a bit of a drag, but for Ruth it was definitely worse, as she was overseeing a major project the
following morning (from 5am!) and so needed to arrange for emergency out-of-hours access to her office for the next day to be able to make it work. As for me: I figured I’d be back
online by lunchtime, and working a little into the evening would give me a rare opportunity for an increased overlap with my team – many of whom are on the Pacific coast of the US – so
it’d all work out.
The engineer arrived the next morning, just as a storm hit. He traced the problem, waited for the rain to ease off, then stomped off up the street to get it fixed. Only a matter of time
now, I thought.
But nope: he came back to say that the fault lay somewhere under the road, in a spot that he couldn’t access by himself: it’d need a team of two engineers
to get down there and fix it, and they wouldn’t be able to come… until tomorrow.
So I went up to the attic to work, which is just about the only place in the house where – by balancing my phone against a window – I can consistently tether at 4G/5G. Well…
semi-consistently. Inconsistently enough to be immensely frustrating.
Earlier efforts to tether from downstairs were even less successful.
There’s this thing, I find: no Internet access is annoying, but tolerable.
Slow Internet access is similar.
But intermittent Internet access is, somehow, a nightmare. Applications hang or fail in unpredictable ways, their developers not having planned for the possibility that the connection they detected when they were opened might come and go at random. Shitty modern “web applications” that expect to download multiple megabytes of JavaScript before they
work show skeleton loaders and dehydrated <div>s that might one day grow up to be something approximating a button, link, or image. It’s just generally a pretty crap
experience.
It’s funny how we got so dependent upon the Internet. 26+ years ago, I used to
write most of my Web-destined PHP and Perl code “offline”! I’d dial-up to the Internet to download documentation or upload code, then work from my memory, from books, and what I’d
saved from the Web. Can you imagine asking a junior Web developer to do that today?
The second team of engineers were fortunate enough to arrive on a less-torrential day.
In a second ironic twist, a parcel arrived for me during our downtime which contained new network hardware with which I planned to eliminate a couple of WiFi weak spots at the edges of
our house. The new hardware worked perfectly and provided a wonderful improvement to signal strength between our computers… but of course not to computers outside of the
network.
There’s another interesting thing that’s changed over the decades. When I first started installing (bus!) networks, there was no assumption that the network would necessarily
provide Internet access. The principal purpose of the network was to connect the computers within the LAN to one another. This meant that staff could access one another’s
files more easily and make use of a shared printer without walking around carrying floppy disks, for example… or could frag one another at Doom and Quake at the LAN parties that I’d
sometimes run from my mum’s living room!
But nowadays, if you connect to a network (whether wired or wireless) there’s an expectation that it’ll provide Internet access. So much so, that if you join a wireless
network using your mobile phone and it doesn’t provide Internet access, your phone probably won’t route any traffic across it unless you specifically say that it should.
That’s a reasonable default, these days, but it’s an annoyance when – for example – I wanted my phone to continue using Syncthing to back up my photos
to my NAS even though the network that my NAS was on would no longer provide Internet access to my phone!
The second team of engineers quickly found and repaired a break in the fibre – apparently it was easier than the first engineer had expected – and normalcy returned to our household.
But for a couple of days, there, I was forcibly (and unpleasantly) reminded about how the world has changed since the time that “being on a network” wasn’t assumed to be
synonymous with “has Internet access”.
Today, for the first time ever, I simultaneously published a piece of content across five different media: a Weblog post, a video essay, a podcast episode, a Gemlog post, and a
Spartanlog post.
Must be about something important, right?
Nope, it’s a meandering journey to coming up with a design for a £5 coin that will never exist. Delightfully pointless. Being the Internet I want to see in the world.
A freaking excellent longread by Eevee (Evelyn Woods), lamenting the direction of popular technological progress and general enshittification of creator culture. It’s ultimately
uplifting, I feel, but it’s full of bitterness until it gets there. I’ve pulled out a couple of highlights to try to get you interested, but you should just go and read the entire thing:
…
And so the entire Web sort of congealed around a tiny handful of gigantic platforms that everyone on the fucking planet is on at once. Sometimes there is some sort of
partitioning, like Reddit. Sometimes there is not, like Twitter.
That’s… fine, I guess. Things centralize. It happens. You don’t get tubgirl spam raids so much any more, at least.
But the centralization poses a problem. See, the Web is free to look at (by default), but costs money to host. There are free hosts, yes, but those are for static
things getting like a thousand visitors a day, not interactive platforms serving a hundred million. That starts to cost a bit. Picture logs being shoveled into a steam
engine’s firebox, except it’s bundles of cash being shoveled into… the… uh… website hole.
…
I don’t want to help someone who opens with “I don’t know how to do this so I asked ChatGPT and it gave me these 200 lines but it doesn’t work”. I don’t want to know how much code
wasn’t actually written by anyone. I don’t want to hear how many of my colleagues think Whatever is equivalent to their own output.
…
I glimpsed someone on Twitter a few days ago, also scoffing at the idea that anyone would decide not to use the Whatever machine. I can’t remember exactly what they said,
but it was something like: “I created a whole album, complete with album art, in 3.5 hours. Why wouldn’t I use the make it easier machine?”
This is kind of darkly fascinating to me, because it gives rise to such an obvious question: if anyone can do that, then why listen to your music? It takes a
significant chunk of 3.5 hours just to listen to an album, so how much manual work was even done here? Apparently I can just go generate an endless stream of stuff of the
same quality! Why would I want your particular brand of Whatever?
Nobody seems to appreciate that if you can make a computer do something entirely on its own, then that becomes the baseline.
…
Do things. Make things. And then put them on your website so I can see them.
Clearly this all ties in to stuff that I’ve been thinking, lately. Expect more
posts and reposts in this vein, I guess?
Do you remember when your domestic ISP – Internet Service Provider – used to be an Internet Services Provider? They
were only sometimes actually called that, but what I mean is: when ISPs provided more than one Internet service? Not just connectivity, but… more.
One of the first ISPs I subscribed to had a “standard services” list longer than most modern ISPs’ complete services list!
ISPs twenty years ago
It used to just be expected that your ISP would provide you with not only an Internet connection, but also some or all of:
One or more email addresses
Usenet newsgroup access
Web space for hosting a personal home page
FTP space, and access to a shared FTP server full of software essentials1
A static IP address
A caching proxy server, an IRC node, even fax-to-email services
Sometimes even free or discounted software licenses
I don’t remember which of my early ISPs gave me a free license for HoTMetaL Pro, but I was very appreciative of it at the time.
ISPs today
The ISP I hinted at above doesn’t exist any more, after being bought out and bought out and bought out by a series of owners. But I checked the Website of the current owner to see what
their “standard services” are, and discovered that they are:
A WiFi router2
Optional 4G backup connectivity (for an extra fee)
A voucher for 3 months access to a streaming service3
The connection is faster, which is something, but we’re still talking about the “baseline” for home Internet access then-versus-now. Which feels a bit galling, considering that (a)
you’re clearly, objectively, getting fewer services, and (b) you’re paying more for them – a cheap basic home Internet subscription today, after accounting
for inflation, seems to cost about 25% more than it did in 2000.4
Are we getting a bum deal?
Not every BBS or ISP would ever come to support the blazing speeds of a 33.6kbps modem… but when you heard the distinctive scream of its negotiation at close to the Shannon Limit of
the piece of copper dangling outside your house… it felt like you were living in the future.
Would you even want those services?
Some of them were great conveniences at the time, but perhaps not-so-much now: a caching server, FTP site, or IRC node in the building right at the end of my
dial-up connection? That’s a speed boost that was welcome over a slow connection to an unencrypted service, but is redundant and ineffectual today. And if you’re still using a
fax-to-email service for any purpose, then I think you have bigger problems than your ISP’s feature list!
Some of them were things I wouldn’t have recommended that you depend on, even then: tying your email and Web hosting to your connectivity provider traded
one set of problems for another. A particular joy of an email address, as opposed to a postal address (or, back in the day, a phone number), is that it isn’t tied to where
you live. You can move to a different town or even to a different country and still have the same email address, and that’s a great thing! But it’s not something you can
guarantee if your email address is tied to the company you dial-up to from the family computer at home. A similar issue applies to Web hosting, although for a true traditional “personal home page” (a little information about yourself, and your bookmarks) it would be fine.
But some of them were things that were actually useful and I miss: honestly, it’s a pain to have to use a third-party service for newsgroup
access, which used to be so-commonplace that you’d turn your nose up at an ISP that didn’t offer it as standard. A static IP being non-standard on fixed connections is a sad reminder
that the ‘net continues to become less-participatory, more-centralised, and just generally more watered-down and shit: instead of your connection making you “part of” the Internet,
nowadays it lets you “connect to” the Internet, which is a very different experience.5
A page like this used to be absolutely standard on the Website6
of any ISP worth its salt.
Yeah, sure, you can set up a static site (unencumbered by any opinionated stack) for free on Github Pages, Neocities, or wherever, but the barrier to entry has been raised
by just enough that, doubtless, there are literally millions of people who would have taken that first step… but didn’t.
And that makes me sad.
Footnotes
1 ISP-provided shared FTP servers would also frequently provide locally-available copies
of Internet software essentials for a variety of platforms. This wasn’t just a time-saver – downloading Netscape Navigator from your ISP rather than from half-way across the world was
much faster! – it was also a way to discover new software, curated by people like you: a smidgen of the feel of a well-managed BBS, from the comfort of your local ISP!
2 ISP-provided routers are, in my experience, pretty crap 50% of the time… although
they’ve been improving over the last decade as consumers have started demanding that their WiFi works well, rather than just works.
3 These streaming services vouchers are probably just a loss-leader for the streaming
service, who know that you’ll likely renew at full price afterwards.
4 Okay, in 2000 you’d also have had to pay per-minute for the price of the
dial-up call… but that money went to BT (or perhaps Mercury or KCOM), not to your ISP. But my point still stands: in a world where technology has in general gotten cheaper
and backhaul capacity has become underutilised, why has the basic domestic Internet connection gotten less feature-rich and more-expensive? And often with worse
customer service, to boot.
5 The problem of your connection not making you “part of” the Internet is multiplied if
you suffer behind carrier-grade NAT, of course. But it feels like if we actually cared enough to commit to rolling out IPv6 everywhere we could obviate the need for that particular
turd entirely. And yet… I’ll bet that the ISPs who currently use it will continue to do so, even as they offer IPv6 addresses as-standard, because they buy into their own idea that
it’s what their customers want.
6 I think we can all be glad that we no longer write “Web Site” as two separate words, but
you’ll note that I still usually correctly capitalise Web (it’s a proper noun: it’s the Web, innit!).
Some time in the last 25 years, ISPs stopped saying they made you “part of” the Internet, just that they’d help you “connect to” the Internet.
Most people don’t need a static IP, sure. But when ISPs stopped offering FTP and WWW hosting as a standard feature (shit though it often was), they became part of the tragic process by
which the Internet became centralised, and commoditised, and corporate, and just generally watered-down.
The amount of effort to “put something online” didn’t increase by a lot, but it increased by enough that millions probably missed-out on the opportunity to create
their first homepage.
RFC 2119 establishes language around requirement levels. Terms like “MUST”, “MUST NOT”, “SHOULD”, and “SHOULD NOT” are helpful when coordinating with engineers. I reference it a lot
for work, as I create a lot of accessible component
specifications.
Because of this familiarity—and because I’m an ass—I fired back in Discord:
I want to hire a voice actor to read 2119 in the most over the top, passive-aggressive way possible
wait, this is an achievable goal oh no
It turns out you can just pay people to do things.
I found a voice actor and hired them with the task of “Reading this very dry technical document in the most over-the-top sarcastic, passive-aggressive, condescending way possible.
Like, if you think it’s too much, take that feeling, ignore it, and crank things up one more notch.”
…
RFC 2119 is one of the few RFCs I can identify by number alone, too. That, and RFCs 1945 and 1866, for some reason, and RFC 2822 (and I guess, by proxy, 822) because I’ve had to implement its shitty date format more times than I’d like to count.
The fundamental difference between streaming and downloading is what your device does with those frames of video:
Does it show them to you once and then throw them away? Or does it re-assemble them all back into a video file and save it into storage?
When you’re streaming on YouTube, the video player running on your computer retains a buffer of frames ahead and behind of your current position, so you can skip around easily: the
darker grey part of the timeline shows which parts of the video are stored on – that is, downloaded to – your computer.
Buffering is when your streaming player gets some number of frames “ahead” of where you’re watching, to give you some protection against connection issues. If your WiFi wobbles
for a moment, the buffer protects you from the video stopping completely for a few seconds.
But for buffering to work, your computer has to retain bits of the video. So in a very real sense, all streaming is downloading! The buffer is the part
of the stream that’s downloaded onto your computer right now. The question is: what happens to it next?
All streaming is downloading
So that’s the bottom line: if your computer deletes the frames of video it was storing in the buffer, we call that streaming. If it retains them in a file, we
call that downloading.
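If it helps to see that distinction as code, here’s an illustrative Python sketch (not any real player’s implementation): the entire difference between the two behaviours comes down to a single write.

```python
def render(chunk: bytes) -> None:
    """Stand-in for actually decoding and displaying a chunk of video."""
    print(f"showing {len(chunk)} bytes")

def play(stream, keep_a_copy: bool = False) -> None:
    # Every chunk is downloaded over the network either way; the only
    # question is what we do with each one after it's been shown.
    saved = open("video.mp4", "wb") if keep_a_copy else None
    for chunk in stream:
        render(chunk)
        if saved:
            saved.write(chunk)  # "downloading": the chunk is retained in a file
        # "streaming": the chunk goes out of scope here and is forgotten
    if saved:
        saved.close()

fake_stream = [b"\x00" * 1024] * 3
play(fake_stream)                    # what we call streaming
play(fake_stream, keep_a_copy=True)  # what we call downloading
```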
That definition introduces a philosophical problem. Remember that Vimeo checkbox that lets a creator decide whether people can (i.e. are allowed to) download their videos? Isn’t that somewhat meaningless if all streaming is downloading?
Because the difference between streaming and downloading comes down to whether the device belonging to the person watching the video deletes the media when they’re done. And in virtually all cases, that’s enforced on the honour system.
This kind of conversation happens, over the HTTP protocol, all the time. Probably most of the time the browser is telling the truth, but there’s no way to know for certain.
When your favourite streaming platform says that it’s only possible to stream, and not download, their media… or when they restrict “downloading” as an option to higher-cost paid plans…
they’re relying on the assumption that the user’s device can be trusted to delete the media when the user’s done watching it.
But a user who owns their own device, their own network, their own screen or speakers has many, many opportunities to not fulfil the promise of deleting the media after they’ve consumed it: to retain a “downloaded” copy for their own enjoyment, including:
Intercepting the media as it passes through their network on the way to its destination device
Using client software that’s been configured to stream-and-save, rather than stream-and-delete, the content
Modifying “secure” software (e.g. an official app) so that it retains a saved copy rather than deleting it
Capturing the stream buffer as it’s cached in device memory or on the device’s hard disk
Outputting the resulting media to a different device, e.g. using an HDMI capture device, and saving it there
Exploiting the “analogue4 hole”5: using a camera, microphone, etc. to make a copy of what comes out of the screen/speakers6
Okay, so I oversimplified (before you say “well, actually…”)
It’s not entirely true to say that streaming and downloading are identical, even with the caveat of “…from the server’s perspective”. There are three big exceptions worth
thinking about:
Exception #1: downloads can come in any order
When you stream some linear media, you expect the server to send the media in strict chronological order. Being able to start watching before the whole file has
downloaded is a big part of what makes streaming appealing, to the end-user. This means that media intended for streaming tends to be stored in a way that facilitates that
kind of delivery. For example:
Media designed for streaming will often be stored in linear chronological order in the file, which impacts what kinds of compression are available.
Media designed for streaming will generally use formats that put file metadata at the start of the file, so that it gets delivered first.
Video designed for streaming will often have frequent keyframes so that a client that starts “in the middle” can decode the buffer without downloading too much data.
No such limitation exists for files intended for downloading. If you’re not planning on watching a video until it’s completely downloaded, the order in which the chunks arrive is arbitrary!
But these limitations make the set of “files suitable for streaming” a subset of the set of “files suitable for downloading”. They only make it challenging or impossible to stream some media intended for downloading… they do nothing to prevent downloading of media intended for streaming.
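As a sketch of what “chunks in any order” looks like in practice, here’s some illustrative Python that fetches pieces of a file out of chronological order using HTTP Range requests and reassembles them afterwards. The URL is a placeholder, and it assumes the server honours Range headers (responding with “206 Partial Content”):

```python
import urllib.request

url = "https://example.com/video.mp4"  # placeholder: any Range-capable server

# Request the third megabyte first, then the first, then the second:
chunks = {}
for start, end in [(2_000_000, 2_999_999), (0, 999_999), (1_000_000, 1_999_999)]:
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as response:
        chunks[start] = response.read()

# A downloader doesn't care what order the chunks arrived in; it just
# reassembles them by offset once they're all present.
with open("video.mp4", "wb") as f:
    for start in sorted(chunks):
        f.write(chunks[start])
```

A streaming player can’t do this: it needs the first chunk first, which is exactly the constraint described above.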
Exception #2: streamed media is more-likely to be transcoded
A server that’s streaming media to a client exists in a sort-of dance: the client keeps the server updated on which “part” of the media it cares about, so the server can jump ahead,
throttle back, pause sending, etc. and the client’s buffer can be kept filled to the optimal level.
This dance also allows for a dynamic change in quality levels. You’ve probably seen this happen: you’re watching a video on YouTube and suddenly the quality “jumps” to something more
(or less) like a pile of LEGO bricks7. That’s the result of your device realising that the rate
at which it’s receiving data isn’t well-matched to the connection speed, and asking the server to send a different quality level8.
The server can – and some do! – pre-generate and store all of the different formats, but some servers will convert files (and particularly livestreams) on-the-fly, introducing
a few seconds’ delay in order to deliver the format that’s best-suited to the recipient9. That’s not necessary for downloads, where the
user will often want the highest-quality version of the media (and if they don’t, they’ll select the quality they want at the outset, before the download begins).
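A toy version of the client’s side of that dance might look like the following (the renditions and bitrates are invented for illustration): measure how quickly the last chunk arrived, then request the best quality level that comfortably fits within that throughput.

```python
# Hypothetical rendition table: video height -> required Mbit/s.
RENDITIONS = {240: 0.4, 480: 1.0, 720: 2.5, 1080: 5.0}

def choose_quality(chunk_bits: int, seconds_to_fetch: float, safety: float = 0.8) -> int:
    # Estimate throughput from the last chunk, keep a 20% safety margin,
    # and pick the best rendition that still fits.
    throughput_mbps = chunk_bits / seconds_to_fetch / 1_000_000
    affordable = [height for height, rate in RENDITIONS.items()
                  if rate <= throughput_mbps * safety]
    return max(affordable, default=min(RENDITIONS))  # never drop below the floor

# A 4-megabit chunk that took 2 seconds => ~2 Mbit/s => ask for 480p next.
print(choose_quality(4_000_000, 2.0))
```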
Exception #3: streamed media is more-likely to be encumbered with DRM
And then, of course, there’s DRM.
As streaming digital media has become the default way for many people to consume video and audio content, rights holders have engaged in a fundamentally-doomed10
arms race of implementing copy-protection strategies to attempt to prevent end-users from retaining usable downloaded copies of streamed media.
Take HDCP, for example, which e.g. Netflix use for their 4K streams. To download these streams, your device has to be running some decryption code that only works if it can trace a path
to the screen that it’ll be outputting to that also supports HDCP, and both your device and that screen promise that they’re definitely only going to show it and not make it
possible to save the video. And then that promise is enforced by Digital Content Protection LLC only granting a decryption key and a license to use it to manufacturers.11
The real hackers do stuff with software, but people who just want their screens to work properly in spite of HDCP can just buy boxes like this (which I bought for a couple of quid on
eBay). Obviously you could use something like this and a capture card to allow you to download content that was “protected” to ensure that you could only stream it, I suppose, too.
Anyway, the bottom line is that all streaming is, by definition, downloading, and the only significant difference between what people call “streaming” and
“downloading” is that when “streaming” there’s an expectation that the recipient will delete, and not retain, a copy of the video. And that’s it.
Footnotes
1 This isn’t the question I expected to be answering. I made the animation in this post
for use in a different article, but that one hasn’t come together yet, so I thought I’d write about the technical difference between streaming and downloading as an excuse to
use it already, while it still feels fresh.
2 I’m using the example of a video, but this same principle applies to any linear media
that you might stream: that could be a video on Netflix, a livestream on Twitch, a meeting in Zoom, a song in Spotify, or a radio show in iPlayer, for example: these are all examples
of media streaming… and – as I argue – they’re therefore also all examples of media downloading because streaming and downloading are fundamentally the same thing.
3 There are a few simplifications in the first half of this post: I’ll tackle them later
on. For the time being, when I say sweeping words like “every”, just imagine there’s a little footnote that says, “well, actually…”, which will save you from feeling like you have to
say so in the comments.
4 Per my style guide, I’m using the British English
spelling of “analogue”, rather than the American English “analog” which you’ll often find elsewhere on the Web when talking about the analog hole.
5 The rich history of exploiting the analogue hole spans everything from bootlegging a
1970s Led Zeppelin concert by smuggling recording equipment
in inside a wheelchair (definitely, y’know, to help topple the USSR and not just to listen to at home while you get high)
to “camming” by bribing your friendly local projectionist to let you set up a video camera at the back of the cinema for their test screening of the new blockbuster. Until some
corporation tricks us into installing memory-erasing DRM chips into our brains (hey, there’s a dystopic sci-fi story idea in there somewhere!) the analogue hole will always be
exploitable.
6 One might argue that recreating a piece of art from memory, after the fact, is a
very-specific and unusual exploitation of the analogue hole: the one that allows us to remember (or “download”) information to our brains rather than letting it “stream” right
through. There’s evidence to suggest that people pirated Shakespeare’s plays this way!
7 Of course, if you’re watching The LEGO Movie, what you’re seeing might already
look like a pile of LEGO bricks.
8 There are other ways in which the client and server may negotiate, too: for example,
what encoding formats are supported by your device.
9 My NAS does live transcoding when Jellyfin streams to devices on my network, and it’s magical!
10 There’s always the analogue hole, remember! Although in practice this isn’t even
remotely necessary and most video media gets ripped some-other-way by clever pirate types even where it uses highly-sophisticated DRM strategies, and then ultimately it’s only
legitimate users who end up suffering as a result of DRM’s burden. It’s almost as if it’s just, y’know, simply a bad idea in the first place, or something. Who knew?
11 Like all these technologies, HDCP was cracked almost immediately and every
subsequent version that’s seen widespread rollout has similarly been broken by clever hacker types. Legitimate, paying users find themselves disadvantaged when their laptop won’t let
them use their external monitor to watch a movie, while the bad guys make pirated copies that work fine on anything. I don’t think anybody wins, here.
On Wednesday, Vodafone
announced that they’d made the first ever satellite video call from a stock mobile phone in an area with no terrestrial signal. They used a mountain in Wales for their experiment.
It reminded me of an experiment of my own, way back in around 1999, which I probably should have made a bigger deal of. I believe that I was the first person to ever send an email from
the top of Yr Wyddfa/Snowdon.
Nowadays, that’s an easy thing to do. You pull your phone out and send it. But back then, I needed to use a Psion 5mx palmtop, communicating over an infrared link using a custom driver
(if you ever wondered why I know my AT-commands by heart… well, this isn’t exactly why, but it’s a better story than the truth) to a Nokia 7110 (fortunately it was cloudy enough to not
interfere with the 9,600 baud IrDA connection while I positioned the devices atop the trig point), which engaged a GSM 2G connection, over which I was able to send an email to myself,
cc:’d to a few friends.
It’s not an exciting story. It’s not even much of a claim to fame. But there you have it: I was (probably) the first person to send an email from the summit of Yr Wyddfa. (If you beat
me to it, let me know!)
the world needs more recreational programming.
like, was this the most optimal or elegant way to code this?
no, but it was the most fun to write.
Yes. This.
As Baz Luhrmann didn’t say, despite the implications of this video: code one thing every day that amuses you.
There is no greater way to protest the monetisation of the Web, the descent into surveillance capitalism, and the monoculture of centralised social media silos… than to create things
just for the hell of it. Maybe that’s Kirby eating a blog post. Maybe that’s whatever slippy stuff Lu put out this week. Maybe it’s a podcast exclusively about random things that interest one person.
The pre-corporate Web was fun and weird. Nowadays, keeping the Internet fun and weird is relegated to a
counterculture.
But making things (whether code, or writing, or videos, or whatever) “just because” is a critical part of that counterculture. It’s beautiful art flying in the face of rampant
commercialism. The Web provides a platform where you can share what you create: it doesn’t have to be good, or original, or world-changing… there’s value in just creating and giving
things away.
Six or seven years ago our eldest child, then a preschooler, drew me a picture of the Internet1. I framed it and I
keep it on the landing outside my bedroom – y’know, in case I get lost on the Internet and need a map:
Lots of circles, all connected to one another, passing zeroes and ones around. Around this time she’d observed that I wrote my number zeroes “programmer-style” (crossed) and clearly
tried to emulate this, too.
I found myself reminded of this piece of childhood art when she once again helped me with a network map, this weekend.
As I kick off my Automattic sabbatical I’m aiming to spend some of this and next month building a new server architecture for Three Rings. To share my plans, this weekend, I’d
been drawing network diagrams showing my fellow volunteers what I was planning to implement. Later, our eldest swooped in and decided to “enhance” the picture with faces and names for
each server:
I don’t think she intended it, but she’s made the primary application servers look the grumpiest. This might well fit with my experience of those servers, too.
I noted that she named the read-replica database server Demmy2, after our dog.
You might have come across our dog before, if you followed me through Bleptember.
It’s a cute name for a server, but I don’t think I’m going to follow it. The last thing I want is for her to overhear me complaining about some possible future server problem and
misinterpret what I’m saying. “Demmy is a bit slow; can you give her a kick,” could easily cause distress, not to mention “Demmy’s dying; can we spin up a replacement?”
I’ll stick to more-conventional server names for this new cluster, I think.
Footnotes
1 She spelled it “the Itnet”, but still: max props to her for thinking “what would
he like a picture of… oh; he likes the Internet! I’ll draw him that!”
2 She also drew ears and a snout on the Demmy-server, in case the identity wasn’t clear to
me!