This Page is Designed to Last: A Manifesto for Preserving Content on the Web

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

How do we make web content that can last and be maintained for at least 10 years? As someone studying human-computer interaction, I naturally think of the stakeholders we aren’t supporting. Right now putting up web content is optimized for either the professional web developer (who use the latest frameworks and workflows) or the non-tech savvy user (who use a platform).

But I think we should consider both 1) the casual web content “maintainer”, someone who doesn’t constantly stay up to date with the latest web technologies, which means the website needs to have low maintenance needs; 2) and the crawlers who preserve the content and personal archivers, the “archiver”, which means the website should be easy to save and interpret.

So my proposal is seven unconventional guidelines in how we handle websites designed to be informative, to make them easy to maintain and preserve. The guiding intention is that the maintainer will try to keep the website up for at least 10 years, maybe even 20 or 30 years. These are not controversial views necessarily, but are aspirations that are not mainstream—a manifesto for a long-lasting website.

This page is designed to last, too. In fact, virtually every post of any type I’ve made to this blog (since 2003, older content may vary) has been designed with the intention that it ought to be accessible without dependence on CSS, Javascript, nor any proprietary technology, that the code should be as human-readable as posssible, and that the site itself should be as “archivable” as possible, just as a matter of course.

But that’s only 15 years of dedicated effort to longevity and I’ve still not achieved 100% success! For example, consider my blog post of 14 December 2003, describing the preceeding Troma Night, whose content was lost during the great server failure of July 2004 and for which the backups were unable to completely describe. I’m more-careful now, with more redundancies and backups, but it’s still always going to be the case that a sufficiently-devastating set of simultaneous failures could take this content away. All information has fragility and we can work to mitigate it but we can never completely solve it.

The large number of dead outbound links on the older parts of my site is both worrying – that most others don’t seem to have the same level of commitment to the retention of articles on the Web – and reassuring – that I’m doing significantly better than the average. So next, I guess, I need to focus my attention – like Jeff is – on how we can make such efforts usable by other people, too. The Web belongs to all of us, after all.