I didn’t set out with the aim of getting to a hundred4, as I might well
manage tomorrow, but after a while I began to think it a real possibility, particularly as a few different factors came together:
Travel’s given me more opportunity for geocaching (and, this last week, geohashing), as reflected in my copious checkin logs for that period.
Earlier this year, inspired by Clayton Errington, I came up with a process to streamline my mobile blogging
“flow”5. I now use a custom
Progressive Web App to provide a better interface for quickly posting on-the-move to one or both of this blog and my personal Mastodon account,
which I tested heavily during Bleptember.
Previous long streaks have sometimes been aided by pre-writing posts in bulk and then scheduling them to come out one-a-day6.
I mostly don’t do that any more: when a post is “ready”, it gets published.
I didn’t want to make a “this is my 100th day of consecutive blogging” post on the 100th day. That attaches too much weight to the nice round number. But I wanted to post to
acknowledge that I’m going to make it to 100 days of consecutive blogging… so long as I can think of something worth saying tomorrow. I guess we’ll all have to wait and see.
Footnotes
1 Given that I’ve been blogging for over 26 years, that I’m still finding noteworthy
blogging “firsts” is pretty cool, I think.
2 My previous record “streak” was only 37 days, so there’s quite a leap there.
3 A massive 219 posts are represented over the last 99 days: that’s an average of over 2 a
day!
theimprobable.blog, which I look after on behalf of my partner’s brother after using it to GPS-track his adventures
I think that’s all of them, but it’s hard to be sure…
Footnotes
1 Maybe I’ve finally shaken off my habit of buying a domain name for everything.
Or maybe it’s just that I’ve embraced subdomains for more stuff. Probably the latter.
Setting up and debugging your FreshRSS XPath Scraper
Okay, so here’s Adam’s blog. I’ve checked, and there’s no RSS feed1, so it’s time to start planning my XPath Scraper. The first thing I want to do is to find some way of identifying the “posts” on the page. Sometimes people use
solid, logical id="..." and class="..." attributes, but I’m going to need to use my browser’s “Inspect Element” tool to check:
The next thing that’s worth checking is that the content you’re inspecting is delivered with the page, and not loaded later using JavaScript. FreshRSS’s XPath Scraper works with the raw
HTML/XML that’s delivered to it; it doesn’t execute any JavaScript2,
so I use “View Source” and quickly search to see that the content I’m looking for is there, too.
Now it’s time to try and write some XPath queries. Luckily, your browser is here to help! If you pop up your debug console, you’ll discover that you’ve probably got a predefined
function, $x(...), to which you can pass a string containing an XPath query and get back a NodeList of the matching elements.
First, I’ll try getting all of the links inside the #posts section by running $x( '//*[@id="posts"]//a' ) –
In my first attempt, I discovered that I got not only all the posts… but also the “tags” at the top. That’s no good. Inspecting the URLs of each, I noticed that the post URLs all
contained /posts/, so I filtered my query down to $x( '//*[@id="posts"]//a[contains(@href, "/posts/")]' ) which gave me the
expected number of results. That gives me //*[@id="posts"]//a[contains(@href, "/posts/")]
as the XPath query for “news items”:
Obviously, this link points to the full post, so that tells me I can put ./@href as the “item link” attribute in FreshRSS.
Next, it’s time to see what other metadata I can extract from each post to help FreshRSS along:
Inspecting the post titles shows that they’re <h3>s. Running $x( '//*[@id="posts"]//a[contains(@href, "/posts/")]//h3' ) gets them.
Within FreshRSS, everything “within” a post is referenced relative to the post, so I convert this to descendant::h3 for my “XPath (relative to item) for Item
Title:” attribute.
Inspecting within the post summary content, it’s… not great for scraping. The elements’ class names don’t correspond to what the content is4: it looks like Adam’s using a utility class library5.
Everything within the <a> that we’ve found is wrapped in a <div class="flex-grow">. But within that, I can see that the date is
directly inside a <p>, whereas the summary content is inside a <p> within a <div class="mb-2">. I don’t want my code to
be too fragile, and I think it’s more-likely that Adam will change the class names than the structure, so I’ll tie my queries to the structure. That gives me
descendant::div/p for the date and descendant::div/div/p for the “content”. All that remains is to tell FreshRSS that Adam’s using F j, Y as his
date format (long month name, space, short day number, comma, space, long year number) so it knows how to parse those dates, and the feed’s good.
If it’s wrong and I need to change anything in FreshRSS, the “Reload Articles” button can be used to force it to re-load the most-recent X posts. Useful if you need to tweak things. In
my case, I’ve also set the “Article CSS selector on original website” field to article so that the full post text can be pulled into my reader rather than having to visit
the actual site. Then I’m done!
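For reference, here’s the complete configuration this process produced (field labels paraphrased; your FreshRSS version’s wording may differ slightly):

XPath expression for news items:            //*[@id="posts"]//a[contains(@href, "/posts/")]
XPath (relative to item) for Item Title:    descendant::h3
XPath (relative to item) for Item Link:     ./@href
XPath (relative to item) for Item Date:     descendant::div/p
XPath (relative to item) for Item Content:  descendant::div/div/p
Custom date/time format:                    F j, Y
Article CSS selector on original website:   article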
Takeaways
Use Inspect Element to find the elements you want to scrape for.
Use $x( ... ) to test your XPath expressions.
Remember that most of FreshRSS’s fields ask for expressions relative to the news item and adapt accordingly.
If you make a mistake, use “Reload Articles” to pull them again.
2 If you need a scraper that executes JavaScript, you need something more-sophisticated. I
used to use my very own RSSey for this purpose but nowadays XPath Scraping is sufficient so I don’t bother any more, but RSSey might be a
good starting point for you if you really need that kind of power!
3 If you’ve not had the chance to think about it before: View Source shows you the actual
HTML code that was delivered from the web server to your browser. This then gets interpreted by the browser to generate the DOM, which might result in changes to it: for example,
invalid elements might be removed, ambiguous markup will have an interpretation applied, and so on. The DOM might further change as a result of JavaScript code, browser plugins, and
whatever else. When you Inspect Element, you’re looking at the DOM (represented “as if” it were HTML), not the actual underlying HTML.
4 The date isn’t in a <time> element nor does it have a class like
.post--date or similar.
5 I’ll spare you my thoughts on utility class libraries for now, but they’re… not
positive. I can see why people use them, and I’ve even used them myself before… but I don’t think they’re a good thing.
Look at the following list of words and try to find the intruder:
wp-activate.php
wp-admin
wp-blog-header.php
wp_commentmeta
wp_comments
wp-comments-post.php
wp-config-sample.php
wp-content
wp-cron.php
wp engine
wp-includes
wp_jetpack_sync_queue
wp_links
wp-links-opml.php
wp-load.php
wp-login.php
wp-mail.php
wp_options
wp_postmeta
wp_posts
wp-settings.php
wp-signup.php
wp_term_relationships
wp_term_taxonomy
wp_termmeta
wp_terms
wp-trackback.php
wp_usermeta
wp_users
What are these words?
Well, all the ones that contain an underscore _ are names of the WordPress core database tables. All the ones that contain a dash - are WordPress core file
or folder names. The one with a space is a company name…
…
A smart (if slightly tongue-in-cheek) observation by my colleague Paolo, there. The rest of his article’s cleverer and worth-reading if you’re following the WordPress Drama (but it’s
pretty long!).
tl;dr: I’m tidying up and consolidating my personal hosting; I’ve made a little progress, but I’ve got a way to go – fortunately I’ve got a sabbatical coming up at
work!
At the weekend, I kicked-off what will doubtless be a multi-week process of gradually tidying and consolidating some of the disparate digital things I run around the Internet.
I’ve a long-standing habit of having an idea (e.g. gamebook-making tool Twinebook, lockpicking puzzle game Break Into Us, my Cheating Hangman game, and even FreeDeedPoll.org.uk!),
deploying it to one of several servers I run, and then finding it a huge headache when I inevitably need to upgrade or move said server because there’s such an insane diversity of
different things that need testing!
I can simplify, I figured. So I did.
And in doing so, I rediscovered several old projects I’d neglected or forgotten about. I wonder if anybody’s still using any of them?
DNDle, my Wordle-clone where you have to guess the Dungeons & Dragons 5e monster’s stat block, is now hosted by GitHub Pages. Also, I
fixed an issue reported a month ago that meant that I was reporting Giant Scorpions as having a WIS of 19 instead of 9.
Abnib, which mostly reminds people of upcoming birthdays and serves as a dumping ground for any Abnib-related shit I produce, is now hosted by
GitHub Pages.
RockMonkey.org.uk, which doesn’t really do much any more, is now hosted by GitHub Pages.
Sour Grapes, the single-page promo for a (remote) murder mystery party I hosted during a COVID lockdown, is now hosted by GitHub
Pages.
A convenience-page for giving lost people directions to my house is now hosted by GitHub Pages.
Dan Q’s Things is now automatically built on a schedule and hosted by GitHub Pages.
Robin’s Improbable Blog, which spun out from 52 Reflect, wasn’t getting enough traffic to justify
“proper” hosting so now it sits in a Docker container on my NAS.
My μlogger server, which records my location based on pings from my phone, has also moved to my NAS. This has broken
Find Dan Q, but I’m not sure if I’ll continue with that in its current form anyway.
All of my various domain/subdomain redirects have been consolidated on, or are in the process of moving to, a tiny Linode/Akamai
instance. It’s a super simple plain Nginx server that does virtually nothing except redirect people – this is where I’ll park the domains I register but haven’t found a use for yet, in
future.
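In case you’re wondering quite how simple such a redirect server can be: each parked domain needs only a tiny Nginx server block along these lines (domain name illustrative, not one of mine):

server {
    listen 80;
    server_name example-parked-domain.co.uk;
    # Send everything, path included, to the canonical site:
    return 301 https://danq.me$request_uri;
}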
It turns out GitHub Pages is a fine place to host simple, static websites that were open-source already. I’ve been working on improving my understanding of GitHub Actions
anyway as part of what I’ve been doing while wearing my work, volunteering, and personal hats, so switching some static build processes like DNDle’s to GitHub
Actions was a useful exercise.
Stuff I’m still to tidy…
There are still a few things I need to tidy up to bring my personal hosting situation under control:
DanQ.me
This is the big one, because it’s not just a WordPress blog: it’s also a Gemini, Spartan, and Gopher server (thanks CapsulePress!), a Finger server, and a general-purpose host to a stack of complex stuff, only some of which is powered by Bloq (my WordPress/PHP integrations): e.g.
code to generate the maps that appear on my geopositioned posts, code to integrate with the Fediverse, a whole stack of configuration to make my caching work the way I want, etc.
FreeDeedPoll.org.uk
Right now this is a Ruby/Sinatra application, but I’ve got a (long-running) development branch that will make it run completely in the browser, which will further improve privacy, allow
it to run entirely-offline (with a service worker), and provide a basis for new features I’d like to provide down the line. I’m hoping to get to finishing this during my Automattic
sabbatical this winter.
A secondary benefit of it becoming browser-based, of course, is that it can be hosted as a static site, which will allow me to move it to GitHub Pages too.
Geohashing.site
When I took over running the world’s geohashing hub from xkcd‘s Randall Munroe (and davean), I flung the site together on whatever hosting I had sitting
around at the time, but that’s given me some headaches. The outbound email transfer agent is a pain, for example, and it’s a hard host on which to apply upgrades. So I want to get that
moved somewhere better this winter too. It’s actually the last site left running on its current host, so it’ll save me a little money to get it moved, too!
FreshRSS
Right now I run this on my NAS, but that turns out to be a pain sometimes because it means that if my home Internet goes down (e.g. thanks to a power cut, which we have from time to time), I lose access to the first and last place I
go on the Internet! So I’d quite like to move that to somewhere on the open Internet. Haven’t worked out where yet.
Next steps
It’s felt good so far to consolidate and tidy-up my personal web hosting (and to rediscover some old projects I’d forgotten about). There’s work still to do, but I’m expecting to spend
a few months not-doing-my-day-job very soon, so I’m hoping to find the opportunity to finish it then!
Maintaining a blog can be a lot of work. A single article can take weeks of research, drafting and editing, collecting and producing included materials, etc. It’s not unusual to
seek some form of compensation for it, and those rewards require initiative. With a good monetization strategy, it can become a fairly
lucrative venture.
So let’s talk about monetizing a blog, starting with the most obvious and perhaps easiest avenue: display advertising.
A content creator with an established audience can leverage that audience and sell ad space on their blog. Here’s an example:
…
I’m not sure I have words for how awesome this blog post is. If you’ve ever wanted to monetise your blog and are considering an ad-driven model, this should absolutely be the first (and
perhaps last) thing you read on the subject.
If you’re not convinced that Tyler is an appropriate authority to speak on this subject, I highly suggest you visit their other site that’s got a wealth of useful tips, PutAToothpickInTheChargingPortDoctorsHateThatShit.christmas. Yes, really.
If the most useful thing I achieve this Bank Holiday Monday will have been to make it easier to post short geotagged notes from my mobile to my blog (and Mastodon), it will have been a
success.
This has been a test post. Feel free to ignore it.
I used to pay for VaultPress. Nowadays I get it for free as one of the many awesome perks of my job. But I’d probably still pay for it
because it’s a lifesaver.
Like my occasional video content, this isn’t designed to replace any of my blogging: it’s just a different medium for those that might prefer it.
For some stories, I guess that audio might be a better way to find out what I’ve been thinking about. Just like how the vlog version of my post about
my favourite video game Easter Egg might be preferable because video as a medium is better suited to demonstrating a computer game, perhaps
audio’s the right medium for some of the things I write about, too?
But as much as not, it’s just a continuation of my efforts to explore different media over which a WordPress blog can be delivered2.
Also, y’know, my ongoing effort to do what I’m bad at in the hope that I might get better at a wider diversity of skills.
How?
Let’s start by understanding what a “podcast” actually is. It is, in essence, just an RSS feed (something you might have heard me talk about before…) with audio enclosures – basically, “attachments” – on each item. The idea was spearheaded by Dave Winer back in 2001 as a
way of subscribing to rich media like audio or videos in such a way that slow Internet connections could pre-download content so you didn’t have to wait for it to buffer.3
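To illustrate: a podcast feed item is just a regular feed item with a media “attachment”. Something like this (URL and length invented for illustration):

<item>
  <title>Example episode</title>
  <link>https://example.com/example-episode/</link>
  <enclosure url="https://example.com/example-episode.mp3"
             length="12345678" type="audio/mpeg" />
</item>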
Here’s what I had to do to add podcasting capability to my theme:
The tag
I use a post tag, dancast, to represent posts with accompanying podcast content4.
This way, I can add all the podcast-specific metadata only if the user requests the feed of that tag, and leave my regular feeds untampered-with. This means that you don’t
get the podcast enclosures in the regular subscription; that might not be what everybody would want, but it suits me to serve podcasts only to people who explicitly ask for
them.
Okay, onto the code (which I’ve open-sourced over here). I’ve used a series of standard WordPress hooks to
add the functionality I need (there’s a minimal sketch of how they wire together after this list). The important bits are:
rss2_item – to add the <enclosure>, <itunes:duration>, <itunes:image>, and
<itunes:explicit> elements to the feed, when requesting a feed with my nominated tag. Only <enclosure> is strictly required, but appeasing Apple
Podcasts is worthwhile too. These are lifted directly from the post metadata.
the_excerpt_rss – I have another piece of post metadata in which I can add a description of the podcast (in practice, a list of chapter times); this hook
swaps out the existing excerpt for my custom one in podcast feeds.
rss_enclosure – some podcast syndication platforms and players can’t cope with RSS feeds in which an item has multiple enclosures, so as a
safety precaution I strip out any enclosures that WordPress has already added (e.g. the featured image).
the_content_feed – my RSS feed usually contains the full text of every post, because I don’t like feeds that try to force you to go to the
original web page5
and I don’t want to impose that on others. But for the podcast feed, the text content of the post is somewhat redundant so I drop it.
rss2_ns – of critical importance, of course, is adding the relevant namespaces to your feed’s root element. I use the itunes namespace, which provides the widest compatibility for specifying metadata, but I also use the
newer podcast namespace, which has growing compatibility and provides some modern features, most of which I don’t
use except specifying a license. There’s no harm in supporting both.
rss2_head – here’s where I put in the metadata for the podcast as a whole: license, category, type, and so on. Some of these fields are
effectively essential for best support.
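Here’s that minimal sketch I promised. It’s simplified, not lifted verbatim from my implementation (see the open-sourced code for the real thing), and the post metadata keys are illustrative:

// Helper: are we rendering the podcast tag's RSS feed?
function is_podcast_feed() {
	return is_feed() && is_tag( 'dancast' );
}

// Add the itunes/podcast namespaces to the <rss> element:
add_action( 'rss2_ns', function () {
	if ( ! is_podcast_feed() ) return;
	echo 'xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" ';
	echo 'xmlns:podcast="https://podcastindex.org/namespace/1.0" ';
} );

// Attach the audio file to each item in the podcast feed:
add_action( 'rss2_item', function () {
	if ( ! is_podcast_feed() ) return;
	$url = get_post_meta( get_the_ID(), 'podcast_url', true ); // illustrative meta key
	if ( ! $url ) return;
	printf(
		'<enclosure url="%s" length="%d" type="audio/mpeg" />',
		esc_url( $url ),
		(int) get_post_meta( get_the_ID(), 'podcast_bytes', true ) // illustrative meta key
	);
} );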
You’re welcome, of course, to lift any or all of the code for your own purposes. WordPress makes a perfectly reasonable platform for podcasting-alongside-blogging, in my experience.
What?
Finally, there’s the question of what to podcast about.
My intention is to use podcasting as an alternative medium to my traditional blog posts. But not every blog post is suitable for conversion into a podcast! Ones that rely on images
(like my post about dithering) aren’t a great choice. Ones that have lots of code that you might like to copy-and-paste are especially unsuitable.
Also: sometimes I just can’t be bothered. It’s already some level of effort to write a blog post; it’s like an extra 25% effort on top of that to record, edit, and upload a podcast
version of it.
That’s not nothing, so I’ve tended to reserve podcasts for blog posts that I think have a sort-of eccentric “general interest” vibe to them. When I learn something new and feel the need
to write a thousand words about it… that’s the kind of content that makes it into a podcast episode.
Which is why I’ve been calling the endeavour “a podcast nobody asked for, about things only Dan Q cares about”. I’m capable of getting nerdsniped
easily and can quickly find my way down a rabbit hole of learning. My podcast is, I guess, just a way of sharing my passion for trivial deep dives with the rest of the world.
My episodes are probably shorter than most podcasts: my longest so far is around fifteen minutes, but my shortest is only two and a half minutes and most are about seven. They’re meant
to be a bite-size alternative to reading a post for people who prefer to put things in their ears than into their eyes.
Anyway: if you’re not listening already, you can subscribe from here or in your favourite podcasting app. Or you can just follow my blog as normal
and look for a streamable copy of podcasts at the top of selected posts (like this one!).
2 As well as Web-based non-textual content like audio (podcasts) and video (vlogs), my blog is wholly or partially available over a variety of more-exotic protocols: did you find me yet on Gemini (gemini://danq.me/), Spartan (spartan://danq.me/), Gopher (gopher://danq.me/), and even Finger
(finger://danq.me/, or run e.g. finger blog@danq.me from your command line)? Most of these are powered by my very own tool CapsulePress, and I’m itching to try a few more… how about a WordPress blog that’s accessible over FTP, NNTP, or DNS? I’m not even kidding when I say
I’ve got ideas for these…
3 Nowadays, we have specialised media decoder co-processors which reduce the size of media
files. But more-importantly, today’s high-speed always-on Internet connections mean that you probably rarely need to make a conscious choice between streaming or downloading.
4 I actually intended to change the tag to podcast when I went-live,
but then I forgot, and now I can’t be bothered to change it. It’s only for my convenience, after all!
Runners will talk about how much they enjoy the feeling of wind in their hair. Boxers won’t shut up about the grace and art of their profession. Even soccer players can be moved to
wax poetical about how enjoyable it is to be part of a truly great game.
But all golfers ever talk about is how little golf they hope to play. A typical pre-match interview will go something like this:
Some guy in a blazer: Great to have you here with us, what are your goals for the first round this morning.
Golfer: Well today I hope to play as little golf as possible. Mathematically speaking the course could be done in 18 shots but that is probably physically
impossible. But ideally as close to 18 as I can get. Any additional golf is bad.
Blazer: What is your strategy for avoiding the golf.
Golfer: I have a guy who follows me around to help share the burden of all this damn golf. He is going to help me out by suggesting ways to avoid playing any more
golf than we have to. Of course, I pay him but his real motivation is to bring this sorry excuse for a pastime to the speediest conclusion.
Blazer: Better you than me, but good luck out there.
Why must a blog comment be text? Why could it not be… a drawing?1
I started hacking about and playing with a few ideas and now, on selected posts including this one, you can draw me a comment instead of typing one.
I opened the feature, experimentally (in a post available only to RSS subscribers2) the
other week, but now you get a go! Also, I’ve open-sourced the whole thing, in case you want to pick it apart.
What are you waiting for: scroll down, and draw me a comment!
Footnotes
1 I totally know the reasons that a blog comment shouldn’t be a drawing; I’m not
completely oblivious. Firstly, it’s less-expressive: words are versatile and you can do a lot with them. Secondly, it’s higher-bandwidth: images take up more space, take longer to
transmit, and that effect compounds when – like me – you’re tracking animation data too. But the single biggest reason, and I can’t stress this enough, is… the
penises. If you invite people to draw pictures on your blog, you’re gonna see a lot of penises. Short penises, long penises, fat penises, thin penises. Penises of every shape
and size. Some erect and some flaccid. Some intact and some circumcised. Some with hairy balls and some shaved. Many of them urinating or ejaculating. Maybe even a few with smiley
faces. And short of some kind of image-categorisation AI thing, you can’t realistically run an anti-spam tool to detect hand-drawn penises.
2 I’ve copied a few of my favourites of their drawings below. Don’t forget to subscribe if you want early access to any weird shit I make.
In anticipation of WWW Day on 1 August, some work colleagues and I were
sharing pictures of the first (or early) websites we worked on. I was pleased to be able to pull out a screenshot of how my blog looked back in 1999!
Because I’m such a digital preservationist, many of those ancient posts are still available on my blog, so I also shared a photo of me browsing the same content on my
blog as it is today, side-by-side with that 25+-year-old screenshot.1
I’ve even applied img { image-rendering: crisp-edges; } to try to compensate for modern browsers’ capability for subpixel rendering when rescaling images: let them
eat pixels!5
I’ve added 1999 Mode to my April Fools gags so, like this year, if you happen to visit my site on or around 1 April,
there’s a chance you’ll see it in 1999 mode anyway. What fun!
I think there’s a possible future blog post about Web design challenges of the 1990s. Things like: what if the user agent doesn’t support images? What if it supports GIFs, but not
animated ones (some browsers would just show the first frame, so you’d want to choose your first frame appropriately)? How do I ensure that people see the right content if they skip my
frameset? Which browser-specific features can I safely use, and where do I need a fallback6? Will this
work well on all resolutions down to 640×480 (minus browser chrome)? And so on.
Any interest in that particular rabbit hole of digital history?
Footnotes
1 Some of the addresses have changed, but from Summer 2003 onwards I’ve had a solid chain
of redirects in place to try to keep content available via whatever address it was at. Because Cool URIs Don’t Change. This occasionally turns out to be useful!
2 Actually, the entire theme is just a CSS change, so no tables are added. But I’ve tried to make it look like I’m using tables for layout, because that (and spacer GIFs) were all
we had back in the day.
3 Obviously the title saying “Dan Q” is modern, because that
wasn’t even my name back then, but this is more a reimagining of how my site would have looked if I were transported back to 1999 and made to do it all again.
4 I was slightly obsessed for a couple of years in the late 90s with flaming text on black
marble backgrounds. The hit counter in my screenshot above – with numbers on fire – was one I made, not a third-party one; and because mine was the only one of my friends’
hosts that would let me run CGIs, my Perl script powered the hit counters for most of my friends’ sites too.
5 I considered, but couldn’t be bothered, implementing an SVG CSS filter: to posterize my images down to 8-bit colour, for that real
“I’m on an old graphics card” feel! If anybody’s already implemented such a thing under a license that I can use, let me know and I’ll integrate it!
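(The core of such a filter would, I think, be an feComponentTransfer with “discrete” transfer functions, which snap each colour channel to a handful of levels – something like this untested sketch, inlined into the page and applied with img { filter: url(#posterize); }:

<svg xmlns="http://www.w3.org/2000/svg" width="0" height="0">
  <filter id="posterize">
    <!-- Each channel gets quantised to the nearest of these five values: -->
    <feComponentTransfer>
      <feFuncR type="discrete" tableValues="0 0.25 0.5 0.75 1"/>
      <feFuncG type="discrete" tableValues="0 0.25 0.5 0.75 1"/>
      <feFuncB type="discrete" tableValues="0 0.25 0.5 0.75 1"/>
    </feComponentTransfer>
  </filter>
</svg>

…but don’t take my word for it.)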
Okay, we’re gonna need a whole lot of caveats on the “this is 5,000” claim:
Engage pedantry mode
First, there’s a Ship of Theseus consideration. By “this blog”, I’m referring to what I feel is a continuation (with short
breaks) of my personal diary-style writing online from the original “Avatar Diary” on castle.onza.net in the 1990s via “Dan’s Pages” on avangel.com in the 2000s through the relaunch on scatmania.org in 2003 through migrating to danq.me in 2012. If you feel that a change of domain precludes continuation, you might
disagree with me. Although you’d be a fool to do so: clearly a blog can change its domain and still be the same blog, right? Back in 2018 I celebrated the 20th anniversary of my first blog post by revisiting how my blog had looked, felt, and changed over the decades, if
you’re looking for further reading.
Similarly, one might ask if retroactively republishing something that originally went out via a different medium “counts”2.
In late 1999 I ran “Cool Thing of the Day (to do at the University of Wales, Aberystwyth)” as a way of staying connected to my friends back in
Preston as we all went our separate ways to study. Initially sent out by email, I later maintained a web page with a log of the entries I’d sent out, but the address wasn’t
publicly-circulated. I consider this to be a continuation of the Avatar Diary before it and the predecessor to Dan’s Pages on avangel.com after it, but a pedant might argue that because
the content wasn’t born as a blog post, perhaps it’s invalid.
Pedants might also bring up the issue of contemporaneity. In 2004 a server fault resulted in the loss of a significant number of
blog posts (149 of them), of which only 85 have been fully-recovered. Some were resurrected from
backups as late as 2012, and some didn’t recover their lost images until later still – this one had some content recovered as late as 2017! If you consider the absence of a pre-2004 post until
2012 a sequence-breaker, that’s an issue. It’s theoretically possible, of course, that other old posts might be recovered and injected, and this post might turn out to be the 5,001st, 5,002nd,
or later post, in terms of chronological post-date. Who knows!
Then there’s the posts injected retroactively. I’ve written software that, since 2018, has ensured that
my geocaching logs get syndicated via my blog when I publish them to one of the other logging sites I use, and I retroactively imported all of my
previous logs. These never appeared on my blog when they were written: should they count? What about more egregious examples of necroposting, like this post dated long before I ever touched a keyboard? I’m counting them all.
I’m also counting other kinds of less-public content too. Did you know that I sometimes make posts that don’t appear on my front page,
and you have to subscribe e.g. by RSS to get them? They have web addresses – although search
engines are discouraged from indexing them – and people find them with or without subscribing. Maybe you should subscribe if you haven’t already?
Let’s take a look at some of those previous milestone posts:
In post 1,000 I announced that I was ready for 2005’s NaNoWriMo. I had a big ol’ argument in the comments with
Statto about the value of the exercise. It’s possible that I ultimately wrote more words arguing with him than I did on my writing project that
month.
Dave Winer kindly let me know about a proposed
standard for linking to OPML blogrolls. Given that I added a page
containing my blogroll last year, it was easy enough for me to add a tiny bit of code to the header to add support for automatic detection of my blogroll.
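The element in question looks something like this (reconstructed for illustration – point the href at wherever your OPML blogroll lives):

<link rel="blogroll" type="text/xml" title="Dan Q's blogroll" href="/blogroll.opml">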
Now all we need is some tools that can do such detection!
(You’ll note I’ve added a title attribute: as I discovered the other day, some browsers including ELinks will show all
<link>s of unknown rel="..." at the top of the page and I wanted this one to make sense!)
theunderground.blog‘s content, with the exception of its homepage, is delivered entirely through an XML Atom feed. Atom feed entries do require <title>s, of course, so that’s not the strongest counterexample!
This blog is available over several media other than the Web. For example, you can read this blog post:
We’ve looked at plain text, which as a format clearly does not have to have a title. Let’s go one step further and implement it. What we’d need is:
A webserver configured to deliver plain text files by preference, e.g. by adding directives like index index.txt; (for Nginx).5
An index page listing posts by date and URL. Most browsers won’t render these as “links”, so users will have to copy-paste
or re-type them; let’s keep them short,
Pages for each post at those URLs, presumably without any kind of “title” (just to prove a point), and
An RSS feed: usually I use RSS as shorthand for all feed
types, but this time I really do mean RSS and not e.g. Atom because RSS, strangely, doesn’t require that an <item> has a <title>!
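To prove that last point, here’s a perfectly-valid RSS item with no <title> whatsoever (RSS 2.0 requires only that an item carry at least one of <title> or <description>; the URL and date are invented for illustration):

<item>
  <description>A post with no title at all!</description>
  <link>https://example.com/1</link>
  <pubDate>Mon, 01 Jan 2024 00:00:00 +0000</pubDate>
</item>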
In the end I decided it’d benefit from being automated as sort-of a basic flat-file CMS, so I wrote it in PHP. All requests are routed by the webserver to the program, which determines whether they’re a request for the homepage, the RSS feed, or a valid individual post, and responds accordingly.
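If you fancied building something similar, the router needn’t be complicated. Here’s a sketch of the shape of such a thing (simplified, with an assumed file layout and example.com standing in for the real domain – not my exact code):

<?php
// Posts live as plain .txt files in posts/; the webserver routes every request here.
$path = trim( parse_url( $_SERVER['REQUEST_URI'], PHP_URL_PATH ), '/' );

if ( $path === '' ) {
	// Homepage: a plain-text list of post URLs, one per line.
	header( 'Content-Type: text/plain; charset=utf-8' );
	foreach ( glob( 'posts/*.txt' ) as $file ) {
		echo 'https://example.com/' . basename( $file, '.txt' ) . "\n";
	}
} elseif ( $path === 'feed.rss' ) {
	// RSS feed: note that the <item>s carry only <description>s, no <title>s.
	header( 'Content-Type: application/rss+xml; charset=utf-8' );
	echo '<rss version="2.0"><channel><title>example</title>' .
	     '<link>https://example.com/</link><description>Untitled posts</description>';
	foreach ( glob( 'posts/*.txt' ) as $file ) {
		echo '<item><link>https://example.com/' . basename( $file, '.txt' ) . '</link>' .
		     '<description>' . htmlspecialchars( file_get_contents( $file ) ) . '</description></item>';
	}
	echo '</channel></rss>';
} elseif ( preg_match( '/^[a-z0-9-]+$/', $path ) && is_file( "posts/$path.txt" ) ) {
	// A valid individual post, served as-is:
	header( 'Content-Type: text/plain; charset=utf-8' );
	readfile( "posts/$path.txt" );
} else {
	http_response_code( 404 );
	header( 'Content-Type: text/plain; charset=utf-8' );
	echo "Not found\n";
}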
It annoys me that feed
discovery doesn’t work nicely when using a Link: header, at least not in any reader I tried. But apart from that, it seems pretty solid, despite its limitations. Is this,
perhaps, an argument for my .well-known/feeds proposal?