If the most useful thing I achieve this Bank Holiday Monday will have been to make it easier to post short geotagged notes from my mobile to my blog (and Mastodon), it will have been a success.
This has been a test post. Feel free to ignore it.
Dan Q
In anticipation of WWW Day on 1 August, some work colleagues and I were sharing pictures of the first (or early) websites we worked on. I was pleased to be able to pull out a screenshot of how my blog looked back in 1999!
Because I’m such a digital preservationist, many of those ancient posts are still available on my blog, so I also shared a photo of me browsing the same content on my blog as it is today, side-by-side with that 25+-year-old screenshot.1
Update: This photo eventually appeared on a LinkedIn post on Automattic’s profile.
This inspired me to make a toggleable “alternate theme” for my blog: 1999 Mode.
Switch to it, and you’ll see a modern reinterpretation of my late-1990s blog theme, featuring:
img { image-rendering: crisp-edges; }
to try to compensate for modern browsers’ capability for subpixel rendering when rescaling images: let them eat pixels!5
I’ve added 1999 Mode to my April Fools gags so, like this year, if you happen to visit my site on or around 1 April, there’s a chance you’ll see it in 1999 Mode anyway. What fun!
I think there’s a possible future blog post about Web design challenges of the 1990s. Things like: what if the user agent doesn’t support images? What if it supports GIFs, but not animated ones (some browsers would just show the first frame, so you’d want to choose your first frame appropriately)? How do I ensure that people see the right content if they skip my frameset? Which browser-specific features can I safely use, and where do I need a fallback6? Will this work well on all resolutions down to 640×480 (minus browser chrome)? And so on.
Any interest in that particular rabbit hole of digital history?
1 Some of the addresses have changed, but from Summer 2003 onwards I’ve had a solid chain of redirects in place to try to keep content available via whatever address it was at. Because Cool URIs Don’t Change. This occasionally turns out to be useful!
2 Actually, the entire theme is just a CSS change, so no tables are added. But I’ve tried to make it look like I’m using tables for layout, because that (and spacer GIFs) were all we had back in the day.
3 Obviously the title saying “Dan Q” is modern, because that wasn’t even my name back then, but this is more a reimagining of how my site would have looked if I were transported back to 1999 and made to do it all again.
4 I was slightly obsessed for a couple of years in the late 90s with flaming text on black marble backgrounds. The hit counter in my screenshot above – with numbers on fire – was one I made, not a third-party one; and because mine was the only one of my friends’ hosts that would let me run CGIs, my Perl script powered the hit counters for most of my friends’ sites too.
5 I considered, but couldn’t be bothered, implementing an SVG CSS filter to posterize my images down to 8-bit colour, for that real “I’m on an old graphics card” feel! If anybody’s already implemented such a thing under a license that I can use, let me know and I’ll integrate it!
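For the curious, here’s a minimal sketch of the sort of thing I mean – quantising each colour channel with SVG’s feComponentTransfer “discrete” transfer functions, applied to images via CSS. It’s purely an illustration, not something I’ve actually built into the theme:

```html
<!-- Illustrative only: an SVG filter that posterizes each channel to a few levels,
     roughly approximating a 3-3-2 bit ("8-bit colour") palette. -->
<svg xmlns="http://www.w3.org/2000/svg" width="0" height="0" style="position:absolute">
  <filter id="posterize-8bit">
    <feComponentTransfer>
      <!-- 8 levels each for red and green, 4 for blue -->
      <feFuncR type="discrete" tableValues="0 .143 .286 .429 .571 .714 .857 1"/>
      <feFuncG type="discrete" tableValues="0 .143 .286 .429 .571 .714 .857 1"/>
      <feFuncB type="discrete" tableValues="0 .333 .667 1"/>
    </feComponentTransfer>
  </filter>
</svg>
<style>
  img { filter: url(#posterize-8bit); } /* apply it to every image on the page */
</style>
```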
6 And what about those times where using a fallback might make things worse, like how Netscape 7 made the <blink><marquee> combination unbearable!
This is a repost promoting content originally published elsewhere. See more things Dan's reposted.
Enbies and gentlefolk of the class of ‘24:
Write websites.
If I could offer you only one tip for the future, coding would be it. The long term benefits of coding websites remains unproved by scientists, however the rest of my advice has a basis in the joy of the indie web community’s experiences. I will dispense this advice now:
Enjoy the power and beauty of PHP; or never mind. You will not understand the power and beauty of PHP until your stack is completely jammed. But trust me, in 20 years you’ll look back at your old sites and recall in a way you can’t grasp now, how much possibility lay before you and how simple and fast they were. JS is not as blazingly fast as you imagine.
Don’t worry about the scaling; or worry, but know that premature scalability is as useful as chewing bubble gum if your project starts cosy and small. The real troubles on the web are apt to be things that never crossed your worried mind; if your project grows, scale it up on some idle Tuesday.
Code one thing every day that amuses you.
…
I can’t say I loved Baz Luhrmann’s Everybody’s Free To Wear Sunscreen. I’m not sure it’s possible for anybody who lived through it being played to death in the late 1990s; a period of history when a popular song was basically inescapable. Also, it got parodied a lot. I must’ve seen a couple of dozen different parodies of varying quality in the early 2000s.
But it’s been long enough that I was, I guess, ready for one. And I couldn’t conceive of a better topic.
Y’see: the very message of the value of personal websites is, like Sunscreen, a nostalgic one. When I try to sell people on the benefits of a personal digital garden or blog, I tend to begin by pointing out that the best time to set up your own website is… like 20+ years ago.
But… the second-best time to start a personal website is right now. With cheap and free static hosting all over the place (and more-dynamic options not much-more expensive) and domain names still as variably-priced as they ever were, the biggest impediment is the learning curve… which is also the fun part! Siloed social media is either eating its own tail or else fighting to adapt to once again be part of a more-open Web, and there’s nothing that says “I’m part of the open Web” like owning your own online identity, carving out your own space, and expressing yourself there however you damn well like.
As always, this is a drum I’ll probably beat until I die, so feel free to get in touch if you want some help getting set up on the Web.
The ever-excellent Kev Quirk in 2020 came up with this challenge: write a blog post on each of 100 consecutive days. He called it #100DaysToOffload, in nominal reference to the “100 days of code” challenge. I was reflecting upon this as I reach this, my 36th consecutive day of blogging and my longest ever “daily streak” (itself a spin-off of my attempt at Bloganuary this year), and my 48th post of the year so far.
Might I meet that challenge? Maybe. But it turns out it’s easier than I thought because Kev revised the rules to require only 100 posts in a calendar year (or any other 365-day period, but I’m not going to start thinking about the maths of that).
That’s not only much more-achievable… I’ve probably already achieved it! Let’s knock out some SQL to check how many posts I made each year:
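It’d be something along these lines, assuming a stock WordPress wp_posts table (the exact query I ran isn’t reproduced here; this is just the gist):

```sql
-- Sketch only: count published posts per calendar year. My real query also has to
-- decide which post types (articles, notes, checkins, reposts) should be counted,
-- which is exactly the "what counts as a post?" question discussed below.
SELECT YEAR(post_date) AS year,
       COUNT(*)        AS posts
FROM   wp_posts
WHERE  post_status = 'publish'
GROUP  BY YEAR(post_date)
ORDER  BY year;
```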
A big question in some years is what counts as a post. Kev’s definition is quite liberal and includes basically-everything, but I wonder if mine shouldn’t perhaps be stricter. For example:
Year | Posts | Success? | Notes |
---|---|---|---|
1998 | 7 | ❌ No | Some posts are lost from 1998/1999. If they were recovered I might have made 100 posts in 1999, but probably not in 1998 as I only started blogging on 27 September 1998. |
1999 | 66 | ❌ No | |
2000 | 2 | ❌ No | |
2001 | 11 | ❌ No | |
2002 | 5 | ❌ No | |
2003 | 189 | 🏆 Yes | Achieved 1 September, with a post about an article on The Register about timewasting. Or, if we allow reposts, three days earlier with a repost about Claire's car being claimed by the sea. |
2004 | 374 | 🏆 Yes | An early win on 20 April, with a made-up Chez Geek card. Or if we allow reposts, two days earlier with thoughts on a confusing pro-life (???) website. |
2005 | 381 | 🏆 Yes | In a highly-productive year of blogging, achieved on 7 April with a post about enjoying curry and public information films with friends. If we allow bookmarks (I was highly-active on del.icio.us at the time!), achieved even earlier on 18 February with some links to curious websites. |
2006 | 206 | 🏆 Yes | On 21 July, I shared a personality test (which was actually my effort to repeat an experiment in using Barnum-Forer statements) - I didn't initially give away that I was the author of the "test". Non-pedants will agree I achieved the goal earlier, on 19 June, with my thoughts on a programming language for a hypothetical infinitely-fast computer. |
2007 | 166 | 🏆 Yes | Achieved on 2 July with thoughts on films I'd watched and board games I'd played recently. Or arguably 12 days earlier with Claire's birthday trip to Manchester. |
2008 | 86 | ❌ No | |
2009 | 79 | ❌ No | |
2010 | 159 (84 for pedants) | ✅ Yes* | A heartfelt post about saying goodbye to Aberystwyth as I moved to Oxford on 16 June was my 100th of the year. Pedants might argue that this year shouldn't count, but so long as you're willing to count checkins (and you should) then it would... and my qualifying post would have come only a couple of days later, with a post about the Headington Shark, which I had just moved-in near to. |
2011 | 177 | 🏆 Yes | Reached the goal on 28 October when I wrote about mild successes in my enquiries with the Office of National Statistics about ensuring that information about polyamorous households was accurately recorded. Or, if you accept all my geocache logs as posts too (and again: you should), earlier on 9 June with a visual gag about REM lyrics. |
2012 | 129 (87 for pedants) | ✅ Yes* | My 100th post of the year came on 28 August when I wrote about launching a bus named after my recently-deceased father. You have to be willing to accept both checkins and reposts as posts to allow this year to count. |
2013 | 138 (59 for pedants) | 😓 Probably not | I'm not convinced this low-blogging year should count: a clear majority of the posts were geocaching logs, and they weren't always even that verbose (consider this candidate for 100th post of 2013, from 1 October). |
2014 | 335 (22 for pedants) | 🙁 Not really | Another geocache-log-heavy, conventional-blogpost-light year that I'm not convinced should count, even if the obvious candidate for 100th post would be 18 May's cool article about geocaching like Batman! |
2015 | 205 (18 for pedants) | 🙁 Not really | Still no, for the same reasons as above. |
2016 | 163 (37 for pedants) | 🙁 Not really | |
2017 | 301 (42 for pedants) | 🙁 Not really | |
2018 | 547 (87 for pedants) | ✅ Yes* | I maintain that checkins should count, even when they're PESOS'd from geocaching sites, so long as they don't make up a majority of the qualifying posts in a year. In which case this year should qualify, with the 100th post being my visit to this well-hidden London pub while on my way to a conference. |
2019 | 387 (86 for pedants) | ✅ Yes* | Similarly this year, when on 15 August I visited a GNSS calibration point in the San Francisco Bay Area... on the way to another conference! |
2020 | 221 (64 for pedants) | ✅ Yes* | Barely made it this year (ignoring reposts, of which I did lots), with my 21 December article about a little-known (and under-supported) way to inject CSS using HTTP headers, which I later used to make a web page for which View Source showed nothing. |
2021 | 190 (57 for pedants) | ✅ Yes* | A cycle to a nearby geocache was the checkin that made the 100th post of this year, on 27 August. |
2022 | 168 (55 for pedants) | ✅ Yes* | My efforts to check up on one of my own geocaches on 7 September scored the qualifying spot. |
2023 | 164 (86 for pedants) | ✅ Yes* | My blogging ramped up again this year, and on 24 August I shared a motivational poster with a funny twist, plus a pun at the intersection between my sexuality and my preferred mode of transport. |
2024 | 313 | 🏆 Yes | Writing at full-tilt, my hundredth post came when I found a geocache near Regent's Canal, but pedants who disregard reposts and checkins might instead count my excitement at the Ladybird Web browser as the record-breaker. This year also saw me write my 5,000th post on this blog! Wowza! |
Total | 5,174 | | Total count of all the posts. Doesn't add up? Not all posts feature in one of the years above! |
* Pedants might claim this year was not a success for the reasons described above. Make your own mind up.
In any case, I’d argue that I clearly achieved the revised version of the challenge in certainly six, probably fourteen, arguably (depending on how you count posts) as many as nineteen different years since I started blogging in 1998. My least-controversial claims would be:
Given all these unanswered questions, I’m not going to just go ahead and raise a PR against the Hall of Fame! Instead, I’ll leave it to Kev to decide whether I’m (a) eligible to claim a 14-time award, (b) merely eligible for a 4-time award for the years following the challenge starting, or (c) ineligible to claim success until I intentionally post 100 times in a year (in, at current rates, another two months…). Over to you, Kev…
Update: Kev’s agreed that I can claim the most-recent four of them, so I raised a PR.
This post is part of my attempt at Bloganuary 2024. Today’s prompt is:
In what ways do you communicate online?
What a curious question! For me, it’s perhaps best divided into public and private communication, for which I use very different media:
I’ve written before about how this site – my blog – is the centre of my digital “ecosystem”. And while the technical details may have changed since that post was published, the fundamentals have not: everything about my public communication revolves around this, right here.
For example:
For private communication online, I perhaps mostly use the following (in approximate order of volume):
That’s a very different set of tech stacks than I use in my “public” communication!
1 My thinking is, at least in part: I’ve seen platforms come and go, and my blog’s outlived them. I’ve seen platforms change their policies or technology in ways that undermine the content I put on them, but the stuff on my blog remains under my control and I can “fix” it if I wish. Owning your data is awesome, although I perhaps do it to a more-extreme extent than many.
2 I used to joke that I syndicate content to e.g. Facebook to support readers who haven’t learned yet to use a feed reader. I used to, and I still do, too.
3 A great thing about having a “personal” Slack installation is that you can hook up your own integrations and bots to e.g. remind you to bring the milk in.
4 I’ve been experimenting with Texts to centralise several of my other platforms; I’m not convinced by it yet, but I love the thinking! Long ago, I used to love using Pidgin for simultaneous access to IRC, ICQ, MSN Messenger, Google Talk, Yahoo! Messenger and all that jazz, so I fully approve of the concept.
5 Okay, not actually old-fashioned, because I’m not suggesting you use UUCP to send mail to protonmail!danq!dan or DECnet to deliver to danq.me::dan or something!
6 Most of the metadata including sender, recipient, and in most cases even subject is not encrypted.
Tracy Durnell’s post about blogrolls really spoke to me. Like her, I used to think of a blogroll as a list of people you know personally (who happen to blog)1, but the number of bloggers among my immediate in-person circle of friends has shrunk from several dozen to just a handful, and I dropped my blogroll in around 2008.
But my connection to a wider circle has grown, and like Tracy I enjoy the “hardly strangers” connection I feel with the people I follow online. She writes:
While social media emphasizes the show-off stuff — the vacation in Puerto Vallarta, the full kitchen remodel, the night out on the town — on blogs it still seems that people are sharing more than signalling. These small pleasures seem to be offered in a spirit of generosity — this is too beautiful not to share.
…
Although I may never interact with all the folks whose blogs I follow, reading the same blogger for a long time does build a (one-sided) connection. I may not know you, author, but I am rooting for you. It’s a different modality of relationship than we may be used to in person, but it’s real: a parasocial relationship simmering with the potential for deeper connection, but also satisfying as it exists.
My first bloggy pen pal, Colin Walker, who I started exchanging emails with earlier this month, followed-up on this with an observation that really gets to the heart of the issue (speaking as somebody who’s long said that my blog’s intended audience is, first and foremost, me):
At its core, blogging is a solitary activity with many (if not most) authors claiming that their blog is for them – myself included. Yet, the implication of audience cannot be ignored. Indeed, the more an author embeds themself in the loose community of blogs, by reading and linking to others, the more that implication becomes reality even if not actively pursued via comments or email.
To that end: I’ve started publishing my blogroll again! Follow that link and you’ll see an only-lightly-curated list of all the people (plus some non-personal blogs, vlogs, and webcomics) I follow (that have updated their feeds within the last year2). Naturally, there’s an OPML version too, and I’ve open-sourced the code I used to generate it (although I can’t imagine anybody’s situation is enough like mine for it to be useful).
The page is a little flaky and there’s things I’d like to do to improve it, but I’d rather publish a basic version now and then come back to it with my gardening gloves on another time to improve it.
Maybe my blogroll has some folks on that you might recognise? Or else: maybe you’re only a single random-click away from somebody new you never heard of before!
1 Possibly marked up with XFN to indicate how you’re connected to one another; I’ve always had a soft spot for XFN.
2 I often retain subscriptions to dormant feeds and it sometimes pays-off, e.g. when I recently celebrated Octopuns’ return after a 9½-year hiatus!
This is a repost promoting content originally published elsewhere. See more things Dan's reposted.
Automattic has acquired the ActivityPub plugin for WordPress from German developer Matthias Pfefferle, who will be joining the company to continue improving support for federated platforms. Pfefferle, who is also the author of the Webmention plugin, said his new role is to see how Automattic’s products can benefit from open protocols like ActivityPub.
…
This is so exciting I might burst. Want to know why?
Susan A. Kitchens expressed concern that this could increase the level of ActivityPub spam out there (which right now is very low). I worry about that too. But I’m still optimistic that we can make something awesome off the back of this acquisition and keep the interpersonal Web federated, the way it ought to be.
Terence describes a problem that I’ve wrestled with myself. If somebody comments directly on my blog using the form at the bottom of a post, that’s a pretty strong indicator of them giving their consent for their comment to be published at the bottom of that post (at my discretion). If somebody publicly replies somewhere my post is syndicated, that’s less-obvious, but still pretty clear. If somebody merely mentions my post publicly, writing their own post and linking to mine… that’s a real fuzzy area.
I take a minimal approach; only capturing their full content if it’s short and otherwise trying to extract a snippet that contains the bit that mentioned my content, and I think that works great. But Terence points out an important follow-up: what if the commenter deletes that content?
My approach so far has always been a reactive one – the second in Terence’s list – and I think it’s a morally-acceptable stance for a personal blogger. But I’m not sure it scales. I find myself asking: what if a news outlet did this, taking my self-published feedback to their story and publishing it on their site, even if I later amended, retracted, or deleted it on my own? If somebody’s making money out of my content, that feels different: I’ve always been clear that what I write on my blog is permissively-licensed, but that permissiveness is based on the prohibition of commercial use of my content.
Perhaps down the line this can be solved technologically: something machine-readable akin to the <link rel="license" ...> tag could state an author’s preference for how their content is syndicated by third parties they’ve mentioned, answering questions like:
Right now, the relevant technologies are not well-established enough to even begin this kind of work, but if a modern interconnected federated web of personal websites takes off, it’s the kind of question we might one day have to answer.
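Purely to make the shape of the idea concrete – none of these rel values or attributes exist anywhere, and this isn’t a proposal – it might one day look something like:

```html
<!-- Entirely hypothetical markup: a machine-readable statement of how third parties
     may syndicate or quote this page's content, and what to do if it's later deleted. -->
<link rel="syndication-policy"
      data-quote="snippet-only"
      data-on-delete="remove-copies"
      data-commercial-use="forbidden">
```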
For now my gut feeling is that option #2 (reactive moderation of syndicated comments) is ethically-sufficient for personal websites. But I’ll be watching the feedback Terence (who probably gets many more readers than I) receives in case my gut doesn’t represent the majority!
Just in time for Robin Sloan to give up on Spring ’83, earlier this month I finally got around to launching STS-6 (named for the first mission of the Space Shuttle Challenger in Spring 1983), my experimental Spring ’83 server. It’s been a busy year; I had other things to do. But you might have guessed that something like this had been under my belt when I open-sourced a key generator for the protocol the other day.
If you’ve not played with Spring ’83, this post isn’t going to make much sense to you. Sorry.
My server is, as far as I can tell, very different from any others in a few key ways:
- Board content can include references like {{board:123}}, which are automatically converted to addresses referencing the public key of the “current” keypair for that board. This separates the concept of a board and its content template from that board’s keypairs, making it easier to link to a board. To put it another way, STS-6 links are self-healing on the server-side (for local boards).
- It also uses <link rel="next"> connections to keep them current.
I’m sure that there are those who would see this as automating something that was beautiful because it was handcrafted; I don’t know whether or not I agree, but had Spring ’83 taken off in a bigger way, it would always only have been a matter of time before somebody tried my approach.
From a design perspective, I enjoyed optimising an SVG image of my header so it could meaningfully fit into the board. It’s pretty, and it’s tolerably lightweight.
If you want to see my server in action, patch this into your favourite Spring ’83 client:
https://s83.danq.dev/10c3ff2e8336307b0ac7673b34737b242b80e8aa63ce4ccba182469ea83e0623
Without Robin’s active participation, I feel that Spring ’83 is probably coming to a dead end. It’s been a lot of fun to play with and I’d love to see what ideas the experience of it goes on to inspire next, but in its current form it’s one of those things that’s an interesting toy, but not something that’ll make serious waves.
In his last lab essay Robin already identified many of the key issues with the system (too complicated, no interpersonal-mentions, the challenge of keys-as-identifiers, etc.) and while they’re all solvable without breaking the underlying mechanisms (mentions might be handled by Webmention, perhaps, etc.), I understand the urge to take what was learned from this experiment and use it to help inform the decisions of the next one. Just as John Postel’s Quote of the Day protocol doesn’t see much use any more (although maybe if my finger server could support QotD?) but went on to inspire the direction of many subsequent “call-and-response” protocols, including HTTP, it’s okay if Spring ’83 disappears into obscurity, so long as we can learn what it did well and build upon that.
Meanwhile: if you’re looking for a hot new “like the web but lighter” protocol, you should probably check out Gemini. (Incidentally, you can find me at gemini://danq.me, but that’s something I’ll write about another day…)
In the light of the so-called “Twitter migration”, I’ve spent a lot of the last week helping people new to Mastodon/the Fediverse in general to understand it. Or at least, to understand how it’s different from Twitter.1
If you’re among those jumping ship, by the way, can I recommend that you do two things:
The experience has filled me with feelings, for which I really appreciated that Hugh Rundle found such great words to share, comparing the surge of new users to September(ish) 1993:
…
The tools, protocols and culture of the fediverse were built by trans and queer feminists. Those people had already started to feel sidelined from their own project when people like me started turning up a few years ago. This isn’t the first time fediverse users have had to deal with a significant state change and feeling of loss. Nevertheless, the basic principles have mostly held up to now: the culture and technical systems were deliberately designed on principles of consent, agency, and community safety.
…
If the people who built the fediverse generally sought to protect users, corporate platforms like Twitter seek to control their users… [Academics and advertisers] can claim that legally Twitter has the right to do whatever it wants with this data, and ethically users gave permission for this data to be used in any way when they ticked “I agree” to the Terms of Service.
…
This attitude has moved with the new influx. Loudly proclaiming that content warnings are censorship, that functionality that has been deliberately unimplemented due to community safety concerns are “missing” or “broken”, and that volunteer-run servers maintaining control over who they allow and under what conditions are “exclusionary”. No consideration is given to why the norms and affordances of Mastodon and the broader fediverse exist, and whether the actor they are designed to protect against might be you.
…
I’d highly recommend you read the whole thing because it’s excellent.
I genuinely believe that the fediverse is among our best bets for making a break from the silos of the corporate Web, and to do that it has to scale – it’s only the speed at which it’s being asked to do so that’s problematic.
Aside from what I’m already doing – trying to tutor (tootor?) new fediversians about how to integrate in an appropriate and respectful manner and doing a little to support the expansion of the software that makes it tick… I wonder what more I could/should be doing.
Would my effort be best spent running a server (one not-just-for-me, I mean: abnib.social, anyone?), or should I use that time and money to support existing instances directly? Should I brush up on the ActivityPub spec so I can be a more-useful developer, or am I better-placed to focus on tending my own digital garden first? Or maybe I’m looking at it all wrong and I should be trying to dissuade people from piling-on to a system that might well not be right for them (nor they for it!)?
I don’t know the answers to these questions, but I’m hoping to work them out soon.
It only occurred to me after the fact that I should mention that you can find me at @dan@danq.me.
1 Important: I’m no expert. I’ve been doing fediverse things for about 3 years but I’m relatively quiet on Mastodon. Also, I’ve never really understood or gotten along with Twitter, so I’m even less an expert on that. Don’t assume that I’m an authority on anything at all, and especially not social media.
I’ve been spending a while running on reduced brain capacity lately so, to ease myself back into thinking like a programmer, I upgraded my preferred feed reader FreshRSS to version 1.20.0 – which was released a couple of weeks ago – and tried out what I believe is its killer new feature: HTML + XPath scraping.
I’ve been using RSS1 for about 20 years and I love it. It feels great to be able to curate my updates based on “what I care about”, and not on “what some social network thinks I should care about”, to keep things to read later, to prioritise effectively based on my own categorisation, to consume content offline and have my to-read list synchronise later, etc.
RSS never went away, of course (what do you think a podcast is?), but it got steamrollered out of the public eye by big companies who make their money out of keeping your eyes on their platforms and off the open Web. But it feels like it’s slowly coming back: even Substack – whose entire thing is that an email client is more-convenient than a feed reader for most people – launched an RSS reader this week!
I love RSS so much that I routinely retrofit other people’s websites with feeds just so I can subscribe to them: I even published the tool I use to do so! Whether filtering sports headlines out of BBC News, turning retro webcomics into “reading lists” so I can track my progress, or just working around sites that really should have feeds but refuse to, I just love sidestepping these “missing feeds”. My friend Beverley has a blog without any kind of feed, so I added one so I could subscribe to it. Magic.
But with FreshRSS 1.20.0, I no longer have to maintain my own tool to get this brilliant functionality, and I’m overjoyed. Let’s look at how it works by re-subscribing to Beverley’s blog but without a middleware tool.
In the latest version of FreshRSS, when you add a new feed to your reader, a new section “Type of feed source” is available. Unfold it, and you can change from the default (“RSS / Atom”) to the new option “HTML + XPath (Web scraping)”. Put a human-readable page address rather than a feed address into the “Feed URL” field and fill these fields to tell FreshRSS how to parse the page to get the content you want. Note that it doesn’t matter if the web page isn’t valid XML (e.g. missing closing tags) because it’s going to get run through PHP’s DOMDocument anyway which will “correct” for some really sloppy code if needed.
You’ll need to use XPath to express how to find a “feed item” on the page. Here are the rules I used for https://webdevbev.co.uk/blog.html (many of these fields were optional – I didn’t have to do this much work):
- Feed title: //h1 – this is the first <h1> on the page, and it can be considered the “title” of the feed.
- Finding items: //li[@class="blog__post-preview"] – each post lives in an <li class="blog__post-preview">.
- Item title: descendant::h2 – each post contains an <h2> which is the post title. The descendant:: selector scopes the search to each post as found above.
- Item content: descendant::p[3] – the content I want is the third <p> within each <li>, which we can select like this.
- Item link: descendant::h2/a/@href – this takes the href attribute of the post title’s <h2><a href="...">, rather than its contents.
- Item thumbnail: descendant::img[@class="blog__image--preview"]/@src – the src attribute of the preview <img src="...">.
- Item author: "Beverley Newing" – a literal string, because every post has the same author.
- Item date: substring-after(descendant::p[@class="blog__date-posted"], "Date posted: ") – the displayed date is prefixed with the words “Date posted: ”, so I use the substring-after function to strip this. The result gets passed to PHP’s strtotime(), which is pretty tolerant of different date formats (although not of the words “Date posted:”, it turns out!).
I hope that this is just the beginning for this new killer feature in FreshRSS: there’s so much more it can be and do. But for now, I’m still mighty impressed that I can begin to phase-out my use of my relatively resource-intensive feed-building middleware and use my feed reader to do more and more of the heavy lifting for which I love it so much.
I also love that this functionally adds h-feed support in by the back door. I’d still prefer there to be a “h-feed” option in the “Type of feed source” drop-down, but at least I can add such support manually, now!
Via Jeremy Keith I today discovered Jim Nielsen‘s suggestion for a website’s /.well-known/links to be a place where it can host a JSON-formatted list of all of its outgoing links.
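If you’ve not read Jim’s post, the gist is a flat, machine-readable dump of everywhere a site links out to – something shaped vaguely like this (the field names here are illustrative guesses, not necessarily Jim’s exact schema nor what my implementation emits):

```json
[
  { "origin": "https://danq.me/posts/an-example-post/",  "target": "https://example.com/" },
  { "origin": "https://danq.me/posts/another-example/",  "target": "https://example.org/some-page" }
]
```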
That’s a really useful thing to have in this new age of the web, where Referer: headers are no-longer commonly passed cross-domain and Google Search no longer provides the link: operator. If you want to know if I’ve ever linked to your site, it’s a bit of a drag to find out.
So, obviously, I’ve written an implementation for WordPress. It’s really basic right now, but the source code can be found here if you want it. Install it as a plugin and run wp outbound-links to kick it off. It’s fast: it takes 3-5 seconds to parse the entirety of danq.me, and I’ve got somewhere in the region of 5,000 posts to parse.
You can see the results at https://danq.me/.well-known/links – if you’ve ever wondered “has Dan ever linked to my site?”, now you can find the answer.
If this could be useful to you, let’s collaborate on making this into an actually-useful plugin! Otherwise it’ll just languish “as-is”, which is good enough for my purposes.
As you might know if you were paying close attention in Summer 2019, I run a “URL shortener” for my personal use. You may be familiar with public URL shorteners like TinyURL and Bit.ly: my personal URL shortener is basically the same thing, except that only I am able to make short-links with it. Compared to public ones, this means I’ve got a larger corpus of especially-short (e.g. 2/3 letter) codes available for my personal use. It also means that I’m not dependent on the goodwill of a free siloed service and I can add exactly the features I want to it.
For the last nine years my link shortener has been S.2, a tool I threw together in Ruby. It stores URLs in a sequentially-numbered database table and then uses the Base62-encoding of the primary key as the “code” part of the short URL. Aside from the fact that when I create a short link it shows me a QR code so I can easily “push” a page to my phone, it doesn’t really have any “special” features. It replaced S.1, from which it primarily differed by putting the code at the end of the URL rather than as part of the domain name, e.g. s.danq.me/a0 rather than a0.s.danq.me: I made the switch because S.1 made HTTPS a real pain as well as only supporting Base36 (owing to the case-insensitivity of domain names).
But S.2’s gotten a little long in the tooth and as I’ve gotten busier/lazier, I’ve leant into using or adapting open source tools more-often than writing my own from scratch. So this week I switched my URL shortener from S.2 to YOURLS.
One of the things that attracted me to YOURLS was that it had a ready-to-go Docker image. I’m not the biggest fan of Docker in general, but I do love the convenience of being able to deploy applications super-quickly to my household NAS. This makes installing and maintaining my personal URL shortener much easier than it used to be (and it was pretty easy before!).
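For anybody curious, deployment is roughly a one-liner along these lines – the values are illustrative, and you should check the environment variable names against the image’s own documentation rather than trusting my memory of them:

```bash
# Illustrative only: run the official YOURLS image, pointed at an existing MariaDB/MySQL
# instance (variable names from memory of the image's docs – verify before relying on them).
docker run -d --name yourls -p 8080:80 \
  -e YOURLS_DB_HOST=mariadb \
  -e YOURLS_DB_USER=yourls \
  -e YOURLS_DB_PASS=changeme \
  -e YOURLS_SITE=https://danq.link \
  -e YOURLS_USER=dan \
  -e YOURLS_PASS=changeme \
  yourls
```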
Another thing I liked about YOURLS is that it, like S.2, uses Base62 encoding. This meant that migrating my links from S.2 into YOURLS could be done with a simple cross-database INSERT... SELECT statement:
INSERT INTO yourls.yourls_url(keyword, url, title, `timestamp`, clicks) SELECT shortcode, url, title, created_at, 0 FROM danq_short.links
But do you know what’s a bigger deal for my lifestack than my URL shortener? My RSS reader! I’ve written about it a lot, but I use RSS for just about everything and my feed reader is my first, last, and sometimes only point of contact with the Web! I’m so hooked-in to my RSS ecosystem that I’ll use my own middleware to add feeds to sites that don’t have them, or for which I’m not happy with the feed they provide, e.g. stripping sports out of BBC News, subscribing to webcomics that don’t provide such an option (sometimes accidentally hacking into sites on the way), and generating “complete” archives of series’ of posts so I can use my reader to track my progress.
One of S.1/S.2’s features was that it exposed an RSS feed at a secret URL for my reader to ingest. This was great, because it meant I could “push” something to my RSS reader to read or repost to my blog later. YOURLS doesn’t have such a feature, and I couldn’t find anything in the (extensive) list of plugins that would do it for me. I needed to write my own.
I could have written a YOURLS plugin. Or I could have written a stack of code in Ruby, PHP, Javascript or some other language to bridge these systems. But as I switched over my shortlink subdomain s.danq.me to its new home at danq.link, another idea came to me. I have direct database access to YOURLS (and the table schema is super simple) and the command-line MariaDB client can output XML… could I simply write an XML Transformation to convert database output directly into a valid RSS feed? Let’s give it a go!
I wrote a script like this and put it in my crontab:
```bash
mysql --xml yourls -e \
  "SELECT keyword, url, title, DATE_FORMAT(timestamp, '%a, %d %b %Y %T') AS pubdate FROM yourls_url ORDER BY timestamp DESC LIMIT 30" \
  | xsltproc template.xslt - \
  | xmllint --format - \
  > output.rss.xml
```
The first part of that command connects to the yourls database, sets the output format to XML, and executes an SQL statement to extract the most-recent 30 shortlinks. The DATE_FORMAT function is used to mould the datetime into something approximating the RFC-822 standard for datetimes as required by RSS. The output produced looks something like this:
<?xml version="1.0"?> <resultset statement="SELECT keyword, url, title, timestamp FROM yourls_url ORDER BY timestamp DESC LIMIT 30" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <row> <field name="keyword">VV</field> <field name="url">https://webdevbev.co.uk/blog/06-2021/perfect-is-the-enemy-of-good.html</field> <field name="title"> Perfect is the enemy of good || Web Dev Bev</field> <field name="timestamp">2021-09-26 17:38:32</field> </row> <row> <field name="keyword">VU</field> <field name="url">https://webdevlaw.uk/2021/01/30/why-generation-x-will-save-the-web/</field> <field name="title">Why Generation X will save the web Hi, Im Heather Burns</field> <field name="timestamp">2021-09-26 17:38:26</field> </row> <!-- ... etc. ... --> </resultset>
We don’t see this, though. It’s piped directly into the second part of the command, which uses xsltproc to apply an XSLT to it. I was concerned that my XSLT experience would be super rusty as I haven’t actually written any since working for my former employer SmartData back in around 2005! Back then, my coworker Alex and I spent many hours doing XML backflips to implement a system that converted complex data outputs into PDF files via an XSL-FO intermediary.
I needn’t have worried, though. Firstly: it turns out I remember a lot more than I thought from that project a decade and a half ago! But secondly, this conversion from MySQL/MariaDB XML output to RSS turned out to be pretty painless. Here’s the template.xslt I ended up making:
<?xml version="1.0"?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> <xsl:template match="resultset"> <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"> <channel> <title>Dan's Short Links</title> <description>Links shortened by Dan using danq.link</description> <link> [ MY RSS FEED URL ] </link> <atom:link href=" [ MY RSS FEED URL ] " rel="self" type="application/rss+xml" /> <lastBuildDate><xsl:value-of select="row/field[@name='pubdate']" /> UTC</lastBuildDate> <pubDate><xsl:value-of select="row/field[@name='pubdate']" /> UTC</pubDate> <ttl>1800</ttl> <xsl:for-each select="row"> <item> <title><xsl:value-of select="field[@name='title']" /></title> <link><xsl:value-of select="field[@name='url']" /></link> <guid>https://danq.link/<xsl:value-of select="field[@name='keyword']" /></guid> <pubDate><xsl:value-of select="field[@name='pubdate']" /> UTC</pubDate> </item> </xsl:for-each> </channel> </rss> </xsl:template> </xsl:stylesheet>
That uses the first (i.e. most-recent) shortlink’s timestamp as the feed’s pubDate, which makes sense: unless you’re going back and modifying links, there are no more-recent changes than the creation date of the most-recent shortlink. Then it loops through the returned rows and creates an <item> for each; simple!
The final step in my command runs the output through xmllint to prettify it. That’s not strictly necessary, but it was useful while debugging and as the whole command takes milliseconds to run once every quarter hour or so I’m not concerned about the overhead. Using these native binaries (plus a little configuration), chained together with pipes, had already resulted in way faster performance (with less code) than if I’d implemented something using a scripting language, and the result is a reasonably elegant “scratch your own itch”-type solution to the only outstanding barrier that was keeping me on S.2.
All that remained for me to do was set up a symlink so that the resulting output.rss.xml was accessible, over the web, to my RSS reader. I hope that next time I’m tempted to write a script to solve a problem like this I’ll remember that sometimes a chain of piped *nix utilities can provide me a slicker, cleaner, and faster solution.
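(That symlink step really is just a one-liner along these lines – the paths are illustrative, not my actual layout:)

```bash
# Illustrative paths: expose the generated feed somewhere the webserver can serve it
ln -s /home/dan/shortlinks/output.rss.xml /var/www/example/feeds/shortlinks.rss.xml
```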
Update: Right as I finished writing this blog post I discovered that somebody had already solved this problem using PHP code added to YOURLS; it’s just not packaged as a plugin so I didn’t see it earlier! Whether or not I use this alternate approach or stick to what I’ve got, the process of implementing this YOURLS-database ➡ XML ➡ XSLT ➡ RSS chain was fun and informative.
When do I #blog? A breakdown of my top four #indieweb content types – articles, reposts, checkins and notes – by month.
Unsurprisingly my checkins, which represent #geocaching/#geohashing activity, grow in the spring and peak in the summer when the weather’s better!
At first I assumed the notes peak in November might have been thrown off by a single conference, e.g. musetech, but it turns out I’ve just done more note-friendly things in Novembers, like Challenge Robin II and my Cape Town meetup, which are enough to throw the numbers off.
Great advice.
After I got banned from Facebook in 2011 (for using a “fake name”, which is actually my real name) I took a similar line of thinking: I can’t trust Facebook (or Twitter, or Instagram, or whoever else) to be responsible custodians of my content, so I shan’t. Now, virtually all content I create is hosted on my WordPress-powered blog, at my own domain, first and foremost… and syndicated copies may appear on various social media.
In a very few instances I go the other way around, producing content in silos and then copying it back to my blog: e.g. my geocaching/geohashing expeditions are posted first to their respective sites (because it’s easiest and most-practical to do that using their apps, especially “in the field”), but then they get imported into my blog using a custom plugin. If any of these sites closes, deletes my data, adds paid tiers I’m not happy with, or just bans me from my own account… I’m still set.
Backing up all your social content is a good strategy. Owning it all to begin with is an even better one, IMHO. See also: Indieweb.