Importing Geocaching Logs into WordPress

Background

As an ocassional geocacher and geohasher, I’m encouraged to post logs describing my adventures, and each major provider wants me to post my logs into their silo (see e.g. my logs on geocaching.com, on opencache.uk, and on the geohashing wiki). But as a believer in the ideals behind the IndieWeb (since long before anybody said “IndieWeb”), I’m opposed to keeping the only copy of content that I produce in an environment controlled by somebody else (why?).

How do I reconcile this?

Wrist-mounted GPS in the snow.
Just another hundred metres to the cache, then it’s time to freeze my ass back to base.

What I’d prefer would be to be able to write my logs here, on my own blog, and for my content to by syndicated via some process into the logging systems of the various silo sites I prefer. This approach is called POSSE – Publish on Own Site, Syndicate Elsewhere. In addition to the widely-described benefits of this syndication strategy, such a system would also make it possible for me to:

  • write single posts amalgamating multiple locations (e.g. a geohashing expedition that included geocache finds) or,
  • write single posts that represent the same location published on multiple silos (e.g. a visit to a geocache published on two different listing sites [e.g. 1, 2])

Applying such an tool would require some work as different silos have different acceptable content rules (geocaching.com, for example, effectively forbids mention of the existence of other geocache listing sites), but that’d theoretically be workable.

POSSE would involve posts being made first to my blog and then converted via some process into logs in each relevant silo
The ideal solution would be POSSE-based.

Unfortunately, content rules aren’t the only factor making PESOS – writing content into each silo and then copying it to my blog – preferable to POSSE. There’s also:

  • Not all of the silos offer suitable (published) APIs, and where they do, the APIs are all distinctly different.
  • Geocaching.com specifically forbids the use of unapproved automated robots to access the site (and almost certainly wouldn’t approve the kind of tool that would be ideal).
  • The siloed services are well-supported by official and third-party apps with medium-specific logic which make them the best existing way to produce logs.
PESOS would mean that posts were made "the usual way" to the silos and then a process duplicates them onto my blog
A PESOS-based solution is far easier to implement, in this case.

Needless to say: as much as I’d have loved to POSSE my geo* logs, PESOS will do.

Implementation

My implementation is a WordPress plugin which does two things. The first is that it provides a Javascript bookmarklet and an accompanying dynamically-generated Javascript file (the former loads the latter) served from my blog’s domain. That Javascript file contains reference to every log already published to my blog, so that the Javascript code can deliberately omit these logs from any import. When executed on a log listing page like those linked above, it copies all of the details of that log into a form which submits them back to my blog, where it’s received by the second part of the plugin.

Geocache logs to WordPress importer seen running on geocaching.com
The import controls appear in a new, right-most column (GCVote is also visible running in my browser).

The second part of the plugin takes this data and creates a new draft post. My plugin is pretty opinionated on this part because it’s geared strongly towards my use-case, so if you want to use it yourself you’ll probably want to tweak the code a little (e.g. it applies specific tags and names metadata fields a particular way).

Import plugin running on OpenCache.uk
When run on OpenCache.uk effectively the same interface is presented, even though the underlying mechanisms and data locations are different.

It’s not fully-automated and it’s not POSSE,but it’s “good enough” and it’s enabled me to synchronise all of my cache logs to my blog. I’ve plans to extend it to support other GPS game services to streamline my de-siloisation even further.

And of course, I’ve open-sourced the whole thing. If it’s any use to you (probably in an adapted form), it’s all yours.

Bypassing WordPress / Jetpack’s “Prove your humanity:” CAPTCHA

One of the most-popular WordPress plugins is Jetpack, a product of Automattic (best-known for providing the widely-used WordPress hosting service “WordPress.com“). Among Jetpack’s features (many of which are very good) is Jetpack Protect which adds – among other things – the possibility for a CAPTCHA to appear on your login pages. This feature is slightly worse than pointless as it makes it harder for humans to log in but has no significant impact upon automated robots; at best, it provides a false sense of security and merely frustrates and slows down legitimate human editors.

WordPress/Jetpack's CAPTCHA, asking for the solution to "9+10="
Thanks, WordPress, for slowing me down with a CAPTCHA that a robot can solve more-easily than a human.

“Proving your humanity”, as you’re asked to do, is a task that’s significantly easier for a robot to perform than a human. Eventually, of course, all tests of this nature seem likely to fail as robots become smarter than humans (especially as the most-popular system is specifically geared towards training robots), but that’s hardly an excuse for inventing a system that was a failure from its inception. Jetpack’s approach is fundamentally flawed because it makes absolutely no effort to disguise the challenge in a way that humans are able to read any-differently than robots. I’ll demonstrate that in a moment.

Jetpack security settings: "Protect" switch
Don’t just disable this, though! Other “Protect” features make some sense. If only you could disable just the stupid one…

A while back, a colleague of mine network-enabled Jetpack Protect across a handful of websites that I occasionally need to log into, and it bugged me that it ‘broke’ my password safe’s ability to automatically log me in. So to streamline my workflow – as well as to demonstrate quite how broken Jetpack Protect’s CAPTCHA is, I’ve written a userscript that you can install into your web browser that will completely circumvent it, solving the maths problems on your behalf so that you don’t have to. Here’s how to use it:

  1. Install a userscript manager into your browser if you don’t have one already: I use Tampermonkey, but it ought to work with almost any of them.
  2. Install Jetpack Maths Solver.

From now on, whenever you go to a page whose web path begins with “/wp-login.php” that contains a Jetpack Protect maths problem, the answer will be automatically calculated and filled-in on your behalf. The usual userscript rules apply: if you don’t trust me, read the source code (there are really only five lines to check) and disable automatic updates for it (especially as it operates across all domains), and feel free to adapt/improve however you see fit. Maybe if we can get enough people using it Automattic will fix this half-hearted CAPTCHA – or at least give us a switch to disable it in the first place.

Update: 15 October 2018 – the latest version of Jetpack makes an insignificant change to this CAPTCHA; version 1.2 of this script (linked above) works around the change.