What the heck is going on with WordPress?

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Let’s play a little game. 😉

Look at the following list of words and try to find the intruder:

  • wp-activate.php
  • wp-admin
  • wp-blog-header.php
  • wp_commentmeta
  • wp_comments
  • wp-comments-post.php
  • wp-config-sample.php
  • wp-content
  • wp-cron.php
  • wp engine
  • wp-includes
  • wp_jetpack_sync_queue
  • wp_links
  • wp-links-opml.php
  • wp-load.php
  • wp-login.php
  • wp-mail.php
  • wp_options
  • wp_postmeta
  • wp_posts
  • wp-settings.php
  • wp-signup.php
  • wp_term_relationships
  • wp_term_taxonomy
  • wp_termmeta
  • wp_terms
  • wp-trackback.php
  • wp_usermeta
  • wp_users

What are these words?

Well, all the ones that contain an underscore _ are names of the WordPress core database tables. All the ones that contain a dash - are WordPress core file or folder names. The one with a space is a company name…

A smart (if slightly tongue-in-cheek) observation by my colleague Paolo, there. The rest of his article’s cleverer and worth-reading if you’re following the WordPress Drama (but it’s pretty long!).

WP Engine’s Three Problems

Duration

Podcast Version

This post is also available as a podcast. Listen here, download for later, or subscribe wherever you consume podcasts.

If you’re active in the WordPress space you’re probably aware that there’s a lot of drama going on right now between (a) WordPress hosting company WP Engine, (b) WordPress hosting company (among quite a few other things) Automattic1, and (c) the WordPress Foundation.

If you’re not aware then, well: do a search across the tech news media to see the latest: any summary I could give you would be out-of-date by the time you read it anyway!

Illustration showing relationships between WordPress and Automattic (licensing trademarks and contributing effort to), between WordPress and WP Engine (the latter profits from the former), and between Automattic and WP Engine (throwing lawsuits at one another).
I tried to draw a better diagram with more of the relevant connections, but it quickly turned into spaghetti.

A declaration of war?

Like others, I’m not sure that the way Matt publicly called-out WP Engine at WCUS was the most-productive way to progress a discussion2.

In particular, I think a lot of the conversation that he kicked off conflates three different aspects of WP Engine’s misbehaviour. That muddies the waters when it comes to having a reasoned conversation about the issue3.

Matt Mullwenweg on stage at WordCamp US 2024, stating how he feels that WP Engine exploits WordPress (to great profit) without contributing back.
I’ve heard Matt speak a number of times, including in person… and I think he did a pretty bad job of expressing the problems with WP Engine during his Q&A at WCUS. In his defence, it sounds like he may have been still trying to negotiate a better way forward until the very second he walked on stage that day.

I don’t think WP Engine is a particularly good company, and I personally wouldn’t use them for WordPress hosting. That’s not a new opinion for me: I wouldn’t have used them last year or the year before, or the year before that either. And I broadly agree with what I think Matt tried to say, although not necessarily with the way he said it or the platform he chose to say it upon.

Misdeeds

As I see it, WP Engine’s potential misdeeds fall into three distinct categories: moral, ethical4, and legal.

Morally: don’t take without giving back

Matt observes that since WP Engine’s acquisition by huge tech-company-investor Silver Lake, WP Engine have made enormous profits from selling WordPress hosting as a service (and nothing else) while making minimal to no contributions back to the open source platform that they depend upon.

If true, and it appears to be, this would violate the principle of reciprocity. If you benefit from somebody else’s effort (and you’re able to) you’re morally-obliged to at least offer to give back in a manner commensurate to your relative level of resources.

Two children sit on a bed: one hands a toy dinosaur to the other.
The principle of reciprocity is a moral staple. This is evidenced by the fact that children (and some nonhuman animals) seem to be able to work it out for themselves from first principles using nothing more than empathy. Companies, however aren’t usually so-capable. Photo courtesy Cotton.

Abuse of this principle is… sadly not-uncommon in business. Or in tech. Or in the world in general. A lightweight example might be the many millions of profitable companies that host atop the Apache HTTP Server without donating a penny to the Apache Foundation. A heavier (and legally-backed) example might be Trump Social’s implementation being based on a modified version of Mastodon’s code: Mastodon’s license requires that their changes are shared publicly… but they don’t do until they’re sent threatening letters reminding them of their obligations.

I feel like it’s fair game to call out companies that act amorally, and encourage people to boycott them, so long as you do so without “punching down”.

Ethically: don’t exploit open source’s liberties as weaknesses

WP Engine also stand accused of altering the open source code that they host in ways that maximise their profit, to the detriment of both their customers and the original authors of that code5.

It’s well established, for example, that WP Engine disable the “revisions” feature of WordPress6. Personally, I don’t feel like this is as big a deal as Matt makes out that it is (I certainly wouldn’t go as far as saying “WP Engine is not WordPress”): it’s pretty commonplace for large hosting companies to tweak the open source software that they host to better fit their architecture and business model.

But I agree that it does make WordPress as-provided by WP Engine significantly less good than would be expected from virtually any other host (most of which, by the way, provide much better value-for-money at any price point).

Fake web screenshot showing turdpress.com, "WordPress... But Shit".
There’s nothing to stop me from registering TurdPress.com and providing a premium WordPress web hosting solution with all the best features disabled: I could even disable exports so that my customers wouldn’t even be able to easily leave my service for greener pastures! There’s nothing stop me… but that wouldn’t make it right7.
It also looks like WP Engine may have made more-nefarious changes, e.g. modifying the referral links in open source code (the thing that earns money for the original authors of that code) so that WP Engine can collect the revenue themselves when they deploy that code to their customers’ sites. That to me feels like it’s clearly into the zone ethical bad practice. Within the open source community, it’s not okay to take somebody’s code, which they were kind enough to release under a liberal license, strip out the bits that provide their income, and redistribute it, even just as a network service8.

Again, I think this is fair game to call out, even if it’s not something that anybody has a right to enforce legally. On which note…

Legally: trademarks have value, don’t steal them

Automattic Inc. has a recognised trademark on WooCommerce, and is the custodian of the WordPress Foundation’s trademark on WordPress. WP Engine are accused of unauthorised use of these trademarks.

Obviously, this is the part of the story you’re going to see the most news media about, because there’s reasonable odds it’ll end up in front of a judge at some point. There’s a good chance that such a case might revolve around WP Engine’s willingness (and encouragement?) to allow their business to be called “WordPress Engine” and to capitalise on any confusion that causes.

Screenshot from the WordPress Foundation's Trademark Policy page, with all but the first line highlighted of the paragraph that reads: The abbreviation “WP” is not covered by the WordPress trademarks, but please don’t use it in a way that confuses people. For example, many people think WP Engine is “WordPress Engine” and officially associated with WordPress, which it’s not. They have never once even donated to the WordPress Foundation, despite making billions of revenue on top of WordPress.
I don’t know how many people spotted this ninja-edit addition to the WordPress Foundation’s Trademark Policy page, but I did.

I’m not going to weigh in on the specifics of the legal case: I Am Not A Lawyer and all that. Naturally I agree with the underlying principle that one should not be allowed to profit off another’s intellectual property, but I’ll leave discussion on whether or not that’s what WP Engine are doing as a conversation for folks with more legal-smarts than I. I’ve certainly known people be confused by WP Engine’s name and branding, though, and think that they must be some kind of “officially-licensed” WordPress host: it happens.

If you’re following all of this drama as it unfolds… just remember to check your sources. There’s a lot of FUD floating around on the Internet right now9.

In summary…

With a reminder that I’m sharing my own opinion here and not that of my employer, here’s my thoughts on the recent WP Engine drama:

  1. WP Engine certainly act in ways that are unethical and immoral and antithetical to the spirit of open source, and those are just a subset of the reasons that I wouldn’t use them as a WordPress host.
  2. Matt Mullenweg calling them out at WordCamp US doesn’t get his point across as well as I think he hoped it might, and probably won’t win him any popularity contests.
  3. I’m not qualified to weigh in on whether or not WP Engine have violated the WordPress Foundation’s trademarks, but I suspect that they’ve benefitted from widespread confusion about their status.

Footnotes

1 I suppose I ought to point out that Automattic is my employer, in case you didn’t know, and point out that my opinions don’t necessarily represent theirs, etc. I’ve been involved with WordPress as an open source project for about four times as long as I’ve had any connection to Automattic, though, and don’t always agree with them, so I’d hope that it’s a given that I’m speaking my own mind!

2 Though like Manu, I don’t think that means that Matt should take the corresponding blog post down: I’m a digital preservationist, as might be evidenced by the unrepresentative-of-me and frankly embarrassing things in the 25-year archives of this blog!

3 Fortunately the documents that the lawyers for both sides have been writing are much clearer and more-specific, but that’s what you pay lawyers for, right?

4 There’s a huge amount of debate about the difference between morality and ethics, but I’m using the definition that means that morality is based on what a social animal might be expected to decide for themselves is right, think e.g. the Golden Rule etc., whereas ethics is the code of conduct expected within a particular community. Take stealing, for example, which covers the spectrum: that you shouldn’t deprive somebody else of something they need, is a moral issue; that we as a society deem such behaviour worthy of exclusion is an ethical one; meanwhile the action of incarcerating burglars is part of our legal framework.

5 Not that nobody’s immune to making ethical mistakes. Not me, not you, not anybody else. I remember when, back in 2005, Matt fucked up by injecting ads into WordPress (which at that point didn’t have a reliable source of funding). But he did the right thing by backpedalling, undoing the harm, and apologising publicly and profusely.

6 WP Engine claim that they disable revisions for performance reasons, but that’s clearly bullshit: it’s pretty obvious to me that this is about making hosting cheaper. Enabling revisions doesn’t have a performance impact on a properly-configured multisite hosting system, and I know this from personal experience of running such things. But it does have a significant impact on how much space you need to allocate to your users, which has cost implications at scale.

7 As an aside: if a court does rule that WP Engine is infringing upon WordPress trademarks and they want a new company name to give their service a fresh start, they’re welcome to TurdPress.

8 I’d argue that it is okay to do so for personal-use though: the difference for me comes when you’re making a profit off of it. It’s interesting to find these edge-cases in my own thinking!

9 A typical Reddit thread is about 25% lies-and-bullshit; but you can double that for a typical thread talking about this topic!

× × × × ×

Special Roads

Duration

Podcast Version

This post is also available as a podcast. Listen here, download for later, or subscribe wherever you consume podcasts.

Sometimes I’ve seen signs on dual carriageways and motorways that seem to specify a speed limit that’s the same as the national speed limit (i.e. 60 or 70 mph for most vehicles, depending on the type of road), which seem a bit… pointless? Today I learned why they’re there, and figured I’d share with you!

Google Street View photo from the A1 East of Edinburgh, showing a blue "No motor cycles under 50cc, moped,s invalid carriages and animals" sign alongside a 70mph sign.
The first time I saw this sign, on the A1 near Edinburgh, I wondered why it wasn’t just a national speed limit/derestriction sign. Now I know.

To get there, we need a history lesson.

As early as the 1930s, it was becoming clear that Britain might one day need a network of high-speed, motor-vehicle-only roads: motorways. The first experimental part of this network would be the Preston By-pass1.

Monochome photograph showing construction of bridge support pillars.
Construction halted on several occasions owing to heavy rain, and only six weeks after opening the road needed to be closed for resurfacing after the discovery that water had penetrated the material.

Construction wouldn’t actually begin until the 1950s, and it wasn’t just the Second World War that got in the way: there was a legislative challenge too.

When the Preston By-pass was first conceived, there was no legal recognition for roads that restricted the types of traffic that were permitted to drive on them. If a public highway were built, it would have to allow pedestrians, cyclists, and equestrians, which would doubtless undermine the point of the exercise! Before it could be built, the government needed to pass the Special Roads Act 1949, which enabled the designation of public roads as “special roads”, to which entry could be limited to certain classes of vehicles2.

Monochrome photograph showing a sign at the entrance to the Preston By-pass, reading: 'Motorway. NO L-drivers, mopeds, motorcycles under 50cc., invalid-carriages, pedal-cycles, pedestrians, animals'.
The original motorways had to spell out the regulations at their junctions.

If you don’t check your sources carefully when you research the history of special roads, you might be taken in by articles that state that special roads are “now known as motorways”, which isn’t quite true. All motorways are special roads, by definition, but not all special roads are motorways.

There’s maybe a dozen or more non-motorway special roads, based on research by Pathetic Motorways (whose site was amazingly informative on this entire subject). They tend to be used in places where something is like a motorway, but can’t quite be a motorway. In Manchester, a couple of the A57(M)’s sliproads have pedestrian crossings and so have to be designated special roads rather than motorways, for example3.

1968 Manchester City Council planning document showing their proposed new special roads.
“…is hereby varied by adding Class IX of the Classes of Traffic set out in Schedule 4 to the Highways Act 1980 as a class of traffic permitted to use those lengths of the special roads described in the Schedule to this Scheme and which…” /snoring sounds intensify/

Now we know what special roads are, that we might find them all over the place, and that they can superficially look like motorways, let’s talk about speed limits.

The Road Traffic Act 1934 introduced the concept of a 30mph “national speed limit” in built-up areas, which is still in force today. But outside of urban areas there was no speed limit. Perhaps there didn’t need to be, while cars were still relatively slow, but automobiles became increasingly powerful. The fastest speed ever legally achieved on a British motorway came in 1964 during a test by AC Cars, when driver Jack Sears reached 185mph.

Cyclists alongside a 'motorway' river bridge lane.
The “M48” Severn Bridge is another example of a special road that appears to be part of a motorway. The cycle lane and footpath (which is not separated from the main carriageway by more than a fence) is the giveaway that it’s not truly a “motorway” but a general-case special road.

In the late 1960s an experiment was run in setting a speed limit on motorways of 70mph. Then the experiment was extended. Then the regulation was made permanent.

There’ve been changes since then, e.g. to prohibit HGVs from going faster than 60mph, but fundamentally this is where Britain’s national speed limit on motorways comes from.

The Motorways Traffic (Speed Limit) (England) Regulations 1967, highlighting "3. No person shall drive a motor vehicle on a motorway at a speed greater than 70 miles per hour".
I assume that it relates to the devolution of transport policy or to the separation of legislation that it replaces, but separate-but-fundamentally-identical acts were passed for Scotland and Northern Ireland.

You’ve probably spotted the quirk already. When “special roads” were created, they didn’t have a speed limit. Some “special roads” were categorised as “motorways”, and “motorways” later had a speed limit imposed. But there are still a few non-motorway “special roads”!

Putting a national speed limit sign on a special road would be meaningless, because these roads have no centrally-legislated speed limit. So they need a speed limit sign, even if that sign, confusingly, might specify a speed limit that matches what you’d have expected on such a road4. That’s the (usual) reason why you sometimes see these surprising signs.

As to why this kind of road are much more-common in Scotland and Wales than they are anywhere else in the UK: that’s a much deeper-dive that I’ll leave as an exercise for the reader.

Footnotes

1 The Preston By-pass lives on, broadly speaking, as the M6 junctions 29 through 32.

2 There’s little to stop a local authority using the powers of the Special Roads Act and its successors to declare a special road accessible to some strange and exotic permutation of vehicle classes if they really wanted: e.g. a road could be designated for cyclists and horses but forbidden to motor vehicles and pedestrians, for example! (I’m moderately confident this has never happened.)

3 There’s a statutory instrument that makes those Mancunian sliproads possible, if you’re having trouble getting to sleep on a night and need some incredibly dry reading.

4 An interesting side-effect of these roads might be that speed restrictions based on the class of your vehicle and the type of road, e.g. 60mph for lorries on motorways, might not be enforceable on special roads. If you wanna try driving your lorry at 70mph on a motorway-like special road with “70” signs, though, you should do your own research first; don’t rely on some idiot from the Internet. I Am Not A Lawyer etc. etc.

× × × × × ×

Window Tax

Duration

Podcast Version

This post is also available as a podcast. Listen here, download for later, or subscribe wherever you consume podcasts.

…in England and Wales

From 1696 until 1851 a “window tax” was imposed in England and Wales1. Sort-of a precursor to property taxes like council tax today, it used an estimate of the value of a property as an indicator of the wealth of its occupants: counting the number of windows provided the mechanism for assessment.

Graph showing the burden of window tax in 1696 and 1794. In the former year a flat rate of 1 shiling was charged, doubling for a property when it reached 10 and 20 windows respectively. In the latter year charging began at 10 windows and the price per-window jumped up at 15 at 20 windows. Both approaches result in a "stepped" increase.
The hardest thing about retrospectively graphing the cost of window tax is thinking in “old money”2.
Window tax replaced an earlier hearth tax, following the ascension to the English throne of Mary II and William III of Orange. Hearth tax had come from a similar philosophy: that you can approximate the wealth of a household by some aspect of their home, in this case the number of stoves and fireplaces they had.

(A particular problem with window tax as enacted is that its “stepping”, which was designed to weigh particularly heavily on the rich with their large houses, was that it similarly weighed heavily on large multi-tenant buildings, whose landlord would pass on those disproportionate costs to their tenants!)

1703 woodcut showing King William III and Queen Mary II.
It’d be temping to blame William and Mary for the window tax, but the reality is more-complex and reflects late renaissance British attitudes to the limits of state authority.

Why a window tax? There’s two ways to answer that:

  • A window tax – and a hearth tax, for that matter – can be assessed without the necessity of the taxpayer to disclose their income. Income tax, nowadays the most-significant form of taxation in the UK, was long considered to be too much of an invasion upon personal privacy3.
  • But compared to a hearth tax, it can be validated from outside the property. Counting people in a property in an era before solid recordkeeping is hard. Counting hearths is easier… so long as you can get inside the property. Counting windows is easier still and can be done completely from the outside!
Dan points to a bricked-up first storey window on a stone building used by a funeral services company.
If you’re in Britain, finding older buildings with windows bricked-up to save on tax is pretty easy. I took a break from writing this post, walked for three minutes, and found one.4

…in the Netherlands

I recently got back from a trip to Amsterdam to meet my new work team and get to know them better.

Dan, by a game of table football, throws his arms into the air as if in self-celebration.
There were a few work-related/adjacent activities. But also a table football tournament, among other bits of fun.

One of the things I learned while on this trip was that the Netherlands, too, had a window tax for a time. But there’s an interesting difference.

The Dutch window tax was introduced during the French occupation, under Napoleon, in 1810 – already much later than its equivalent in England – and continued even after he was ousted and well into the late 19th century. And that leads to a really interesting social side-effect.

Dan, with four other men, sit in the back of a covered boat on a canal.
My brief interest in 19th century Dutch tax policy was piqued during my team’s boat tour.

Glass manufacturing technique evolved rapidly during the 19th century. At the start of the century, when England’s window tax law was in full swing, glass panes were typically made using the crown glass process: a bauble of glass would be spun until centrifugal force stretched it out into a wide disk, getting thinner towards its edge.

The very edge pieces of crown glass were cut into triangles for use in leaded glass, with any useless offcuts recycled; the next-innermost pieces were the thinnest and clearest, and fetched the highest price for use as windows. By the time you reached the centre you had a thick, often-swirly piece of glass that couldn’t be sold for a high price: you still sometimes find this kind among the leaded glass in particularly old pub windows5.

Multi-pane window with distinctive crown glass "circles".
They’re getting rarer, but I’ve lived in houses with small original panes of crown glass like these!

As the 19th century wore on, cylinder glass became the norm. This is produced by making an iron cylinder as a mould, blowing glass into it, and then carefully un-rolling the cylinder while the glass is still viscous to form a reasonably-even and flat sheet. Compared to spun glass, this approach makes it possible to make larger window panes. Also: it scales more-easily to industrialisation, reducing the cost of glass.

The Dutch window tax survived into the era of large plate glass, and this lead to an interesting phenomenon: rather than have lots of windows, which would be expensive, late-19th century buildings were constructed with windows that were as large as possible to maximise the ratio of the amount of light they let in to the amount of tax for which they were liable6.

Hotel des Pays-Bas, Nieuwe Doelenstraat 11 (1910 photo), showing large windows.
Look at the size of those windows! If you’re limited in how many you can have, but you’ve got the technology, you’re going to make them as large as you possibly can!

That’s an architectural trend you can still see in Amsterdam (and elsewhere in Holland) today. Even where buildings are renovated or newly-constructed, they tend – or are required by preservation orders – to mirror the buildings they neighbour, which influences architectural decisions.

Pre-WWI Neighbourhood gathering in Amsterdam, with enormous windows (especially on the ground floor) visible.
Notice how each building has only between one and three windows on the ground floor, letting as much light in while minimising the tax burden.

It’s really interesting to see the different architectural choices produced in two different cities as a side-effect of fundamentally the same economic choice, resulting from slightly different starting conditions in each (a half-century gap and a land shortage in one). While Britain got fewer windows, the Netherlands got bigger windows, and you can still see the effects today.

…and social status

But there’s another interesting this about this relatively-recent window tax, and that’s about how people broadcast their social status.

Modern photo, taken from the canal, showing a tall white building in Amsterdam with large windows on the ground floor and also basement level, and an ornamental window above the front door. Photo from Google Street View.
This Google Street Canal (?) View photo shows a house on Keizersgracht, one of the richest parts of Amsterdam. Note the superfluous decorative window above the front door and the basement-level windows for the servants’ quarters.

In some of the traditionally-wealthiest parts of Amsterdam, you’ll find houses with more windows than you’d expect. In the photo above, notice:

  • How the window density of the central white building is about twice that of the similar-width building on the left,
  • That a mostly-decorative window has been installed above the front door, adorned with a decorative leaded glass pattern, and
  • At the bottom of the building, below the front door (up the stairs), that a full set of windows has been provided even for the below-ground servants quarters!

When it was first constructed, this building may have been considered especially ostentatious. Its original owners deliberately requested that it be built in a way that would attract a higher tax bill than would generally have been considered necessary in the city, at the time. The house stood out as a status symbol, like shiny jewellery, fashionable clothes, or a classy car might today.

Cheerful white elderly man listening to music through headphones that are clearly too large for him.
I originally wanted to insert a picture here that represented how one might show status through fashion today. But then I remembered I don’t know anything about fashion7. But somehow my stock image search suggested this photo, and I love it so much I’m using it anyway. You’re welcome.
How did we go wrong? A century and a bit ago the super-wealthy used to demonstrate their status by showing off how much tax they can pay. Nowadays, they generally seem more-preoccupied with getting away with paying as little as possible, or none8.

Can we bring back 19th-century Dutch social status telegraphing, please?9

Footnotes

1 Following the Treaty of Union the window tax was also applied in Scotland, but Scotland’s a whole other legal beast that I’m going to quietly ignore for now because it doesn’t really have any bearing on this story.

2 The second-hardest thing about retrospectively graphing the cost of window tax is finding a reliable source for the rates. I used an archived copy of a guru site about Wolverhampton history.

3 Even relatively-recently, the argument that income tax might be repealed as incompatible with British values shows up in political debate. Towards the end of the 19th century, Prime Ministers Disraeli and Gladstone could be relied upon to agree with one another on almost nothing, but both men spoke at length about their desire to abolish income tax, even setting out plans to phase it out… before having to cancel those plans when some financial emergency showed up. Turns out it’s hard to get rid of.

4 There are, of course, other potential reasons for bricked-up windows – even aesthetic ones – but a bit of a giveaway is if the bricking-up reduces the number of original windows to 6, 9, 14 or 19, which are thesholds at which the savings gained by bricking-up are the greatest.

5 You’ve probably heard about how glass remains partially-liquid forever and how this explains why old windows are often thicker at the bottom. You’ve probably also already had it explained to you that this is complete bullshit. I only mention it here to preempt any discussion in the comments.

6 This is even more-pronounced in cities like Amsterdam where a width/frontage tax forced buildings to be as tall and narrow and as close to their neighbours as possible, further limiting opportunities for access to natural light.

7 Yet I’m willing to learn a surprising amount about Dutch tax law of the 19th century. Go figure.

8 Obligatory Pet Shop Boys video link. Can that be a thing please?

9 But definitely not 17th-century Dutch social status telegraphing, please. That shit was bonkers.

× × × × × × × × × ×

Minification vs the GPL

A not-entirely-theoretical question about open source software licensing came up at work the other day. I thought it was interesting enough to warrant a quick dive into the philosophy of minification, and how it relates to copyleft open source licenses. Specifically: does distributing (only) minified source code violate the GPL?

If you’ve come here looking for a legally-justifiable answer to that question, you’re out of luck. But what I can give you is a (fictional) story:

TheseusJS is slow

TheseusJS is a (fictional) Javascript library designed to be run in a browser. It’s released under the GPLv3 license. This license allows you to download and use TheseusJS for any purpose you like, including making money off it, modifying it, or redistributing it to others… but it requires that if you redistribute it you have to do so under the same license and include the source code. As such, it forces you to share with others the same freedoms you enjoy for yourself, which is highly representative of some schools of open-source thinking.

Screenshot showing TheseusJS's GitHub page. The project hasn't been updated in a year, and that was just to add a license: no code has been changed in 12 years.
It’s a cool project, but it really needs some maintenance this side of 2010.

It’s a great library and it’s used on many websites, but its performance isn’t great. It’s become infamous for the impact it has on the speed of the websites it’s used on, and it’s often the butt of jokes by developers: “Man, this website’s slow. Must be running Theseus!”

The original developer has moved onto his new project, Moralia, and seems uninterested in handling the growing number of requests for improvements. So I’ve decided to fork it and make my own version, FastTheseusJS and work on improving its speed.

FastTheseusJS is fast

I do some analysis and discover the single biggest problem with TheseusJS is that the Javascript file itself is enormous. The original developer kept all of the copious documentation in comments in the file itself, and for some reason it doesn’t even compress well. When you use TheseusJS on a website it takes a painfully long time for a browser to download it, if it’s not precached.

Screenshot showing a website for the TheseusJS API. It's pretty labyrinthine (groan).
Nobody even uses the documentation in the comments: there’s a website with a fully-documented API.

My first release of FastTheseusJS, then, removes virtually of the comments, replacing them with a single comment at the top pointing developers to a website where the API is fully documented. While I’m in there anyway, I also fix a minor bug that’s been annoying me for a while.

v1.1.0 changes

  • Forked from TheseusJS v1.0.4
  • Fixed issue #1071 (running mazeSolver() without first connecting <String> component results in endless loop)
  • Removed all comments: improves performance considerably

I discover another interesting fact: the developer of TheseusJS used a really random mixture of tabs and spaces for indentation, sometimes in the same line! It looks… okay if you set your editor up just right, but it’s pretty hideous otherwise. That whitespace is unnecessary anyway: the codebase is sprawling but it seldom goes more than two levels deep, so indentation levels don’t add much readability. For my second release of FastTheseusJS, then, I remove this extraneous whitespace, as well as removing the in-line whitespace inside parameter lists and the components of for loops. Every little helps, right?

v1.1.1 changes

  • Standardised whitespace usage
  • Removed unnecessary whitespace

Some of the simpler functions now fit onto just a single line, and it doesn’t even inconvenience me to see them this way: I know the codebase well enough by now that it’s no disadvantage for me to edit it in this condensed format.

Screenshot of a block of Javascript code intended using semicolons rather than tabs or spaces.
Personally, I’ve given up on the tabs-vs-spaces debate and now I indent my code using semicolons. (That’s clearly a joke. Don’t flame me.)

In the next version, I shorten the names of variables and functions in the code.

For some reason, the original developer used epic rambling strings for function names, like the well-known function dedicateIslandTempleToTheImageOfAGodBeforeOrAfterMakingASacrificeWithOrWithoutDancing( boolBeforeMakingASacrifice, objectImageOfGodToDedicateIslandTempleTo, stringNmeOfPersonMakingDedication, stringOrNullNameOfLocalIslanderDancedWith). That one gets called all the time internally and isn’t exposed via the external API so it might as well be shortened to d=(i,j,k,l,m)=>. Now all the internal workings of the library are each represented with just one or two letters.

v1.1.2 changes

  • Shortened/standarised non-API variable and function names – improves performance

I’ve shaved several kilobytes off the monstrous size of TheseusJS and I’m very proud. The original developer says nice things about my fork on social media, resulting in a torrent of downloads and attention. Within a certain archipelago of developers, I’m slightly famous.

But did I violate the license?

But then a developer says to me: you’re violating the license of the original project because you’re not making the source code available!

A man in a suit sits outdoors with a laptop and a cup of coffee. He is angry and frustrated, and a bubble shows that he is thinking:"why can't people respect the fucking license?!"
This happens every day. Probably not to this same guy every time though, but you never know. Original photo by Andrea Piacquadio.

They claim that my bugfix in the first version of FastTheseusJS represents a material change to the software, and that the changes I’ve made since then are obfuscation: efforts short of binary compilation that aim to reduce the accessibility of the source code. This fails to meet the GPL‘s definition of source code as “the preferred form of the work for making modifications to it”. I counter that this condensed view of the source code is my “preferred” way of working with it, and moreover that my output is not the result of some build step that makes the code harder to read, the code is just hard to read as a result of the optimisations I’ve made. In ambiguous cases, whose “preference” wins?

Did I violate the license? My gut feeling is that no, all of my changes were within the spirit and the letter of the GPL (they’re a terrible way to write code, but that’s not what’s in question here). Because I manually condensed the code, did so with the intention that this condensing was a feature, and continue to work directly with the code after condensing it because I prefer it that way… that feels like it’s “okay”.

But if I’d just run the code through a minification tool, my opinion changes. Suppose I’d run minify --output fasttheseus.js theseus.js and then deleted my copy of theseus.js. Then, making changes to fasttheseus.js and redistributing it feels like a violation to me… even if the resulting code is the same as I’d have gotten via the “manual” method!

I don’t know the answer (IANAL), but I’ll tell you this: I feel hypocritical for saying one piece of code would not violate the license but another identical piece of code would, based only on the process the developer followed to produce it. If I replace one piece of code at a time with less-readable versions the license remains intact, but if I replace them all at once it doesn’t? That doesn’t feel concrete nor satisfying.

Screenshot showing highly-minified HTML code (for this page) which is still reasonably readable.
Sure, I can write a blog post in just one line of code. It’ll just be a really, really, really long line… (Still perfectly readable, though!)

This isn’t an entirely contrived example

This example might seem highly contrived, and that’s because it is. But the grey area between the extremes is where the real questions are. If you agree that redistribution of (only) minified source code violates the GPL, you’re left asking: at what point does the change occur? Code isn’t necessarily minified or not-minified: there are many intermediate steps.

If I use a correcting linter to standardise indentation and whitespace – switching multiple spaces for the appropriate number of tabs, removing excess line breaks etc. (or do the same tasks manually) I’m sure you’d agree that’s fine. If I have it replace whole-function if-blocks with hoisted return statements, that’s probably fine too. If I replace if blocks with ternery operators or remove or shorten comments… that might be fine, but probably depends upon context. At some point though, some way along the process, minification goes “too far” and feels like it’s no longer within the limitations of the license. And I can’t tell you where that point is!

This issue’s even more-complicated with some other licenses, e.g. the AGPL, which extends the requirement to share source code to hosted applications. Suppose I implement a web application that uses an AGPL-licensed library. The person who redistributed it to me only gave me the minified version, but they gave me a web address from which to acquire the full source code, so they’re in the clear. I need to make a small patch to the library to support my service, so I edit it right into the minified version I’ve already got. A user of my hosted application asks for a copy of the source code, so I provide it, including the edited minified library… am I violating the license for not providing the full, unminified version, even though I’ve never even seen it? It seems absurd to say that I would be, but it could still be argued to be the case.

Diagram showing how permissive software licenses are generally compatible for use in LGPL or MPL licensed software, which are then compatible for use (except MPL) in GPL licensed software, which are in turn compatible for use (except GPL 2) with AGPL licensed software.
I love diagrams like this, which show license compatibility of different open source licenses. Adapted from a diagram by Carlo Daffara, in turn adapted from a diagram by David E. Wheeler, used under a CC-BY-SA license.

99% of the time, though, the answer’s clear, and the ambiguities shown above shouldn’t stop anybody from choosing to open-source their work under GPL, AGPL (or any other open source license depending on their preference and their community). Perhaps the question of whether minification violates the letter of a copyleft license is one of those Potter Stewart “I know it when I see it” things. It certainly goes against the spirit of the thing to do so deliberately or unnecessarily, though, and perhaps it’s that softer, more-altruistic goal we should be aiming for.

× × × × × ×

Exploiting vulnerabilities in Cellebrite UFED and Physical Analyzer from an app’s perspective

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Cellebrite makes software to automate physically extracting and indexing data from mobile devices. They exist within the grey – where enterprise branding joins together with the larcenous to be called “digital intelligence.” Their customer list has included authoritarian regimes in Belarus, Russia, Venezuela, and China; death squads in Bangladesh; military juntas in Myanmar; and those seeking to abuse and oppress in Turkey, UAE, and elsewhere. A few months ago, they announced that they added Signal support to their software.

Their products have often been linked to the persecution of imprisoned journalists and activists around the world, but less has been written about what their software actually does or how it works. Let’s take a closer look. In particular, their software is often associated with bypassing security, so let’s take some time to examine the security of their own software.

Moxie Marlinspike (Signal)

Recently Moxie, co-author of the Signal Protocol, came into possession of a Cellebrite Extraction Device (phone cracking kit used by law enforcement as well as by oppressive regimes who need to clamp down on dissidents) which “fell off a truck” near him. What an amazing coincidence! He went on to report, this week, that he’d partially reverse-engineered the system, discovering copyrighted code from Apple – that’ll go down well! – and, more-interestingly, unpatched vulnerabilities. In a demonstration video, he goes on to show that a carefully crafted file placed on a phone could, if attacked using a Cellebrite device, exploit these vulnerabilities to take over the forensics equipment.

Obviously this is a Bad Thing if you’re depending on that forensics kit! Not only are you now unable to demonstrate that the evidence you’re collecting is complete and accurate, because it potentially isn’t, but you’ve also got to treat your equipment as untrustworthy. This basically makes any evidence you’ve collected inadmissible in many courts.

Moxie goes on to announce a completely unrelated upcoming feature for Signal: a minority of functionally-random installations will create carefully-crafted files on their devices’ filesystem. You know, just to sit there and look pretty. No other reason:

In completely unrelated news, upcoming versions of Signal will be periodically fetching files to place in app storage. These files are never used for anything inside Signal and never interact with Signal software or data, but they look nice, and aesthetics are important in software. Files will only be returned for accounts that have been active installs for some time already, and only probabilistically in low percentages based on phone number sharding. We have a few different versions of files that we think are aesthetically pleasing, and will iterate through those slowly over time. There is no other significance to these files.

That’s just beautiful.

Hey ONS: This Is Not A Mistake

Hi, ONS! I know we haven’t really spoken since you ghosted me in 2011, but I just wanted to clear something up for you –

This is not a mistake (except for the missing last names):

(Specimen) 2021 census form on which Ruth declares that she cohabits with both a husband AND a partner.
It’s perfectly possible for somebody to live with multiple partners, even if they’re forbidden from marrying more than one.

Back in 2011 you thought it was a mistake, and this prevented my partner, her husband and I from filling out the digital version of the census. I’m sure it’s not common for somebody to have multiple cohabiting romantic relationships (though it’s possibly more common than some other things you track…), but surely an “Are you sure?” would be better than a “No you don’t!”

Clippy says "It looks like you've got a husband AND a partner. Is that right?" with possible answers "Yes, and it's awesome." or "No, but I can dream!"
For all I know, you already fixed it. If not: I mocked-up a UI for you.

We worked around it in 2011 by using the paper forms. Apparently this way you still end up “correcting” our relationship status for us (gee, thanks!) but at least – I gather – the originals are retained. So maybe in a more-enlightened time, future statisticians might be able ask about the demographics of domestic nonmonogamy and have at least some data to work with from the early 21st century.

I know you’re keen for as many people as possible to do the census digitally this year. But unless you’ve fixed your forms then my family and I – and thousands of others like us – will either have to use the paper copies you’re trying to phase out… or else knowingly lie on the digital versions. Which would you prefer?

× ×

Santander to Accept Homemade Deeds Poll

For most of the last decade, one of my side projects has been FreeDeedPoll.org.uk, a website that helps British adults to change their name for free and without a solicitor. Here’s a little known fact: as a British citizen, you have the right to be known by virtually any name you like, and for most people the simplest way to change it is to write out a deed poll: basically a one-person contract on which you promise that you’re serious about adopting your new name and you’re not committing fraud or anything.

FreeDeedPoll.org.uk
This web design looked dated when I made it and hasn’t gotten any younger, but the content remains valid as ever.

Over that time, I’ve helped thousands of people to change their names. I don’t know exactly how many because I don’t keep any logs, but I’ve always gotten plenty of email from people about the project. Contact spiked in 2013 after the Guardian ran an article about it, but I still correspond with two or three people in a typical week.

These people have lots of questions that come up time and time again, and if I had more free time I’d maintain an FAQ of them or something. In any case, a common one is people asking for advice when their high street bank, almost invariably either Nationwide or Santander, disputes the legitimacy of a “home made” deed poll and refuses to accept it.

Abbey National and Abbey (former names of Santander) crossed out and replaced with Santander.
You’d think that Santander of all people would appreciate how important it is to have your legitimate change of name respected. Hang on… haven’t I joked about their rebranding before?

When such people contact me, I advise them of a number of solutions and workarounds. Going to a different branch can work (training at these high street banks is internally inconsistent, I guess?). Getting your government-issued identity documents sorted and then threatening to move your account elsewhere can sometimes work. For applicants willing to spend a little money, paying a solicitor a couple of quid to be one of your witnesses can work. I often don’t hear back from people who email me about these banks: maybe they find success by one of these routes, or maybe they give up and go down one an unnecessarily-expensive avenue.

But one thing I always put on the table is the possibility of fighting. I provide a playbook of strategies to try to demonstrate to their troublemaking bank that the bank is in the wrong, along with all of the appropriate legal citations. Recent years put a new tool in the box: the GDPR/DPA2018, which contains clauses prohibiting companies from knowingly retaining incorrect personal data about an individual. I’ve been itching for a chance to use these new weapons… and over this last month, I finally had the opportunity.

A man signs a document.
Print this. Sign here. That’s pretty-much all there is to it.

I was recently contacted by a student (who, as you might expect, has more free time than they do spare money!) who was having trouble with Santander refusing to accept their deed poll. They were willing to go all-out to prove their bank wrong. So I gave them the toolbox and they worked through it and… Santander caved!

Not only have Santander accepted that they were wrong in the case of this student, but they’ve also committed to retraining their staff. Oh, and they’ve paid compensation to the student who emailed me.

Even from my position on the sidelines, I couldn’t help but cheer at this news, and not just because I’ll hopefully have fewer queries to deal with.

× ×

G7 Comes Out in Favor of Encryption Backdoors

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

From a G7 meeting of interior ministers in Paris this month, an “outcome document“:

Encourage Internet companies to establish lawful access solutions for their products and services, including data that is encrypted, for law enforcement and competent authorities to access digital evidence, when it is removed or hosted on IT servers located abroad or encrypted, without imposing any particular technology and while ensuring that assistance requested from internet companies is underpinned by the rule law and due process protection. Some G7 countries highlight the importance of not prohibiting, limiting, or weakening encryption;

There is a weird belief amongst policy makers that hacking an encryption system’s key management system is fundamentally different than hacking the system’s encryption algorithm. The difference is only technical; the effect is the same. Both are ways of weakening encryption.

The G7’s proposal to encourage encryption backdoors demonstrates two unsurprising things about the politicians in attendance, including that:

  • They’re unwilling to attempt to force Internet companies to add backdoors (e.g. via legislation, fines, etc.), making their resolution functionally toothless, and
  • More-importantly: they continue to fail to understand what encryption is and how it works.

Somehow, then, this outcome document simultaneously manages to both go too-far (for a safe and secure cryptographic landscape for everyday users) and not-far-enough (for law enforcement agencies that are in favour of backdoors, despite their huge flaws, to actually gain any benefit). Worst of both worlds, then.

Needless to say, I favour not attempting to weaken encryption, because such measures (a) don’t work against foreign powers, terrorist groups, and hardened criminals and (b) do weaken the personal security of law-abiding citizens and companies (who can then become victims of the former group). “Backdoors”, however phrased, are a terrible idea.

I loved Schneier’s latest book, by the way. You should read it.

Mark Zuckerberg asks governments to help control internet content

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Mark Zuckerberg

Mark Zuckerberg says regulators and governments should play a more active role in controlling internet content.

In an op-ed published in the Washington Post, Facebook’s chief says the responsibility for monitoring harmful content is too great for firms alone.

He calls for new laws in four areas: “Harmful content, election integrity, privacy and data portability.”

It comes two weeks after a gunman used the site to livestream his attack on a mosque in Christchurch, New Zealand.

“Lawmakers often tell me we have too much power over speech, and frankly I agree,” Mr Zuckerberg writes, adding that Facebook was “creating an independent body so people can appeal our decisions” about what is posted and what is taken down.

An interesting move which puts Zuckerberg in a parallel position to Bruce Schneier, who’s recently (and especially in his latest book) stood in opposition to a significant number of computer security experts (many of whom are of the “crypto-anarchist” school of thought) also pushed for greater regulation on the Internet. My concern with both figureheads’ proposals comes from the inevitable difficulty in enforcing Internet-wide laws: given that many countries simply won’t enact, or won’t effectively enforce, legislation of the types that either Zuckerberg nor Schneier suggest, either (a) companies intending to engage in unethical behaviour will move to – and profit in – those countries, as we already see with identity thieves in Nigeria, hackers in Russia, and patent infringers in China… or else (b) countries that do agree on a common framework will be forced to curtail Internet communications with those countries, leading to a fragmented and ultimately less-free Internet.

Neither option is good, but I still back these proposals in principle. After all: we don’t enact other internationally-relevant laws (like the GDPR, for example) because we expect to achieve 100% compliance across the globe – we do so because they’re the right thing to do to protect individuals and economies from harm. Little by little, Internet legislation in general (possibly ignoring things like the frankly silly EU cookie regulation and parts of the controversial new EU directives on copyright) makes the Internet a safer place for citizens of Western countries. There are still a huge number of foreign threats like scammers and malware authors as as well as domestic lawbreakers, but increasing the accountability of large companies is, at this point, a far bigger concern.

German chat app slacking on hashing fined €20k

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

by Richard Chirgwin (The Register)

German chat platform Knuddels.de (“Cuddles”) has been fined €20,000 for storing user passwords in plain text (no hash at all? Come on, people, it’s 2018).

The data of Knuddels users was copied and published by malefactors in July. In September, someone emailed the company warning them that user data had been published at Pastebin (only 8,000 members affected) and Mega.nz (a much bigger breach). The company duly notified its users and the Baden-Württemberg data protection authority.

Interesting stuff: this German region’s equivalent of the ICO applied a fine to this app for failing to hash passwords, describing them as personal information that was inadequately protected following their theft. That’s interesting because it sets a German, and to a lesser extend a European, precedent that plaintext passwords can be considered personal information and therefore allowing the (significant) weight of the GDPR to be applied to their misuse.

How My Stupid Bloody Name Finally Paid For Itself

Since changing my surname 11½ years ago to the frankly-silly (albeit very “me”) Q, I’ve faced all kinds of problems, from computer systems that don’t accept my name to a mocking from the Passport Office to getting banned from Facebook. I soon learned to work-around systems that insisted that surnames were at least two characters in length. This is a problem which exists mostly because programmers don’t understand how names work in the real world (or titles, for that matter, as I’ve also discovered).

It’s always been a bit of an inconvenience to have to do these things, but it’s never been a terrible burden: even when I fly internationally – which is probably the hardest part of having my name – I’ve learned the tricks I need to minimise how often I’m selected for an excessive amount of unwanted “special treatment”.

Airport
I plan to make my first trip to the USA since my name change, next year. Place bets now on how that’ll go.

This year, though, for the very first time, my (stupid bloody) unusual name paid for itself. And not just in the trivial ways I’m used to, like being able to spot my badge instantly on the registration table at conferences I go to or being able to fill out paper forms way faster than normal people. I mean in a concrete, financially-measurable way. Wanna hear?

So: I’ve a routine of checking my credit report with the major credit reference agencies every few years. I’ve been doing so since long before doing so became free (thanks GDPR); long even before I changed my name: it just feels like good personal data housekeeping, and it’s interesting to see what shows up.

Message to Equifax asking them to correct the details on my Credit Report.
It started out with the electoral roll. How did it end up like this? It was only the electoral roll. It was only the electoral roll.

And so I noticed that my credit report with Equifax said that I wasn’t on the electoral roll. Which I clearly am. Given that my credit report’s pretty glowing, I wasn’t too worried, but I thought I’d drop them an email and ask them to get it fixed: after all, sometimes lenders take this kind of thing into account. I wasn’t in any hurry, but then, it seems: neither were they –

  • 2 February 2016 – I originally contacted them
  • 18 February 2016 – they emailed to say that they were looking into it and that it was taking a while
  • 22 February 2016 – they emailed to say that they were still looking into it
  • 13 July 2016 – they emailed to say that they were still looking into it (which was a bit of a surprise, because after so long I’d almost forgotten that I’d even asked)
  • 14 July 2016 – they marked the issue as “closed”… wait, what?
Equifax close my request
Given that all they’d done for six months was email me occasionally to say that it was taking a while, it was a little insulting to then be told they’d solved it.

I wasn’t in a hurry, and 2017 was a bit of a crazy year for me (for Equifax too, as it happens), so I ignored it for a bit, and then picked up the trail right after the GDPR came into force. After all, they were storing personal information about me which was demonstrably incorrect and, continued to store and process it even after they’d been told that it was incorrect (it’d have been a violation of principle 4 of the DPA 1998, too, but the GDPR‘s got bigger teeth: if you’re going to sick the law on somebody, it’s better that it has bark and bite).

My message instructing Equifax to fix their damn data about me.
Throwing the book tip-of-the-day: don’t threaten, just explain what you require and under what legal basis you’re able to do so. Let lawyers do the tough stuff.

My anticipation was that my message of 13 July 2018 would get them to sit up and fix the issue. I’d assumed that it was probably related to my unusual name and that bugs in their software were preventing them from joining-the-dots between my credit report and the Electoral Roll. I’d also assumed that this nudge would have them either fix their software… or failing that, manually fix my data: that can’t be too hard, can it?

Apparently it can:

Equifax suggest that I change my name ON THE ELECTORAL ROLL to match my credit report, rather than the other way around.
You want me to make it my problem, Equifax, and you want me to change my name on the Electoral Roll to match the incorrect name you use to refer to me in your systems?

Equifax’s suggested solution to the problem on my credit report? Change my name on the Electoral Roll to match the (incorrect) name they store in their systems (to work around a limitation that prevents them from entering single-character surnames)!

At this point, they turned my send-a-complaint-once-every-few-years project into a a full blown rage. It’s one thing if you need me to be understanding of the time it can take to fix the problems in your computer systems – I routinely develop software for large and bureaucratic organisations, I know the drill! – but telling me that your bugs are my problems and telling me that I should lie to the government to work around them definitely isn’t okay.

Actually, Equifax: no. No no no no no. No.
Dear Equifax: No. No no no. No. Also, no. Now try again. Love Dan.

At this point, I was still expecting them to just fix the problem: if not the underlying technical issue then instead just hack a correction into my report. But clearly they considered this, worked out what it’d cost them to do so, and decided that it was probably cheaper to negotiate with me to pay me to go away.

Which it was.

This week, I accepted a three-figure sum from Equifax as compensation for the inconvenience of the problem with my credit report (which now also has a note of correction, not that my alleged absence from the Electoral Roll has ever caused my otherwise-fine report any trouble in the past anyway). Curiously, they didn’t attach any strings to the deal, such as not courting publicity, so it’s perfectly okay for me to tell you about the experience. Maybe you know somebody who’s similarly afflicted: that their “unusual” name means that a credit reference company can’t accurately report on all of their data. If so, perhaps you’d like to suggest that they take a look at their credit report too… just saying.

Cash!
You can pay for me to go away, but it takes more for me to shut up. (A lesson my parents learned early on.)

Apparently Equifax think it’s cheaper to pay each individual they annoy than it is to fix their database problems. I’ll bet that, in the long run, that isn’t true. But in the meantime, if they want to fund my recent trip to Cornwall, that’s fine by me.

× × × × × × ×

After Section 702 Reauthorization

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

After Section 702 Reauthorization – Schneier on Security (schneier.com)

For over a decade, civil libertarians have been fighting government mass surveillance of innocent Americans over the Internet. We’ve just lost an important battle. On January 18, President Trump signed the renewal of Section 702, domestic mass surveillance became effectively a permanent part of US law. Section 702 was initially passed in 2008, as an…

When One Library Steals From Another

When I first started working at the Bodleian Libraries in 2011, their websites were looking… a little dated. I’d soon spend some time working with a vendor (whose premises mysteriously caught fire while I was there, freeing me up to spend my birthday in a bar) to develop a fresh, modern interface for our websites that, while not the be-all and end-all, was a huge leap forwards and has served us well for the last five years or so.

The Bodleian Libraries website as it appeared in 2011.
The colour scheme, the layout, the fact that it didn’t remotely work on mobiles… there was a lot wrong with the old design of the Bodleian Libraries’ websites.

Fast-forward a little: in about 2015 we noticed a few strange anomalies in our Google Analytics data. For some reason, web addresses were appearing that didn’t exist anywhere on our site! Most of these resulted from web visitors in Turkey, so we figured that some Turkish website had probably accidentally put our Google Analytics user ID number into their code rather than their own. We filtered out the erroneous data – there wasn’t much of it; the other website was clearly significantly less-popular than ours – and carried on. Sometimes we’d speculate about the identity of the other site, but mostly we didn’t even think about it.

Bodleian Library & Radcliffe Camera website
How a Bodleian Libraries’ website might appear today. Pay attention, now: there’ll be a spot-the-difference competition in a moment.

Earlier this year, there was a spike in the volume of the traffic we were having to filter-out, so I took the time to investigate more-thoroughly. I determined that the offending website belonged to the Library of Bilkent University, Turkey. I figured that some junior web developer there must have copy-pasted the Bodleian’s Google Analytics code and forgotten to change the user ID, so I went to the website to take a look… but I was in for an even bigger surprise.

Bilkent University Library website, as it appears today.
Hey, that looks… basically identical!

Whoah! The web design of a British university was completely ripped-off by a Turkish university! Mouth agape at the audacity, I clicked my way through several of their pages to try to understand what had happened. It seemed inconceivable that it could be a coincidence, but perhaps it was supposed to be more of an homage than a copy-paste job? Or perhaps they were ripped-off by an unscrupulous web designer? Or maybe it was somebody on the “inside”, like our vendor, acting unethically by re-selling the same custom design? I didn’t believe it could be any of those things, but I had to be sure. So I started digging…

Bodleian and Bilkent search boxes, side-by-side.
Our user research did indicate that putting the site and catalogue search tools like this was smart. Maybe they did the same research?

 

Bodleian and Bilkent menus side-by-side.
Menus are pretty common on many websites. They probably just had a similar idea.

 

Bodleian and Bilkent opening hours, side-by-side.
Tabs are a great way to show opening hours. Everybody knows that. And this is obviously just the a popular font.

 

Bodleian and Bilkent sliders, side-by-side.
Oh, you’ve got a slider too. With circles? And you’ve got an identical Javascript bug? Okay… now that’s a bit of a coincidence…

 

Bodleian and Bilkent content boxes, side-by-side.
Okay, I’m getting a mite suspicious now. Surely we didn’t independently come up with this particular bit of design?

 

Bodleian and Bilkent footers, side-by-side.
Well these are clearly different. Ours has a copyright notice, for example…

 

Copyright notice on Bilkent University Library's website.
Oh, you DO have a copyright notice. Hang on, wait: you’ve not only stolen our design but you’ve declared it to be open-source???

I was almost flattered as I played this spot-the-difference competition, until I saw the copyright notice: stealing our design was galling enough, but then relicensing it in such a way that they specifically encourage others to steal it too was another step entirely. Remember that we’re talking about an academic library, here: if anybody ought to have a handle on copyright law then it’s a library!

I took a dive into the source code to see if this really was, as it appeared to be, a copy-paste-and-change-the-name job (rather than “merely” a rip-off of the entire graphic design), and, sure enough…

HTML source code from Bilkent University Library.
In their HTML source code, you can see both the Bodleian’s Google Analytics code (which they failed to remove) but also their own. And a data- attribute related to a project I wrote and that means nothing to their site.

It looks like they’d just mirrored the site and done a search-and-replace for “Bodleian”, replacing it with “Bilkent”. Even the code’s spelling errors, comments, and indentation were intact. The CSS was especially telling (as well as being chock-full of redundant code relating to things that appear on our website but not on theirs)…

CSS code from Bilkent University.
The search-replace resulted in some icky grammar, like “the Bilkent” appearing in their code. And what’s this? That’s MY NAME in the middle of their source code!

So I reached out to them with a tweet:

Tweet: Hey @KutphaneBilkent (Bilkent University Library): couldn't help but notice your website looks suspiciously like those of @bodleianlibs...?
My first tweet to Bilkent University Library contained a “spot the difference” competition.

I didn’t get any response, although I did attract a handful of Turkish followers on Twitter. Later, they changed their Twitter handle and I thought I’d take advantage of the then-new capability for longer tweets to have another go at getting their attention:

Tweet: I see you've changed your Twitter handle, @librarybilkent! Your site still looks like you've #stolen the #webdesign from @bodleianlibs, though (and changed the license to a #CreativeCommons one, although the fact you forgot to change the #GoogleAnalytics ID is a giveaway...).
This time, I was a little less-sarcastic and a little more-aggressive. Turns out that’s all that was needed.

Clearly this was what it took to make the difference. I received an email from the personal email account of somebody claiming to be Taner Korkmaz, Systems Librarian with Bilkent’s Technical Services team. He wrote (emphasis mine):

Dear Mr. Dan Q,

My name is Taner Korkmaz and I am the systems librarian at Bilkent. I am writing on behalf of Bilkent University Library, regarding your share about Bilkent on your Twitter account.

Firstly, I would like to explain that there is no any relation between your tweet and our library Twitter handle change. The librarian who is Twitter admin at Bilkent did not notice your first tweet. Another librarian took this job and decided to change the twitter handle because of the Turkish letters, abbreviations, English name requirement etc. The first name was @KutphaneBilkent (kutuphane means library in Turkish) which is not clear and not easy to understand. Now, it is @LibraryBilkent.

About 4 years ago, we decided to change our library website, (and therefore) we reviewed the appearance and utility of the web pages.

We appreciated the simplicity and clarity of the user interface of University of Oxford Bodlien Library & Radcliffe Camera, as an academic pioneer in many fields. As a not profit institution, we took advantage of your template by using CSS and HTML, and added our own original content.

We thought it would not create a problem the idea of using CSS codes since on the web page there isn’t any license notice or any restriction related to the content of the template, and since the licenses on the web pages are mainly more about content rather than templates.

The Library has its own Google Analytics and Search Console accounts and the related integrations for the web site statistical data tracking. We would like to point out that there is a misunderstanding regarding this issue.

In 2017, we started to work on creating a new web page and we will renew our current web page very soon.

Thank you in advance for your attention to this matter and apologies for possible inconveniences.

Yours sincerely,

Or to put it another way: they decided that our copyright notice only applied to our content and not our design and took a copy of the latter.

Do you remember when I pointed out earlier that librarians should be expected to know their way around copyright law? Sigh.

They’ve now started removing evidence of their copy-pasting such as the duplicate Google Analytics code fragment and the references to LibraryData, but you can still find the unmodified code via archive.org, if you like.

That probably ends my part in this little adventure, but I’ve passed everything on to the University of Oxford’s legal team in case any of them have anything to say about it. And now I’ve got a new story to tell where web developers get together over a pint: the story of the time that I made a website for a university… and a different university stole it!

× × × × × × × × × × × ×

Data-hucksters beware: online privacy is returning

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Next year, 25 May looks like being a significant date. That’s because it’s the day that the European Union’s general data protection regulation (GDPR) comes into force. This may not seem like a big deal to you, but it’s a date that is already keeping many corporate executives awake at night. And for those who are still sleeping soundly, perhaps it would be worth checking that their organisations are ready for what’s coming down the line.

First things first. Unlike much of the legislation that emerges from Brussels, the GDPR is a regulation rather than a directive. This means that it becomes law in all EU countries at the same time; a directive, in contrast, allows each country to decide how its requirements are to be incorporated in national laws…