For the past 9 months I have been presenting versions of this talk to AI researchers, investors, politicians and policy makers. I felt it was time to share these ideas with a wider audience. Thanks to the Ditchley conference on Machine Learning in 2017 for giving me a fantastic platform to get early…
Summary: The central prediction I want to make and defend in this post is that continued rapid progress in machine learning will drive the emergence of a new kind of geopolitics; I have been calling it AI Nationalism. Machine learning is an omni-use technology that will come to touch all sectors and parts of society. The transformation of both the economy and the military by machine learning will create instability at the national and international level forcing governments to act. AI policy will become the single most important area of government policy. An accelerated arms race will emerge between key countries and we will see increased protectionist state action to support national champions, block takeovers by foreign firms and attract talent. I use the example of Google, DeepMind and the UK as a specific example of this issue. This arms race will potentially speed up the pace of AI development and shorten the timescale for getting to AGI. Although there will be many common aspects to this techno-nationalist agenda, there will also be important state specific policies. There is a difference between predicting that something will happen and believing this is a good thing. Nationalism is a dangerous path, particular when the international order and international norms will be in flux as a result and in the concluding section I discuss how a period of AI Nationalism might transition to one of global cooperation where AI is treated as a global public good.
Excellent inspiring and occasionally scary look at the impact that the quest for general-purpose artificial intelligence has on the international stage. Will we enter an age of “AI Nationalism”? If so, how will we find out way to the other side? Excellent longread.
Underappreciated video presentation on how only a few small changes to the timeline of the Internet and early Web results in a completely different set of technologies and companies becoming dominant today. Thought-provoking.
My home computer in 1998 had a 56K modem connected to our telephone line; we were allowed a maximum of thirty minutes of computer usage a day, because my parents — quite reasonably — did not want to have their telephone shut off for an evening at a time. I remember webpages loading slowly: ten […]
My home computer in 1998 had a 56K modem connected to our telephone line; we were allowed a maximum of thirty minutes of computer usage a day, because my parents — quite reasonably — did not want to have their telephone shut off for an evening at a time. I remember webpages loading slowly: ten to twenty seconds for a basic news article.
At the time, a few of my friends were getting cable internet. It was remarkable seeing the same pages load in just a few seconds, and I remember thinking about the kinds of the possibilities that would open up as the web kept getting faster.
And faster it got, of course. When I moved into my own apartment several years ago, I got to pick my plan and chose a massive fifty megabit per second broadband connection, which I have since upgraded.
But first, a short parenthetical: I’ve been writing posts in both long- and short-form about this stuff for a while, but I wanted to bring many threads together into a single document that may pretentiously be described as a theory of or, more practically, a guide to the bullshit web.
A second parenthetical: when I use the word “bullshit” in this article, it isn’t in a profane sense. It is much closer to Harry Frankfurt’s definition in “On Bullshit”:
It is just this lack of connection to a concern with truth — this indifference to how things really are — that I regard as of the essence of bullshit.
In the year 1930, John Maynard Keynes predicted that, by century’s end, technology would have advanced sufficiently that countries like Great Britain or the United States would have achieved a 15-hour work week. There’s every reason to believe he was right. In technological terms, we are quite capable of this. And yet it didn’t happen. Instead, technology has been marshaled, if anything, to figure out ways to make us all work more. In order to achieve this, jobs have had to be created that are, effectively, pointless. Huge swathes of people, in Europe and North America in particular, spend their entire working lives performing tasks they secretly believe do not really need to be performed. The moral and spiritual damage that comes from this situation is profound. It is a scar across our collective soul. Yet virtually no one talks about it.
These are what I propose to call ‘bullshit jobs’.
What is the equivalent on the web, then?
This, this, a thousand times this. As somebody who’s watched the Web grow both in complexity and delivery speed over the last quarter century, it apalls me that somewhere along the way complexity has started to win. I don’t want to have to download two dozen stylesheets and scripts before your page begins to render – doubly-so if those additional files serve no purpose, or at least no purpose discernable to the reader. Personally, the combination of uMatrix and Ghostery is all the adblocker I need (and I’m more-than-willing to add a little userscript to “fix” your site if it tries to sabotage my use of these technologies), but when for whatever reason I turn these plugins off I feel like the Web has taken a step backwards while I wasn’t looking.
I recently discovered a minor security vulnerability in mobile webcomic reading app Comic Chameleon, and I thought that it was interesting (and tame) enough to share as a learning example of (a) how to find security vulnerabilities in an app like this, and (b) more importantly, how to write an app like this without this kind of security vulnerability.
The nature of the vulnerability is that, for webcomics pushed directly into the platform by their authors, it’s possible to read comics (long) before they’re published. By way of proof, here’s a copy of the top-right 200 × 120 pixels of episode 54 of the (excellent) Forward Comic, which Imgur will confirm was uploaded on 2 July 2018: over three months ahead of its planned publication date.
How to hack a web-backed app
Just to be clear, I didn’t set out to hack this app, but once I stumbled upon the vulnerability I wanted to make sure that I was able to collect enough information that I’d be able to explain to its author what was wrong and how to fix it. You’d be amazed how many systems I find security holes in almost-completely by accident. In fact, I’d just noticed that the application supported some webcomics that I follow but for which I hadn’t been able to find RSS feeds (and so I was selfdogfooding my own tool, RSSey, to “produce” RSS feeds for my reader by screen-scraping: not the most-elegant solution). But if this app could produce a list of issues of the comic, it must have some way of doing what I was trying to do, and I wanted to know what it was.
The app, I figured, must “phone home” to some website – probably the app’s official website itself – to get the list of comics that it supports and details of where to get their feeds from, so I grabbed a copy of the app and started investigating. Because I figured I was probably looking for a URL, the first thing I did was to download the raw APK file (your favourite search engine can tell you how to do this), decompressed it (APK files are just ZIP files, really) and ran strings on it to search for likely-looking URLs:
I tried visiting a few of the addresses but many of them seemed to be API endpoints that were expecting additional parameters. Probably, I figured, the strings I’d extracted were prefixes to which those parameters were attached. Rather than fuzz for the right parameters, I decided to watch what the app did: I spun up a simulated Android device using the official emulator (I could have used my own on a wireless network that I control, of course, but this was lazier) and ran my favourite packet sniffer to see what the application was requesting.
Now I had full web addresses with parameters. Comparing the parameters that appeared when I clicked different comics revealed that each comic in the “full list” was assigned a numeric ID which was used when requesting issues of that comic (along with an intermediate stage where the year of publication is requested).
Interestingly, a number of comics were listed with the attribute s="no-show" and did not appear in the app: it looked like comics that weren’t yet being made available via the app were already being indexed and collected by its web component, and for some reason were being exposed via the XML API: presumably the developer had never considered that anybody but their app would look at the XML itself, but the thing about the Web is that if you put it on the Web, anybody can see it.
Still: at this point I assumed that I was about to find what I was looking for – some kind of machine-readable source (an RSS feed or something like one) for a webcomic or two. But when I looked at the XML API for one of those webcomics I discovered quite a bit more than I’d bargained on finding:
The first webcomic I looked at included the “official” web addresses and titles of each published comic… but also several not yet published ones. The unpublished ones were marked with s="no-show" to indicate to the app that they weren’t to be shown, but I could now see them. The “official” web addresses didn’t work for me, as I’d expected, but when I tried Comic Chameleon’s versions of the addresses, I found that I could see entire episodes of comics, up to three and a half months ahead of their expected publication date.
Naturally, I compiled all of my findings into an email and contacted the app developer with all of the details they’d need to fix it – in hacker terms, I’m one of the “good guys”! – but I wanted to share this particular example with you because (a) it’s not a very dangerous leak of data (a few webcomics a few weeks early and/or a way to evade a few ads isn’t going to kill anybody) and (b) it’s very illustrative of the kinds of mistakes that app developers are making a lot, these days, and it’s important to understand why so that you’re not among them. On to that in a moment.
Because (I’d like to think) I’m one of the “good guys” in the security world, the first thing I did after the research above was to contact the author of the software. They didn’t seem to have a security.txt file, a disclosure policy, nor a profile on any of the major disclosure management sites, so I sent an email. Were the security issue more-severe, I’d have sent a preliminary email suggesting (and agreeing on a mechanism for) encrypted email, but given the low impact of this particular issue, I just explained the entire issue in the initial email: basically what you’ve read above, plus some tips on fixing the issue and an offer to help out.
I subscribe to the doctrine of responsible disclosure, which – in the event of more-significant vulnerabilities – means that after first contacting the developer of an insecure system and giving them time to fix it, it’s acceptable (in fact: in significant cases, it’s socially-responsible) to publish the details of the vulnerability. In this case, though, I think the whole experience makes an interesting learning example about ways in which you might begin to “black box” test an app for data leaks like this and – below – how to think about software development in a way that limits the risk of such vulnerabilities appearing in the first place.
The author of this software hasn’t given any answer to any of the emails I’ve sent over the last couple of weeks, so I’m assuming that they just plan to leave this particular leak in place. I reached out and contacted the author of Forward Comic, though, which turns out (coincidentally) to be probably the most-severely affected publication on the platform, so that he had the option of taking action before I published this blog post.
Lessons to learn
When developing an “app” (whether for the web or a desktop or mobile platform) that connects to an Internet service to collect data, here are the important things you really, really ought to do:
Don’t publish any data that you don’t want the user to see.
If the data isn’t for everybody, remember to authenticate the user.
And for heaven’s sake use SSL, it’s not the 1990s any more.
That first lesson’s the big one of course: if you don’t want something to be on the public Internet, don’t put it on the public Internet! The feeds I found simply shouldn’t have contained the “secret” information that they did, and the unpublished comics shouldn’t have been online at real web addresses. But aside from (or in addition to) not including these unpublished items in the data feeds, what else might our app developer have considered?
Encryption. There’s no excuse for not using HTTPS these days. This alone wouldn’t have prevented a deliberate effort to read the secret data, but it would help prevent it from happening accidentally (which is a risk right now), e.g. on a proxy server or while debugging something else on the same network link. It also protects the user from exposing their browsing habits (do you want everybody at that coffee shop to know what weird comics you read?) and from having content ‘injected’ (do you want the person at the next table of the coffee shop to be able to choose what you see when you ask for a comic?
Authentication (app). The app could work harder to prove that it’s genuinely the app when it contacts the website. No mechanism for doing this can ever be perfect, because the user hasa access to the app and can theoretically reverse-engineer it to fish the entire authentication strategy out of it, but some approaches are better than others. Sending a password (e.g. over Basic Authentication) is barely better than just using a complex web address, but using a client-side certiciate or an OTP algorithm would (in conjunction with encryption) foil many attackers.
Authentication (user). It’s a very-different model to the one currently used by the app, but requiring users to “sign up” to the service would reduce the risks and provide better mechanisms for tracking/blocking misusers, though the relative anonymity of the Internet doesn’t give this much strength and introduces various additional burdens both technical and legal upon the developer.
Fundamentally, of course, there’s nothing that an app developer can do to perfectly protect the data that is published to that app, because the app runs on a device that the user controls! That’s why the first lesson is the most important: if it shouldn’t be on the public Internet (yet), don’t put it on the public Internet.
Hopefully there’s a lesson for you somewhere too: about how to think about app security so that you don’t make a similar mistake, or about some of the ways in which you might test the security of an application (for example, as part of an internal audit), or, if nothing else, that you should go and read Forward, because it’s pretty cool.
Today, just about all monitors and screens are digital (typically using an LCD or Plasma technology), but a decade or two ago, computer displays were based on the analog technology inherited from TV sets.
These analog displays were constructed around Cathode Rays Tubes (commonly referred to as CRTs).
Analog TV has a fascinating history from when broadcasts were first started (in Black and White), through to the adoption of color TV (using a totally backwards-compatible system with the earlier monochrome standard), through to cable, and now digital.
Analog TV transmissions and their display technology really were clever inventions (and the addition of colour is another inspiring innovation). It’s worth taking a look about how these devices work, and how they were designed, using the technology of the day.
After a couple of false starts, an analog colour TV system, that was backwards compatible with black and white, became standard in 1953, and remained unchanged until the take-over by digital TV broadcasts in the early 2000’s.
I’ve come to believe that the goal of any good framework should be to make itself unnecessary.
Brian said it explicitly of his PhoneGap project:
The ultimate purpose of PhoneGap is to cease to exist.
That makes total sense, especially if your code is a polyfill—those solutions are temporary by d…
As well as publishers creating AMP versions of their pages in order to appease Google, perhaps they will start to ask “Why can’t our regular pages be this fast?” By showing that there is life beyond big bloated invasive web pages, perhaps the AMP project will work as a demo of what the whole web could be.
Alas, as time has passed, that hope shows no signs of being fulfilled. If anything, I’ve noticed publishers using the existence of their AMP pages as a justification for just letting their “regular” pages put on weight.
In fact, AMP’s evolution has made it a viable solution to build entire websites.
On an episode of the Dev Mode podcast a while back, AMP was a hotly-debated topic. But even those defending AMP were doing so on the understanding that it was more a proof-of-concept than a long-term solution (and also that AMP is just for news stories—something else that Google are keen to change).
But now it’s clear that the Google AMP Project is being marketed more like a framework for the future: a collection of web components that prioritise performance
You all know my feelings on AMP already, I’m sure. As Jeremy points out, our optimistic ideas that these problems might go away as AMP “made itself redundant” are turning out not to be true, and Google continues to abuse its monopoly on search to push its walled-garden further into the mainstream. Read his full article…
Later, he named such a document an hashquine, which seems to be appropriate: in computing, a quine is a program that prints its own source code when run.
Now, creating a program that prints its own hash is not that difficult, as several ways can be used to retrieve its source code before computing the hash (the second method does not work for compiled programs):
Reading its source or compiled code (e.g. from disk);
Using the same technique as in a quine to get the source code.
However, conventional documents such as images are likely not to be Turing-complete, so computing their hash is not possible directly. Instead, it is possible to leverage hash collisions to perform the trick.
We say that MD5 is obsolete because one of the properties of a cryptographic hash function is that it should not be possible to find two messages with the same hash.
Today, two practical attacks can be performed on MD5:
Given a prefix P, find two messages M1 and M2 such as md5(P || M1) and md5(P || M2) are equal (|| denotes concatenation);
Given two prefixes P1 and P2, find two messages M1 and M2 such as md5(M1 || P1) and md5(M2 || P2) are equal.
To the best of my knowledge, attack 1 needs a few seconds on a regular computer, whereas attack 2 needs a greater deal of ressources (especially, time). We will use attack 1 in the following.
Please also note that we are not able (yet), given a MD5 hash H, to find a message M such as md5(M) is H. So creating a GIF displaying a fixed MD5 hash and then bruteforcing some bytes to append at the end until the MD5 is the one displayed is not possible.
The GIF file format does not allow to perform arbitrary computations. So we can not ask the software used to display the image to compute the MD5. Instead, we will rely on MD5 collisions.
First, we will create an animated GIF. The first frame is not interesting, since it’s only displaying the background. The second frame will display a 0 at the position of the first character of the hash. The third frame will display a 1 at that same position. And so on and so forth.
In other words, we will have a GIF file that displays all 16 possibles characters for each single character of the MD5 “output”.
If we allow the GIF to loop, it would look like this:
Now, the idea is, for each character, to comment out each frame but the one corresponding to the target hash. Then, if we don’t allow the GIF to loop, it will end displaying the target MD5 hash, which is what we want.
To do so, we will, for each possible character of the MD5 hash, generate a MD5 collision at some place in the GIF. That’s 16×32=512 collisions to be generated, but we average 3.5 seconds per collision on our computer so it should run under 30 minutes.
Once this is done, we will have a valid GIF file. We can compute its hash: it will not change from that point.
Now that we have the hash, for each possible character of the MD5 hash, we will chose one or the other collision “block” previously computed. In one case, the character will be displayed, on the other it will be commented out. Because we replace some part of the GIF file with the specific collision “block” previously computed at that very same place, the MD5 hash of the GIF file will not change.
All what is left to do is to figure out how to insert the collision “blocks” in the GIF file (they look mostly random), so that:
It is a valid GIF file;
Using one “block” displays the corresponding character at the right position, but using the other “block” will not display it.
I will detail the process for one character.
Example for one character
Let’s look at the part of the generated GIF file responsible for displaying (or not) the character 7 at the first position of the MD5 hash.
The figure below shows the relevant hexdump displaying side by side the two possible choices for the collision block (click to display in full size):
The collision “block” is displayed in bold (from 0x1b00 to 0x1b80), with the changing bytes written in red.
In the GIF file formats, comments are defined as followed:
They start with the two bytes 21fe (written in white over dark green background);
Then, an arbitrary number of sub-blocks are present;
The first byte (in black over a dark green background) describes the length of the sub-block data;
Then the sub-block data (in black over a light green background);
When a sub-block of size 0 is reached, it is the end of the comment.
The other colours in the image above represent other GIF blocks:
In purple, the graphics control extension, starting a frame and specifying the duration of the frame;
In light blue, the image descriptor, specifying the size and position of the frame;
In various shades of red, the image data (just as for comments, it can be composed of sub-blocks).
To create this part of the GIF, I considered the following:
The collision “block” should start at a multiple of 64 bytes from the beginning of the file, so I use comments to pad accordingly.
The fastcoll software generating a MD5 collision seems to always create two outputs where the bytes in position 123 are different. As a result, I end the comment sub-block just before that position, so that this byte gives the size of the next comment sub-block.
For one chosen collision “block” (on the left), the byte in position 123 starts a new comment sub-block that skips over the GIF frame of the character, up to the start of a new comment sub-block which is used as padding to align the next collision “block”.
For the other chosen collision “block” (on the right), the byte in position 123 creates a new comment sub-block which is shorter in that case. Following it, I end the comment, add the frame displaying the character of the MD5 hash at the right position, and finally start a new comment up to the comment sub-block used as padding for the next collision “block”.
All things considered, it is not that difficult, but many things must be considered at the same time so it is not easy to explain. I hope that the image above with the various colours helps to understand.
Once all this has been done, we have a proper GIF displaying its own MD5 hash! It is composed of one frame for the background, plus 32 frames for each character of the MD5 hash.
To speed-up the displaying of the hash, we can add to the process a little bit of bruteforcing so that some characters of the hash will be the one we want.
I fixed 6 characters, which does not add much computations to create the GIF. Feel free to add more if needed.
Of course, the initial image (the background) should have those fixed characters in it. I chose the characters d5 and dead as shown in the image below, so that this speed-up is not obvious!
That makes a total of 28 frames. At 20ms per frame, displaying the hash takes a little over half a second.
Analysis of a GIF MD5 hashquine
Since I found out that an other GIF MD5 hashquine has been created before mine once I finished creating one, I thought it may be interesting to compare the two independent creations.
The first noticeable thing is that 7-digits displays have been used. This is an interesting trade-off:
On the plus side, this means that only 7×32=224 MD5 collisions are needed (instead of 16×32=512), which should make the generation of the GIF more than twice as fast, and the image size smaller (84Ko versus 152Ko, but I also chose to feature my avatar and some text).
However, there is a total of 68 GIF frames instead of 28, so the GIF takes more time to load: 1.34 seconds versus 0.54 seconds.
Now, as you can see when loading the GIF file, a hash of 32 8 characters is first displayed, then each segment needed to be turned off is hidden. This is done by displaying a black square on top. Indeed, if we paint the background white, the final image looks like this:
My guess is that it was easier to do so, because there was no need to handle all 16 possible characters. Instead, only a black square was needed.
Also, the size (in bytes) of the black square (42 bytes) is smaller than my characters (58 to 84 bytes), meaning that it is more likely to fit. Indeed, I needed to consider the case in my code where I don’t have enough space and need to generate an other collision.
Other than that, the method is almost identical: the only difference I noticed is that spq used two sub-block comments or collision alignment and skipping over the collision bytes, whereas I used only one.
For reference, here is an example of a black square skipped over:
And here is another black square that is displayed in the GIF:
Hashquines are fun! Many thanks to Ange Albertini for the challenge, you made me dive into the GIF file format, which I probably wouldn’t have done otherwise.
And of course, well done to spq for creating the first known GIF MD5 hashquine!
There’s a story that young network engineers are sometimes told to help them understand network stacks and/or the OSI model, and it goes something like this:
You overhear a conversation between two scientists on the subject of some topic relevant to thier field of interest. But as you listen more-closely, you realise that the scientists aren’t in the same place at all but are talking to one another over the telephone (presumably on speakerphone, given that you can hear them both, I guess). As you pay more attention still, you realise that it isn’t the scientists on the phone call at all but their translators: each scientist speaks to their translator in the scientist’s own language, and the translators are translating what they say into a neutral language shared with the other translator who translate it into the language spoken by the other scientist. Ultimately, the two scientists are communicating with one another, but they’re doing so via a “stack” at their end which only needs to be conceptually the same as the “stack” at the other end as far up as the step-below-them (the “first link” in their communication, with the translator). Below this point, they’re entrusting the lower protocols (the languages, the telephone system, etc.), in which they have no interest, to handle the nitty-gritty on their behalf.
This kind of delegation to shared intermediary protocols is common in networking and telecommunications. The reason relates to opportunity cost, or – for those of you who are Discworld fans – the Sam Vimes’ “Boots” Theory. Obviously an efficiency could be gained here if all scientists learned a lingua franca, a universal shared second language for their purposes… but most-often, we’re looking for a short-term solution to solve a problem today, and the short-term solution is to find a work-around that fits with what we’ve already got: in the case above, that’s translators who share a common language. For any given pair of people communicating, it’s more-efficient to use a translator, even though solving the global problem might be better accomplished by a universal second language (perhaps Esperanto, for valid if Eurocentric reasons!).
The phenomenon isn’t limited to communications, though. Consider self-driving cars. If you look back to autonomous vehicle designs of the 1950s (because yes, we’ve been talking about how cool self-driving cars would be for a long, long time), they’re distinctly different from the ideas we see today. Futurism of the 1950s focussed on adapting the roads themselves to make them more-suitable for self-driving vehicles, typically by implanting magnets or electronics into the road surface itself or by installing radio beacons alongside highways to allow the car to understand its position and surroundings. The modern approach, on the other hand, sees self-driving cars use LiDAR and/or digital cameras to survey their surroundings and complex computer hardware to interpret the data.
This difference isn’t just a matter of the available technology (although technological developments cetainly inspired the new approach): it’s a fundamentally-different outlook! Early proposals for self-driving cars aimed to overhaul the infrastructure of the road network: a “big solution” on the scale of teaching everybody a shared second language. But nowadays we instead say “let’s leave the roads as they are and teach cars to understand them in the same way that people do.” The “big solution” is too big, too hard, and asking everybody to chip in a little towards outfitting every road with a standardised machine-readable marking is a harder idea to swallow than just asking each person who wants to become an early adopter of self-driving technology to pay a lot to implement a more-complex solution that works on the roads we already have.
This week, Google showed off Duplex, a technology that they claim can perform the same kind of delegated-integration for our existing telephone lives. Let’s ignore for a moment the fact that this is clearly going to be overhyped and focus on the theoretical potential of this technology, which (even if it’s not truly possible today) is probably inevitable as chatbot technology improves: what does this mean for us? Instead of calling up the hairdresser to make an appointment, Google claim, you’ll be able to ask Google Assistant to do it for you. The robot will call the hairdresser and make an appointment on your behalf, presumably being mindful of your availability (which it knows, thanks to your calendar) and travel distance. Effectively, Google Assistant becomes your personal concierge, making all of those boring phone calls so that you don’t have to. Personally, I’d be more than happy to outsource to a computer every time I’ve had to sit in a telephone queue, giving the machine a summary of my query and asking it to start going through a summary of it to the human agent at the other end while I make my way back to the phone. There are obviously ethical considerations here too: I don’t like being hounded by robot callers and so I wouldn’t want to inflict that upon service providers… and I genuinely don’t know if it’s better or worse if they can’t tell whether they’re talking to a machine or not.
But ignoring the technology and the hype and the ethics, there’s still another question that this kind of technology raises for me: what will our society look like when this kind of technology is widely-available? As chatbots become increasingly human-like, smarter, and cheaper, what kinds of ways can we expect to interact with them and with one another? By the time I’m able to ask my digital concierge to order me a pizza (safe in the knowledge that it knows what I like and will ask me if it’s unsure, has my credit card details, and is happy to make decisions about special offers on my behalf where it has a high degree of confidence), we’ll probably already be at a point at which my local takeaway also has a chatbot on-staff, answering queries by Internet and telephone. So in the end, my chatbot will talk to their chatbot… in English… and work it out between the two of them.
Let that sink in for a moment: because we’ve a tendency to solve small problems often rather than big problems rarely and we’ve an affinity for backwards-compatibility, we will probably reach the point within the lifetimes of people alive today that a human might ask a chatbot to call another chatbot: a colossally-inefficient way to exchange information built by installments on that which came before. If you’re still skeptical that the technology could evolve this way, I’d urge you to take a look at how the technologies underpinning the Internet work and you’ll see that this is exactly the kind of evolution we already see in our communications technology: everything gets stacked on top of a popular existing protocol, even if it’s not-quite the right tool for the job, because it makes one fewer problem to solve today.
Hacky solutions on top of hacky solutions work: the most believable thing about Max Headroom’s appearance in Ready Player One (the book, not the film: the latter presumably couldn’t get the rights to the character) as a digital assistant was the versatility of his conversational interface.
By the time we’re talking about a “digital concierge” that knows you better than anyone, there’s no reason that it couldn’t be acting on your behalf in other matters. Perhaps in the future your assistant, imbued with intimate knowledge about your needs and interests and empowered to negotiate on your behalf, will be sent out on virtual “dates” with other people’s assistants! Only if it and the other assistant agree that their owners would probably get along, it’ll suggest that you and the other human meet in the real world. Or you could have your virtual assistant go job-hunting for you, keeping an eye out for positions you might be interested in and applying on your behalf… after contacting the employer to ask the kinds of questions that it anticipates that you’d like to know: about compensation, work/life balance, training and advancement opportunities, or whatever it thinks matter to you.
We quickly find ourselves colliding with ethical questions again, of course: is it okay that those who have access to more-sophisticated digital assistants will have an advantage? Should a robot be required to identify itself as a robot when acting on behalf of a human? I don’t have the answers.
But one thing I think we can say, based on our history of putting hacky solutions atop our existing ways of working and the direction in which digital assistants are headed, is that voice interfaces are going to dominate chatbot development a while… even where the machines end up talking to one another!
While rooting through our attic, Ruth‘s brother Owen just found a mystery cable. It almost certainly belongs to me (virtually all of the cables in the house, especially the unusual ones, do), but this one is a mystery to me.
The more I look at it, the more I feel like I’m slowly going mad, as if the cable is some kind of beast from the Lovecraftian Cable Dimension which mortal minds were not meant to comprehend. It’s got three “ends” and is clearly some kind of signal combining (or separating) cable, but it doesn’t look like anything I’ve ever seen before (and don’t forget, I probably own it).
Every time I look at it I have a new idea of what it could be. Some kind of digital dictophone or radio mic connector? Part of a telephone headset? Weather monitoring hardware? A set of converters between two strange and unusual pieces of hardware? But no matter what I come up with, something doesn’t add up? Why only 6 pins? Why the strange screw-in connector? Why the clicker switch? Why the tie clips? Why “split” the output (let alone have cables of different lengths)?
In case it helps, I’ve made a video of it. You’ll note that I use the word “thingy” more times than might perhaps be justified, but I’ve been puzzling over this one for a while:
Can you help? Can you identify this mystery cable? Prize for the correct answer!
I just can’t get excited about the prospect of building something for any particular operating system, be it desktop or mobile. I think about the potential lifespan of what would be built and end up asking myself “why bother?” If something isn’t on the web—and of the web—I find it hard to get excited about it. I’m somewhat jealous of people who can get equally excited about the web, native, hardware, print …in my mind, if it hasn’t got a URL, it’s missing some vital spark.
I know that this is a problem, but I can’t help it. At the very least, I have enough presence of mind to recognise it as being my problem.
My problem, too. There are worse problems to have.
This technique’s about a decade old, but a lot of people still aren’t using it, and I can’t help but suspect that can only be because they didn’t know about it yet, so let’s revisit:
You have a GMail account, right? Or else Google for Domains? Suppose your email address is email@example.com… did you know that also means that you own:
You have a practically infinite number of GMail addresses. Just put a plus sign (+) after your name but before the @-sign and then type anything you like there, and the email will still reach you. You can also insert as many full stops (.) as you like, anywhere in the first half of your email address, and they’ll still reach you, too. And that’s really, really useful.
When you’re asked to give your email address to a company, don’t give them your email address. Instead, give them a mutated form of your email address that will still work, but that identifies exactly who you gave it to. So for example you might give the email address firstname.lastname@example.org to Amazon, the email address email@example.com to Twitter, and the email address firstname.lastname@example.org to… that other website you have an account on.
Why is this a clever idea? Well, there are a few reasons:
If the company sells your email address to spammers, or hackers steal their database, you’ll know who to blame by the email address they’re sending to. I’ve actually caught out an organisation in this way who were illegally reselling their mailing lists to third parties.
If you start getting unwanted mail from somebody (whether because spammers got the email or because you don’t like what the company is sending to you), you can easily block them. Even if you can’t unsubscribe or just because they make it hard to do so, you can just set up a filter to automatically discard anything that comes to that email address in future.
If you feel like organising your life better, you can set up filters for that, too: it doesn’t matter what address a company sends from, so long as you know what address they’re sending to, so you can easily have filters that e.g. automatically forward copies of the mortgage statement that come to email@example.com to your spouse, or automatically label anything coming to firstname.lastname@example.org with the label “Shopping”.
If you’re signing up just to get a freebie and you don’t trust them not to spam you afterwards, you don’t need to use a throwaway: just receive the goodies from them and them block them at the source.
I know that some people get some of these benefits by maintaining a ‘throwaway’ email address. But it’s far more-convenient to use the email address you already have (you’re already logged-in to it and you use it every day)! And if you ever do want a true ‘throwaway’, you’re generally better using Mailinator: when you’re asked for your email address, just mash the keyboard and then put @mailinator.com on the end, to get e.g. email@example.com. Copy the first half of the email address to the clipboard, and then when you’re done signing up to whatever spammy service it is, just go to mailinator.com and paste into the box to see what they emailed you.
A handful of badly-configured websites won’t accept email addresses with plus signs in them, claiming that they’re invalid (they’re not). Personally, when I come across these I generally just inform the owner of the site of the bug and then take my business elsewhere; that’s how important it is to me to be able to filter my email properly! But another option is to exploit the fact that you can put as many dots in (the first part of) your GMail address as you like. So you could put d…firstname.lastname@example.org in and the email will still reach you, and you can later filter-out emails to that address. I’ll leave it as an exercise for the reader to decide how to encode information about the service you’re signing up to into the pattern and number of dots that you use.
If you’re a web developer and you haven’t come across the Google AMP project yet… then what stone have you been living under? But just in case you have been living under such a stone – or you’re not a web developer – I’ll fill you in. If you believe Google’s elevator pitch, AMP is “…an open-source initiative aiming to make the web better for all… consistently fast, beautiful and high-performing across devices and distribution platforms.”
I believe that AMP is fucking poisonous and that the people who’ve come out against it by saying it’s “controversial” so far don’t go remotely far enough. Let me tell you about why.
Google’s stated plan to favour pages that use AMP creates a publisher’s arms race in which content creators are incentivised to produce content in the (open-source but) Google-controlled AMP format to rank higher in the search results, or at least regain parity, versus their competitors. Ultimately, if everybody supported AMP then – ignoring the speed benefits for mobile users (more on that in a moment) – the only winner is Google. Google, who would then have a walled garden of Facebook-beating proportions around the web. Once Google delivers all of your content, there’s no such thing as a free and open Internet any more.
We need to reject AMP, and we need to reject it hard. Right now, it might be sufficient to stand up to your boss and say “no, implementing AMP on our sites is a bad idea.” But one day, it might mean avoiding the use of AMP entirely (there’ll be browser plugins to help you, don’t worry). And if it means putting up with a slightly-slower mobile web while web developers remain lazy, so be it: that’s a sacrifice I’m willing to make to help keep our web free and open. And I hope you will be, too.
Like others, I’m just hoping that Sir Tim will feel the urge to say something about this development soon.
Maybe it’s because I was at Render Conf at the end of last month or perhaps it’s because Three Rings DevCamp – which always gets me inspired – was earlier this month, but I’ve been particularly excited lately to get the chance to play with some of the more “cutting edge” (or at least, relatively-new) web technologies that are appearing on the horizon. It feels like the Web is having a bit of a renaissance of development, spearheaded by the fact that it’s no longer Microsoft that are holding development back (but increasingly Apple) and, perhaps for the first time, the fact that the W3C are churning out standards “ahead” of where the browser vendors are managing to implement technical features, rather than simply reflecting what’s already happening in the world.
It seems to me that HTML5 may well be the final version of HTML. Rather than making grand new releases to the core technology, we’re now – at last! – in a position where it’s possible to iteratively add new techniques in a resilient, progressive manner. We don’t need “HTML6” to deliver us any particular new feature, because the modern web is more-modular and is capable of having additional features bolted on. We’re in a world where browser detection has been replaced with feature detection, to the extent that you can even do non-hacky feature detection in pure CSS, now, and this (thanks to the nature of the Web as a loosely-coupled, resilient platform) means that it’s genuinely possible to progressively-enhance content and get on board with each hot new technology that comes along, if you want, while still delivering content to users on older browsers.
Some of the things I’ve been playing with recently include:
Only really supported in Chrome, but there’s a great polyfill, the Intersection Observer API is one of those technologies that make you say “why didn’t we have that already?” It’s very simple: all an Intersection Observer does is to provide event hooks for target objects entering or leaving the viewport, without resorting to polling or hacky code on scroll event captures.
I’d briefly played with Service Workers before and indeed we’re adding a Service Worker to the next version of Three Rings, which, in conjunction with a manifest.json and the service’s (ongoing) delivery over HTTPS (over H2, where available, since last year), technically makes it a Progressive Web App… and I’ve been looking for opportunities to make use of Service Workers elsewhere in my work, too… but my first dive in to Web Workers was in introducing one to the next upcoming version of MegaMegaMonitor.
The final of the new technologies I’ve been playing with this month is the Fetch API. I’m not pulling any punches when I say that the Fetch API is exactly what XMLHttpRequests should have been from the very beginning. Understanding them properly has finally given me the confidence to stop using jQuery for the one thing for which I always seemed to have had to depend on it for – that is, simplifying Ajax requests! I mean, look at this elegant code:
I have a vivid, recurring dream. I climb the stairs in my parents’ house to see my old bedroom. In the back corner, I hear a faint humming.
It’s my old computer, still running my 1990s-era bulletin board system (BBS, for short), “The Cave.” I thought I had shut it down ages ago, but it’s been chugging away this whole time without me realizing it—people continued calling my BBS to play games, post messages, and upload files. To my astonishment, it never shut down after all…