Milk and Mail Notifications with Flic 2 Buttons

I’ve been playing with a Flic Hub LR and some Flic 2 buttons. They’re “smart home” buttons, but for me they’ve got a killer selling point: rather than locking you in to any particular cloud provider (although you can do this if you want), you can directly program the hub. This means you can produce smart integrations that run completely within the walls of your house.

Here’s some things I’ve been building:

Prerequisite: Flic Hub to Huginn connection

Screenshot showing the location of the enabled "Hub SDK web access open" setting in the Flic Hub settings page of the Flic app.
Step 1. Enable SDK access. Check!

I run a Huginn instance on our household NAS. If you’ve not come across it before, Huginn is a bit like an open-source IFTTT: it’s got a steep learning curve, but it’s incredibly powerful for automation tasks. The first step, then, was to set up my Flic Hub LR to talk to Huginn.

Screenshot showing the Flic Hub SDK open in Firefox. Three modules are loaded: "IR Recorder", "UDP to IR Blaster", and "The Green", the latter of which is open. "The Green" shows JavaScript code to listen for 'buttonSingleOrDoubleClickOrHold' events then transmits them as HTTP POST requests to a 'webHook' URL.
Checking ‘Restart after crash’ seems to help ensure that the script re-launches after e.g. a power cut. Need the script?

This was pretty simple: all I had to do was switch on “Hub SDK web access open” for the hub using the Flic app, then use the the web SDK to add this script to the hub. Now whenever a button was clicked, double-clicked, or held down, my Huginn installation would receive a webhook ping.

Flow chart showing a Flic 2 button sending a Bluetooth 5 LE message to a Flic Hub LR, which sends a Webook notification to Huginn (depicted as a raven wearing a headset), which sends a message to an unidentified Internet Of Things device, "probably" over HTTPS.
Depending on what you have Huginn do “next”, this kind of set-up works completely independently of “the cloud”. (Your raven can fly into the clouds if you really want.)

For convenience, I have all button-presses sent to the same Webhook, and use Trigger Agents to differentiate between buttons and press-types. This means I can re-use functionality within Huginn, e.g. having both a button press and some other input trigger a particular action.

You’ve Got Mail!

By our front door, we have “in trays” for each of Ruth, JTA and I, as well as one for the bits of Three Rings‘ post that come to our house. Sometimes post sits in the in-trays for a long time because people don’t think to check them, or don’t know that something new’s been added.

I configured Huginn with a Trigger Agent to receive events from my webhook and filter down to just single clicks on specific buttons. The events emitted by these triggers are used to notify in-tray owners.

Annotated screenshot showing a Huginn Trigger Agent called "Flic Button C (Double) Details". Annotations show that: (1) "C" is the button name and that I label my buttons with letters. (2) "Double" is the kind of click I'm filtering for. (3) The event source for the trigger is a webhook called "Flic Buttons" whose URL I gave to my Flic Hub. (4) The event receiver for my Trigger Agent is called "Dan's In-Tray (Double) to Slack", which is a Slack Agent, but could easily be something more-sophisticated. (5) The first filter rule uses path: bdaddr, type: field==value, and a value equal to the MAC address of the button; this filters to events from only the specified button. (6) The second filter rule uses path: isDoubleClick, type: field==value, and value: true; this filters to events of type isDoubleClick only and not of types isSingleClick or isHold.
Once you’ve made three events for your first button, you can copy-paste from then on.

In my case, I’ve got pings being sent to mail recipients via Slack, but I could equally well be integrating to other (or additional) endpoints or even performing some conditional logic: e.g. if it’s during normal waking hours, send a Pushbullet notification to the recipient’s phone, otherwise send a message to an Arduino to turn on an LED strip along the top of the recipient’s in-tray.

I’m keeping it simple for now. I track three kinds of events (click = “post in your in-tray”, double-click = “I’ve cleared my in-tray”, hold = “parcel wouldn’t fit in your in-tray: look elsewhere for it”) and don’t do anything smarter than send notifications. But I think it’d be interesting to e.g. have a counter running so I could get a daily reminder (“There are 4 items in your in-tray.”) if I don’t touch them for a while, or something?

Remember the Milk!

Following the same principle, and with the hope that the Flic buttons are weatherproof enough to work in a covered outdoor area, I’ve fitted one… to the top of the box our milkman delivers our milk into!

Top of a reinforced polystyrene doorstep milk storage box, showing the round-topped handle. A metal file sits atop the box, about to be used to file down the handle.
The handle on the box was almost exactly the right size to stick a Flic button to! But it wasn’t flat enough until I took a file to it.

Most mornings, our milkman arrives by 7am, three times a week. But some mornings he’s later – sometimes as late as 10:30am, in extreme cases. If he comes during the school run the milk often gets forgotten until much later in the day, and with the current weather that puts it at risk of spoiling. Ironically, the box we use to help keep the milk cooler for longer on the doorstep works against us because it makes the freshly-delivered bottles less-visible.

Milk container, with a Flic 2 button attached to the handle of the lid and a laminated notice attached, reading: "Left milk? Press the button on the Milk Minder. It'll remind us to bring in the milk!"
Now that I had the technical infrastructure already in place, honestly the hardest part of this project was matching the font used in Milk & More‘s logo.

I’m yet to see if the milkman will play along and press the button when he drops off the milk, but if he does: we’re set! A second possible bonus is that the kids love doing anything that allows them to press a button at the end of it, so I’m optimistic they’ll be more-willing to add “bring in the milk” to their chore lists if they get to double-click the button to say it’s been done!

Future Plans

I’m still playing with ideas for the next round of buttons. Could I set something up to streamline my work status, so my colleagues know when I’m not to be disturbed, away from my desk, or similar? Is there anything I can do to simplify online tabletop roleplaying games, e.g. by giving myself a desktop “next combat turn” button?

Flic Infared Transceiver on the side of a bookcase, alongside an (only slighter smaller than it) 20p piece, for scale.
My Flic Hub is mounted behind a bookshelf in the living room, with only its infrared transceiver exposed. 20p for scale: we don’t keep a 20p piece stuck to the side of the bookcase all the time.

I’m quite excited by the fact that the Flic Hub can interact with an infrared transceiver, allowing it to control televisions and similar devices: I’d love to be able to use the volume controls on our media centre PC’s keyboard to control our TV’s soundbar: and because the Flic Hub can listen for UDP packets, I’m hopeful that something as simple as AutoHotkey can make this possible.

Or perhaps I could make a “universal remote” for our house, accessible as a mobile web app on our internal Intranet, for those occasions when you can’t even be bothered to stand up to pick up the remote from the other sofa. Or something that switched the TV back to the media centre’s AV input when consoles were powered-down, detected by their network activity? (Right now the TV automatically switches to the consoles when they’re powered-on, but not back again afterwards, and it bugs me!)

It feels like the only limit with these buttons is my imagination, and that’s awesome.

Quickly Solving Jigidi Puzzles

tl;dr? Just want instructions on how to solve Jigidi puzzles really fast with the help of your browser’s dev tools? Skip to that bit.

I don’t enjoy jigsaw puzzles

I enjoy geocaching. I don’t enjoy jigsaw puzzles. So mystery caches that require you to solve an online jigsaw puzzle in order to get the coordinates really don’t do it for me. When I’m geocaching I want to be outdoors exploring, not sitting at my computer gradually dragging pixels around!

A completed 1000-piece "Where's Wally?" jigsaw.
Don’t let anybody use my completion of this 1000-piece jigsaw puzzle over New Year as evidence that I’m lying and actually like jigsaws.

Many of these mystery caches use Jigidi to host these jigsaw puzzles. An earlier version of Jigidi was auto-solvable with a userscript, but the service has continued to be developed and evolve and the current version works quite hard to make it hard for simple scripts to solve. For example, it uses a WebSocket connection to telegraph back to the server how pieces are moved around and connected to one another and the server only releases the secret “you’ve solved it” message after it detects that the pieces have been arranged in the appropriate relative configuration.

A nine-piece jigsaw puzzle with the pieces numbered 1 through 9; only the ninth piece is detached.
I made a simple Jigidi puzzle for demonstration purposes. Do you think you can manage a nine-piece jigsaw?

If there’s one thing I enjoy more than jigsaw puzzles – and as previously established there are about a billion things I enjoy more than jigsaw puzzles – it’s reverse-engineering a computer system to exploit its weaknesses. So I took a dive into Jigidi’s client-side source code. Here’s what it does:

  1. Get from the server the completed image and the dimensions (number of pieces).
  2. Cut the image up into the appropriate number of pieces.
  3. Shuffle the pieces.
  4. Establish a WebSocket connection to keep the server up-to-date with the relative position of the pieces.
  5. Start the game: the player can drag-and-drop pieces and if two adjacent pieces can be connected they lock together. Both pieces have to be mostly-visible (not buried under other pieces), presumably to prevent players from just making a stack and then holding a piece against each edge of it to “fish” for its adjacent partners.
Javascirpt code where the truthiness of this.j affects whether or not the pieces are shuffled.
I spent some time tracing call stacks to find this line… only to discover that it’s one of only four lines to actually contain the word “shuffle” and I could have just searched for it…

Looking at that process, there’s an obvious weak point – the shuffling (point 3) happens client-side, and before the WebSocket sync begins. We could override the shuffling function to lay the pieces out in a grid, but we’d still have to click each of them in turn to trigger the connection. Or we could skip the shuffling entirely and just leave the pieces in their default positions.

An unshuffled stack of pieces from the nine-piece jigsaw. Piece number nine is on top of the stack.
An unshuffled jigsaw appears as a stack, as if each piece from left to right and then top to bottom were placed one at a time into a pile.

And what are the default positions? It’s a stack with the bottom-right jigsaw piece on the top, the piece to the left of it below it, then the piece to the left of that and son on through the first row… then the rightmost piece from the second-to-bottom row, then the piece to the left of that, and so on.

That’s… a pretty convenient order if you want to solve a jigsaw. All you have to do is drag the top piece to the right to join it to the piece below that. Then move those two to the right to join to the piece below them. And so on through the bottom row before moving back – like a typewriter’s carriage return – to collect the second-to-bottom row and so on.

How can I do this?

If you’d like to cheat at Jigidi jigsaws, this approach works as of the time of writing. I used Firefox, but the same basic approach should work with virtually any modern desktop web browser.

  1. Go to a Jigidi jigsaw in your web browser.
  2. Pop up your browser’s developer tools (F12, usually) and switch to the Debugger tab. Open the file game/js/release.js and uncompress it by pressing the {} button, if necessary.
  3. Find the line where the code considers shuffling; right now for me it’s like 3671 and looks like this:
    return this.j ? (V.info('board-data-bytes already exists, no need to send SHUFFLE'), Promise.resolve(this.j)) : new Promise(function (d, e) {

    Javascirpt code where the truthiness of this.j affects whether or not the pieces are shuffled.
    I spent some time tracing call stacks to find this line… only to discover that it’s one of only four lines to actually contain the word “shuffle” and I could have just searched for it…
  4. Set a breakpoint on that line by clicking its line number.
  5. Restart the puzzle by clicking the restart button to the right of the timer. The puzzle will reload but then stop with a “Paused on breakpoint” message. At this point the application is considering whether or not to shuffle the pieces, which normally depends on whether you’ve started the puzzle for the first time or you’re continuing a saved puzzle from where you left off.
    Paused on breakpoint dialog with play button.
  6. In the developer tools, switch to the Console tab.
  7. Type: this.j = true (this ensures that the ternary operation we set the breakpoint on will resolve to the true condition, i.e. not shuffle the pieces).
    this.j = true
  8. Press the play button to continue running the code from the breakpoint. You can now close the developer tools if you like.
  9. Solve the puzzle as described/shown above, by moving the top piece on the stack slightly to the right, repeatedly, and then down and left at the end of each full row.
    Jigsaw being solved by moving down-and-right.

Update 2021-09-22: Abraxas observes that Jigidi have changed their code, possibly in response to this shortcut. Unfortunately for them, while they continue to perform shuffling on the client-side they’ll always be vulnerable to this kind of simple exploit. Their new code seems to be named not release.js but given a version number; right now it’s 14.3.1977. You can still expand it in the same way, and find the shuffling code: right now for me this starts on line 1129:

Put a breakpoint on line 1129. This code gets called twice, so the first time the breakpoint gets hit just hit continue and play on until the second time. The second time it gets hit, move the breakpoint to line 1130 and press continue. Then use the console to enter the code d = a.G and continue. Only one piece of jigsaw will be shuffled; the rest will be arranged in a neat stack like before (I’m sure you can work out where the one piece goes when you get to it).

Getting Twitter Avatars (without the Twitter API)

Among Twitter’s growing list of faults over the years are various examples of its increasing divergence from open Web standards and developer-friendly endpoints. Do you remember when you used to be able to subscribe to somebody’s feed by RSS? When you could see who follows somebody without first logging in? When they were still committed to progressive enhancement and didn’t make your browser download ~5MB of Javascript or else not show any content whatsoever? Feels like a long time ago, now.

Lighthouse Performance score for Twitter's Twitter account page on mobile, scoring 50%.
For one of the most-popular 50 websites in the world, this score is frankly shameful.

But those complaints aside, the thing that bugged me most this week was how much harder they’ve made it to programatically get access to things that are publicly accessible via web pages. Like avatars, for example!

If you’re a human and you want to see the avatar image associated with a given username, you can go to twitter.com/that-username and – after you’ve waited a bit for all of the mandatory JavaScript to download and run (I hope you’re not on a metered connection!) – you’ll see a picture of the user, assuming they’ve uploaded one and not made their profile private. Easy.

If you’re a computer and you want to get the avatar image, it used to be just as easy; just go to twitter.com/api/users/profile_image/that-username and you’d get the image. This was great if you wanted to e.g. show a Facebook-style facepile of images of people who’d retweeted your content.

But then Twitter removed that endpoint and required that computers log in to Twitter, so a clever developer made a service that fetched avatars for you if you went to e.g. twivatar.glitch.com/that-username.

But then Twitter killed that, too. Because despite what they claimed 5½ years ago, Twitter still clearly hates developers.

Dan Q's Twitter profile header showing his avatar image.
You want to that image? Well you’ll need a Twitter account, a developer account, an OAuth token set, a stack of code…

Recently, I needed a one-off program to get the avatars associated with a few dozen Twitter usernames.

First, I tried the easy way: find a service that does the work for me. I’d used avatars.io before but it’s died, presumably because (as I soon discovered) Twitter had made things unnecessarily hard for them.

Second, I started looking at the Twitter API documentation but it took me in the region of 30-60 seconds before I said “fuck that noise” and decided that the set-up overhead in doing things the official way simply wasn’t justified for my simple use case.

So I decided to just screen-scrape around the problem. If a human can just go to the web page and see the image, a computer pretending to be a human can do exactly the same. Let’s do this:

/* Copyright (c) 2021 Dan Q; released under the MIT License. */

const Puppeteer = require('puppeteer');

getAvatar = async (twitterUsername) => {
  const browser = await Puppeteer.launch({args: ['--no-sandbox', '--disable-setuid-sandbox']});
  const page = await browser.newPage();
  await page.goto(`https://twitter.com/${twitterUsername}`);
  await page.waitForSelector('a[href$="/photo"] img[src]');
  const url = await page.evaluate(()=>document.querySelector('a[href$="/photo"] img').src);
  await browser.close();
  console.log(`${twitterUsername}: ${url}`);
};

process.argv.slice(2).forEach( twitterUsername => getAvatar( twitterUsername.toLowerCase() ) );
The code is ludicrously simple. It took less time, energy, and code to write this than to follow Twitter’s “approved” procedure. You can download the code via Gist.

Obviously, using this code would violate Twitter’s terms of use for automation, so… don’t, I guess?

Given that I only needed to run it once, on a finite list of accounts, I maintain that my approach was probably kinder on their servers than just manually going to every page and saving the avatar from it. But if you set up a service that uses this approach then you’ll certainly piss off somebody at Twitter and history shows that they’ll take their displeasure out on you without warning.

$ node get-twitter-avatar.js alexsdutton richove geohashing TailsteakAD LilFierce1 ninjanails
alexsdutton: https://pbs.twimg.com/profile_images/740505937039986688/F9gUV0eK_200x200.jpg
lilfierce1: https://pbs.twimg.com/profile_images/1189417235313561600/AZ2eLjAg_200x200.jpg
richove: https://pbs.twimg.com/profile_images/1576438972/2011_My_picture4_200x200.jpeg
geohashing: https://pbs.twimg.com/profile_images/877137707939581952/POzWWV2d_200x200.jpg
ninjanails: https://pbs.twimg.com/profile_images/1146364466801577985/TvCfb49a_200x200.jpg
tailsteakad: https://pbs.twimg.com/profile_images/1118738807019278337/y5WWkLbF_200x200.jpg
This output shows the avatar URLs of a half a dozen Twitter accounts. It took minutes to write the code and takes seconds to run, but if I’d have done it the “right” way I’d still be unnecessarily wading through Twitter’s sprawling documentation.

But it works. It was fast and easy and I got what I was looking for.

And the moral of the story is: if you make an API and it’s terrible, don’t be surprised if people screen-scape your service instead. (You can’t spell “scraping” without “API”, amirite?)

Exploiting vulnerabilities in Cellebrite UFED and Physical Analyzer from an app’s perspective

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Cellebrite makes software to automate physically extracting and indexing data from mobile devices. They exist within the grey – where enterprise branding joins together with the larcenous to be called “digital intelligence.” Their customer list has included authoritarian regimes in Belarus, Russia, Venezuela, and China; death squads in Bangladesh; military juntas in Myanmar; and those seeking to abuse and oppress in Turkey, UAE, and elsewhere. A few months ago, they announced that they added Signal support to their software.

Their products have often been linked to the persecution of imprisoned journalists and activists around the world, but less has been written about what their software actually does or how it works. Let’s take a closer look. In particular, their software is often associated with bypassing security, so let’s take some time to examine the security of their own software.

Moxie Marlinspike (Signal)

Recently Moxie, co-author of the Signal Protocol, came into possession of a Cellebrite Extraction Device (phone cracking kit used by law enforcement as well as by oppressive regimes who need to clamp down on dissidents) which “fell off a truck” near him. What an amazing coincidence! He went on to report, this week, that he’d partially reverse-engineered the system, discovering copyrighted code from Apple – that’ll go down well! – and, more-interestingly, unpatched vulnerabilities. In a demonstration video, he goes on to show that a carefully crafted file placed on a phone could, if attacked using a Cellebrite device, exploit these vulnerabilities to take over the forensics equipment.

Obviously this is a Bad Thing if you’re depending on that forensics kit! Not only are you now unable to demonstrate that the evidence you’re collecting is complete and accurate, because it potentially isn’t, but you’ve also got to treat your equipment as untrustworthy. This basically makes any evidence you’ve collected inadmissible in many courts.

Moxie goes on to announce a completely unrelated upcoming feature for Signal: a minority of functionally-random installations will create carefully-crafted files on their devices’ filesystem. You know, just to sit there and look pretty. No other reason:

In completely unrelated news, upcoming versions of Signal will be periodically fetching files to place in app storage. These files are never used for anything inside Signal and never interact with Signal software or data, but they look nice, and aesthetics are important in software. Files will only be returned for accounts that have been active installs for some time already, and only probabilistically in low percentages based on phone number sharding. We have a few different versions of files that we think are aesthetically pleasing, and will iterate through those slowly over time. There is no other significance to these files.

That’s just beautiful.

Big List of Naughty Strings

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

# Reserved Strings
#
# Strings which may be used elsewhere in code
undefined
undef
null
NULL

then
constructor
\
\\

# Numeric Strings
#
# Strings which can be interpreted as numeric
0
1
1.00
$1.00
1/2
1E2

Max Woolf

Max has produced a list of “naughty strings”: things you might try injecting into your systems along with any fuzz testing you’re doing to check for common errors in escaping, processing, casting, interpreting, parsing, etc. The copy above is heavily truncated: the list is long!

It’s got a lot of the things in it that you’d expect to find: reserved keywords and filenames, unusual or invalid unicode codepoints, tests for the Scunthorpe Problem, and so on. But perhaps my favourite entry is this one, a test for “human injection”:

# Human injection
#
# Strings which may cause human to reinterpret worldview
If you're reading this, you've been in a coma for almost 20 years now. We're trying a new technique. We don't know where this message will end up in your dream, but we hope it works. Please wake up, we miss you.

Beautiful.

Downloading a YouTube Music Playlist for Offline Play

Now that Google Play Music has been replaced by YouTube Music, and inspired by the lampshading the RIAA did recently with youtube-dl, a friend asked me: “So does this mean I could download music from my Google Play Music/YouTube Music playlists?”

A Creative MuVo MP3 player (and FM radio), powered off, on a white surface.
My friend still uses a seriously retro digital music player, rather than his phone, to listen to music. It’s not a Walkman or a Minidisc player, I suppose, but it’s still pretty elderly. But it’s not one of these.

I’m not here to speak about the legality of retaining offline copies of music from streaming services. YouTube Music seems to permit you to do this using their app, but I’ll bet there’s something in their terms and conditions that specifically prohibits doing so any other way. Not least because Google’s arrangement with rights holders probably stipulates that they track how many times tracks are played, and using a different player (like my friend’s portable device) would throw that off.

But what I’m interested in is the feasibility. And in answering that question, in explaining how to work out that it’s feasible.

A "Your likes" playlist in the YouTube Music interface, with 10 songs showing.
The web interface to YouTube Music shows playlists of songs and streaming is just a click away.

Spoiler: I came up with an approach, and it looks like it works. My friend can fill up their Zune or whatever the hell it is with their tunes and bop away. But what I wanted to share with you was the underlying technique I used to develop this approach, because it involves skills that as a web developer I use most weeks. Hold on tight, you might learn something!

youtube-dl can download “playlists” already, but to download a personal playlist requires that you faff about with authentication and it’s a bit of a drag. Just extracting the relevant metadata from the page is probably faster, I figured: plus, it’s a valuable lesson in extracting data from web pages in general.

Here’s what I did:

Step 1. Load all the data

I noticed that YouTube Music playlists “lazy load”, and you have to scroll down to see everything. So I scrolled to the bottom of the page until I reached the end of the playlist: now everything was in the DOM, I could investigate it with my inspector.

Step 2. Find each track’s “row”

Using my browser’s debugger “inspect” tool, I found the highest unique-sounding element that seemed to represent each “row”/track. After a little investigation, it looked like a playlist always consists of a series of <ytmusic-responsive-list-item-renderer> elements wrapped in a <ytmusic-playlist-shelf-renderer>. I tested this by running document.querySelectorAll('ytmusic-playlist-shelf-renderer ytmusic-responsive-list-item-renderer') in my debug console and sure enough, it returned a number of elements equal to the length of the playlist, and hovering over each one in the debugger highlighted a different track in the list.

A browser debugger inspecting a "row" in a YouTube Music playlist. The selected row is "Baba Yeta" by Peter Hollens and Malukah, and has the element name "ytmusic-responsive-list-item-renderer" shown by the debugger.
The web application captured right-clicks, preventing the common right-click-then-inspect-element approach… so I just clicked the “pick an element” button in the debugger.

Step 3. Find the data for each track

I didn’t want to spend much time on this, so I looked for a quick and dirty solution: and there was one right in front of me. Looking at each track, I saw that it contained several <yt-formatted-string> elements (at different depths). The first corresponded to the title, the second to the artist, the third to the album title, and the fourth to the duration.

Better yet, the first contained an <a> element whose href was the URL of the piece of music. Extracting the URL and the text was as simple as a .querySelector('a').href on the first <yt-formatted-string> and a .innerText on the others, respectively, so I ran [...document.querySelectorAll('ytmusic-playlist-shelf-renderer ytmusic-responsive-list-item-renderer')].map(row=>row.querySelectorAll('yt-formatted-string')).map(track=>[track[0].querySelector('a').href, `${track[1].innerText} - ${track[0].innerText}`]) (note the use of [...*] to get an array) to check that I was able to get all the data I needed:

Debug console running on YouTube Music. The output shows an array of 256 items; items 200 through 212 are visible. Each item is an array containing a YouTube Music URL and a string showing the artist and track name separated by a hyphen.
Lots of URLs and the corresponding track names in my friend’s preferred format (me, I like to separate my music into folders by album, but I suppose I’ve got a music player with more than a floppy disk’s worth of space on it).

Step 4. Sanitise the data

We’re not quite good-to-go, because there’s some noise in the data. Sometimes the application’s renderer injects line feeds into the innerText (e.g. when escaping an ampersand). And of course some of these song titles aren’t suitable for use as filenames, if they’ve got e.g. question marks in them. Finally, where there are multiple spaces in a row it’d be good to coalesce them into one. I do some experiments and decide that .replace(/[\r\n]/g, '').replace(/[\\\/:><\*\?]/g, '-').replace(/\s{2,}/g, ' ') does a good job of cleaning up the song titles so they’re suitable for use as filenames.

I probably should have it fix quotes too, but I’ll leave that as an exercise for the reader.

Step 5. Produce youtube-dl commands

Okay: now we’re ready to combine all of that output into commands suitable for running at a terminal. After a quick dig through the documentation, I decide that we needed the following switches:

  • -x to download/extract audio only: it defaults to the highest quality format available, which seems reasomable
  • -o "the filename.%(ext)s" to specify the output filename but accept the format provided by the quality requirement (transcoding to your preferred format is a separate job not described here)
  • --no-playlist to ensure that youtube-dl doesn’t see that we’re coming from a playlist and try to download it all (we have our own requirements of each song’s filename)
  • --download-archive downloaded.txt to log what’s been downloaded already so successive runs don’t re-download and the script is “resumable”

The final resulting code, then, looks like this:

console.log([...document.querySelectorAll('ytmusic-playlist-shelf-renderer ytmusic-responsive-list-item-renderer')].map(row=>row.querySelectorAll('yt-formatted-string')).map(track=>[track[0].querySelector('a').href, `${track[1].innerText} - ${track[0].innerText}`.replace(/[\r\n]/g, '').replace(/[\\\/:><\*\?]/g, '-').replace(/\s{2,}/g, ' ')]).map(trackdata=>`youtube-dl -x "${trackdata[0]}" -o "${trackdata[1]}.%(ext)s" --no-playlist --download-archive downloaded.txt`).join("\n"));

Code running in a debugger and producing a list of youtube-dl commands to download a playlist full of music.
The output isn’t pretty, but it’s suitable for copy-pasting into a terminal or command prompt where it ought to download a whole lot of music for offline play.

This isn’t an approach that most people will ever need: part of the value of services like YouTube Music, Spotify and the like is that you pay a fixed fee to stream whatever you like, wherever you like, obviating the need for a large offline music collection. And people who want to maintain a traditional music collection offline are most-likely to want to do so while supporting the bands they care about, especially as (with DRM-free digital downloads commonplace) it’s never been easier to do so.

But for those minority of people who need to play music from their streaming services offline but don’t have or can’t use a device suitable for doing so on-the-go, this kind of approach works. (Although again: it’s probably not permitted, so be sure to read the rules before you use it in such a way!)

Step 6. Learn something

But more-importantly, the techniques of exploring and writing console Javascript demonstrated are really useful for extracting all kinds of data from web pages (data scraping), writing your own userscripts, and much more. If there’s one lesson to take from this blog post it’s not that you can steal music on the Internet (I’m pretty sure everybody who’s lived on this side of 1999 knows that by now), but that you can manipulate the web pages you see. Once you’re viewing it on your computer, a web page works for you: you don’t have to consume a page in the way that the author expected, and knowing how to extract the underlying information empowers you to choose for yourself a more-streamlined, more-personalised, more-powerful web.

When you browse Instagram and find former Australian Prime Minister Tony Abbott’s passport number

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Everything you see when you use “Inspect Element” was already downloaded to your computer, you just hadn’t asked Chrome to show it to you yet. Just like how the cogs were already in the watch, you just hadn’t opened it up to look.

But let us dispense with frivolous cog talk. Cheap tricks such as “Inspect Element” are used by programmers to try and understand how the website works. This is ultimately futile: Nobody can understand how websites work. Unfortunately, it kinda looks like hacking the first time you see it.

Hilarious longread.

The Perfect Art Heist: Hack the Money, Leave the Painting

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Thieves didn’t even bother with a London art gallery’s Constable landscape—and they still walked away with $3 million.

This comic is perhaps the best way to enjoy this news story, which describes the theft of £2.4 million during an unusual… let’s call it an “art heist”… in 2018. It has many the characteristics of the kind of heist you’re thinking about: the bad guys got the money, and nobody gets to see the art. But there’s a twist: the criminals never came anywhere near the painting.

A View from Hampstead Heath, ca. 1825, by John Constable

This theft was committed entirely in cyberspace: the victim was tricked into wiring the money to pay for the painting into the wrong account. The art buyer claims that he made the payment in good faith, though, and that he’s not culpable because it was the seller’s email that must have been hacked. Until it’s resolved, the painting’s not on display, so not only do the criminals have the cash, the painting isn’t on display.

Anyway; go read the comic if you haven’t already.

Evolving Computer Words: “Hacker”

This is part of a series of posts on computer terminology whose popular meaning – determined by surveying my friends – has significantly diverged from its original/technical one. Read more evolving words…

Anticipatory note: based on the traffic I already get to my blog and the keywords people search for, I imagine that some people will end up here looking to learn “how to become a hacker”. If that’s your goal, you’re probably already asking the wrong question, but I direct you to Eric S. Raymond’s Guide/FAQ on the subject. Good luck.

Few words have seen such mutation of meaning over their lifetimes as the word “silly”. The earliest references, found in Old English, Proto-Germanic, and Old Norse and presumably having an original root even earlier, meant “happy”. By the end of the 12th century it meant “pious”; by the end of the 13th, “pitiable” or “weak”; only by the late 16th coming to mean “foolish”; its evolution continues in the present day.

Right, stop that! It's too silly.
The Monty Python crew were certainly the experts on the contemporary use of the word.

But there’s little so silly as the media-driven evolution of the word “hacker” into something that’s at least a little offensive those of us who probably would be described as hackers. Let’s take a look.

Hacker

What people think it means

Computer criminal with access to either knowledge or tools which are (or should be) illegal.

What it originally meant

Expert, creative computer programmer; often politically inclined towards information transparency, egalitarianism, anti-authoritarianism, anarchy, and/or decentralisation of power.

The Past

The earliest recorded uses of the word “hack” had a meaning that is unchanged to this day: to chop or cut, as you might describe hacking down an unruly bramble. There are clear links between this and the contemporary definition, “to plod away at a repetitive task”. However, it’s less certain how the word came to be associated with the meaning it would come to take on in the computer labs of 1960s university campuses (the earliest references seem to come from around April 1955).

There, the word hacker came to describe computer experts who were developing a culture of:

  • sharing computer resources and code (even to the extent, in extreme cases, breaking into systems to establish more equal opportunity of access),
  • learning everything possible about humankind’s new digital frontiers (hacking to learn, not learning to hack)
  • judging others only by their contributions and not by their claims or credentials, and
  • discovering and advancing the limits of computers: it’s been said that the difference between a non-hacker and a hacker is that a non-hacker asks of a new gadget “what does it do?”, while a hacker asks “what can I make it do?”
Venn-Euler-style diagram showing crackers as a subset of security hackers, who in turn are a subset of hackers. Script kiddies are a group of their own, off to the side where nobody has to talk to them (this is probably for the best).
What the media generally refers to as “hackers” would be more-accurately, within the hacker community, be called crackers; a subset of security hackers, in turn a subset of hackers as a whole. Script kiddies – people who use hacking tools exclusively for mischief without fully understanding what they’re doing – are a separate subset on their own.

It is absolutely possible for hacking, then, to involve no lawbreaking whatsoever. Plenty of hacking involves writing (and sharing) code, reverse-engineering technology and systems you own or to which you have legitimate access, and pushing the boundaries of what’s possible in terms of software, art, and human-computer interaction. Even among hackers with a specific interest in computer security, there’s plenty of scope for the legal pursuit of their interests: penetration testing, security research, defensive security, auditing, vulnerability assessment, developer education… (I didn’t say cyberwarfare because 90% of its application is of questionable legality, but it is of course a big growth area.)

Getty Images search for "Hacker".
Hackers have a serious image problem, and the best way to see it is to search on your favourite stock photo site for “hacker”. If you don’t use a laptop in a darkened room, wearing a hoodie and optionally mask and gloves, you’re not a real hacker. Also, 50% of all text should be green, 40% blue, 10% red.

So what changed? Hackers got famous, and not for the best reasons. A big tipping point came in the early 1980s when hacking group The 414s broke into a number of high-profile computer systems, mostly by using the default password which had never been changed. The six teenagers responsible were arrested by the FBI but few were charged, and those that were were charged only with minor offences. This was at least in part because there weren’t yet solid laws under which to prosecute them but also because they were cooperative, apologetic, and for the most part hadn’t caused any real harm. Mostly they’d just been curious about what they could get access to, and were interested in exploring the systems to which they’d logged-in, and seeing how long they could remain there undetected. These remain common motivations for many hackers to this day.

"Hacker" Dan Q
Hoodie: check. Face-concealing mask: check. Green/blue code: check. Is I a l33t hacker yet?

News media though – after being excited by “hacker” ideas introduced by WarGames – rightly realised that a hacker with the same elementary resources as these teens but with malicious intent could cause significant real-world damage. Bruce Schneier argued last year that the danger of this may be higher today than ever before. The press ran news stories strongly associating the word “hacker” specifically with the focus on the illegal activities in which some hackers engage. The release of Neuromancer the following year, coupled with an increasing awareness of and organisation by hacker groups and a number of arrests on both sides of the Atlantic only fuelled things further. By the end of the decade it was essentially impossible for a layperson to see the word “hacker” in anything other than a negative light. Counter-arguments like The Conscience of a Hacker (Hacker’s Manifesto) didn’t reach remotely the same audiences: and even if they had, the points they made remain hard to sympathise with for those outside of hacker communities.

"Glider" Hacker Emblem
‘Nuff said.

A lack of understanding about what hackers did and what motivated them made them seem mysterious and otherworldly. People came to make the same assumptions about hackers that they do about magicians – that their abilities are the result of being privy to tightly-guarded knowledge rather than years of practice – and this elevated them to a mythical level of threat. By the time that Kevin Mitnick was jailed in the mid-1990s, prosecutors were able to successfully persuade a judge that this “most dangerous hacker in the world” must be kept in solitary confinement and with no access to telephones to ensure that he couldn’t, for example, “start a nuclear war by whistling into a pay phone”. Yes, really.

Four hands on one keyboard, from CSI: Cyber
Whistling into a phone to start a nuclear war? That makes CSI: Cyber seem realistic [watch].

The Future

Every decade’s hackers have debated whether or not the next decade’s have correctly interpreted their idea of “hacker ethics”. For me, Steven Levy’s tenets encompass them best:

  1. Access to computers – and anything which might teach you something about the way the world works – should be unlimited and total.
  2. All information should be free.
  3. Mistrust authority – promote decentralization.
  4. Hackers should be judged by their hacking, not bogus criteria such as degrees, age, race, or position.
  5. You can create art and beauty on a computer.
  6. Computers can change your life for the better.

Given these concepts as representative of hacker ethics, I’m convinced that hacking remains alive and well today. Hackers continue to be responsible for many of the coolest and most-important innovations in computing, and are likely to continue to do so. Unlike many other sciences, where progress over the ages has gradually pushed innovators away from backrooms and garages and into labs to take advantage of increasingly-precise generations of equipment, the tools of computer science are increasingly available to individuals. More than ever before, bedroom-based hackers are able to get started on their journey with nothing more than a basic laptop or desktop computer and a stack of freely-available open-source software and documentation. That progress may be threatened by the growth in popularity of easy-to-use (but highly locked-down) tablets and smartphones, but the barrier to entry is still low enough that most people can pass it, and the new generation of ultra-lightweight computers like the Raspberry Pi are doing their part to inspire the next generation of hackers, too.

That said, and as much as I personally love and identify with the term “hacker”, the hacker community has never been less in-need of this overarching label. The diverse variety of types of technologist nowadays coupled with the infiltration of pop culture by geek culture has inevitably diluted only to be replaced with a multitude of others each describing a narrow but understandable part of the hacker mindset. You can describe yourself today as a coder, gamer, maker, biohacker, upcycler, cracker, blogger, reverse-engineer, social engineer, unconferencer, or one of dozens of other terms that more-specifically ties you to your community. You’ll be understood and you’ll be elegantly sidestepping the implications of criminality associated with the word “hacker”.

The original meaning of “hacker” has also been soiled from within its community: its biggest and perhaps most-famous advocate‘s insistence upon linguistic prescriptivism came under fire just this year after he pushed for a dogmatic interpretation of the term “sexual assault” in spite of a victim’s experience. This seems to be absolutely representative of his general attitudes towards sex, consent, women, and appropriate professional relationships. Perhaps distancing ourselves from the old definition of the word “hacker” can go hand-in-hand with distancing ourselves from some of the toxicity in the field of computer science?

(I’m aware that I linked at the top of this blog post to the venerable but also-problematic Eric S. Raymond; if anybody can suggest an equivalent resource by another author I’d love to swap out the link.)

Verdict: The word “hacker” has become so broad in scope that we’ll never be able to rein it back in. It’s tainted by its associations with both criminality, on one side, and unpleasant individuals on the other, and it’s time to accept that the popular contemporary meaning has won. Let’s find new words to define ourselves, instead.

Supply-Chain Security and Trust

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Can we solve [the problem of supply-chain attacks] by building trustworthy systems out of untrustworthy parts?

It sounds ridiculous on its face, but the Internet itself was a solution to a similar problem: a reliable network built out of unreliable parts. This was the result of decades of research. That research continues today, and it’s how we can have highly resilient distributed systems like Google’s network even though none of the individual components are particularly good. It’s also the philosophy behind much of the cybersecurity industry today: systems watching one another, looking for vulnerabilities and signs of attack.

Security is a lot harder than reliability. We don’t even really know how to build secure systems out of secure parts, let alone out of parts and processes that we can’t trust and that are almost certainly being subverted by governments and criminals around the world. Current security technologies are nowhere near good enough, though, to defend against these increasingly sophisticated attacks. So while this is an important part of the solution, and something we need to focus research on, it’s not going to solve our near-term problems.

Schneier provides a great summary of the state of play with nation-state supply-chain attacks, using the Huawei 5G controversy as a jumping-off point but with reference to the fact that China are far from the only country that weaken the security and privacy of the world’s citizens in order to gain an international spying advantage. He goes on to explain what he sees as the two broad schools of thought are in providing technical solutions to this class of problems, and demonstrates that both are for the time being beyond our reach. The excerpt above comes from his examination of the second school of thought, and it’s a pretty-compelling illustration of why this is a different class of problem that the ones we’ve used to build a reliable Internet.

(Many of the comments are very good, too.)

Protecting Yourself from Identity Theft

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

The reality is that your sensitive data has likely already been stolen, multiple times. Cybercriminals have your credit card information. They have your social security number and your mother’s maiden name. They have your address and phone number. They obtained the data by hacking any one of the hundreds of companies you entrust with the data­ — and you have no visibility into those companies’ security practices, and no recourse when they lose your data.

Given this, your best option is to turn your efforts toward trying to make sure that your data isn’t used against you. Enable two-factor authentication for all important accounts whenever possible. Don’t reuse passwords for anything important — ­and get a password manager to remember them all.

Bruce speaks my mind. Emphasis mine.

There is no longer any such thing as Computer Security

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Remember “cybersecurity”? Mysterious hooded computer guys doing mysterious hooded computer guy .. things! Who knows what kind of naughty digital mischief they might be up to? Unfortunately, we now live in a world where this kind of digital mischief is literally rewriting the world’s history. For proof of that, you need look no further than…

A good summary of the worst of the commonplace (non-spear) phishing attacks we’re seeing these days and why 2FA is positively, absolutely what you need (in addition to a password manager) these days.

Five-Eyes Intelligence Services Choose Surveillance Over Security

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

The Five Eyes — the intelligence consortium of the rich English-speaking countries (the US, Canada, the UK, Australia, and New Zealand) — have issued a “Statement of Principles on Access to Evidence and Encryption” where they claim their needs for surveillance outweigh everyone’s needs for security and privacy. …the increasing use and sophistication of certain…

How many times must security professionals point out that there’s no such thing as a secure backdoor before governments actually listen? If you make a weakness in cryptography to make it easier for the “good guys” – your spies and law enforcement – then either (a) a foreign or enemy power will find the backdoor too, making everybody less-secure than before, or (b) people will use different cryptographic systems: ones which seem less-likely to have been backdoored.

Solving the information black hole is a challenging and important problem of our time. But backdoors surely aren’t the best solution, right?

What Cyber-War Will Look Like

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

When prompted to think about the way hackers will shape the future of great power war, we are wont to imagine grand catastrophes: F-35s grounded by onboard computer failures, Aegis BMD systems failing to launch seconds before Chinese missiles arrive, looks of shock at Space Command as American surveillance satellites start careening towards the Earth–stuff like that. This is the sort of thing that fills the opening chapters of Peter Singer and August Cole’s Ghost Fleet. [1] The catastrophes I always imagine, however, are a bit different than this. The hacking campaigns I envision would be low-key, localized, and fairly low-tech. A cyber-ops campaign does not need to disable key weapon systems to devastate the other side’s war effort. It will be enough to increase the fear and friction enemy leaders face to tip the balance of victory and defeat. Singer and company are not wrong to draw inspiration from technological change; nor are they wrong to attempt to imagine operations with few historical precedents. But that isn’t my style. When asked to ponder the shape of cyber-war, my impulse is to look first at the kind of thing hackers are doing today and ask how these tactics might be applied in a time of war.

Mark Cancian thinks like I do.

In a report Cancian wrote for the Center for Strategic and International Studies on how great powers adapt to tactical and strategic surprise, Cancian sketched out twelve “vignettes” of potential technological or strategic shocks to make his abstract points a bit more concrete. Here is how Cancian imagines an “asymmetric cyber-attack” launched by the PRC against the United States Military:

 The U.S. secretary of defense had wondered this past week when the other shoe would drop.  Finally, it had, though the U.S. military would be unable to respond effectively for a while.

The scope and detail of the attack, not to mention its sheer audacity, had earned the grudging respect of the secretary. Years of worry about a possible Chinese “Assassin’s Mace”-a silver bullet super-weapon capable of disabling key parts of the American military-turned out to be focused on the wrong thing.

The cyber attacks varied. Sailors stationed at the 7th Fleet’ s homeport in Japan awoke one day to find their financial accounts, and those of their dependents, empty. Checking, savings, retirement funds: simply gone. The Marines based on Okinawa were under virtual siege by the populace, whose simmering resentment at their presence had boiled over after a YouTube video posted under the account of a Marine stationed there had gone viral. The video featured a dozen Marines drunkenly gang-raping two teenaged Okinawan girls. The video was vivid, the girls’ cries heart-wrenching the cheers of Marines sickening And all of it fake. The National Security Agency’s initial analysis of the video had uncovered digital fingerprints showing that it was a computer-assisted lie, and could prove that the Marine’s account under which it had been posted was hacked. But the damage had been done.

There was the commanding officer of Edwards Air Force Base whose Internet browser history had been posted on the squadron’s Facebook page. His command turned on him as a pervert; his weak protestations that he had not visited most of the posted links could not counter his admission that he had, in fact, trafficked some of them. Lies mixed with the truth. Soldiers at Fort Sill were at each other’s throats thanks to a series of text messages that allegedly unearthed an adultery ring on base.

The variations elsewhere were endless. Marines suddenly owed hundreds of thousands of dollars on credit lines they had never opened; sailors received death threats on their Twitter feeds; spouses and female service members had private pictures of themselves plastered across the Internet; older service members received notifications about cancerous conditions discovered in their latest physical.

Leadership was not exempt. Under the hashtag # PACOMMUSTGO a dozen women allegedly described harassment by the commander of Pacific command. Editorial writers demanded that, under the administration’s “zero tolerance” policy, he step aside while Congress held hearings.

There was not an American service member or dependent whose life had not been digitally turned upside down. In response, the secretary had declared “an operational pause,” directing units to stand down until things were sorted out.

Then, China had made its move, flooding the South China Sea with its conventional forces, enforcing a sea and air identification zone there, and blockading Taiwan. But the secretary could only respond weakly with a few air patrols and diversions of ships already at sea. Word was coming in through back channels that the Taiwanese government, suddenly stripped of its most ardent defender, was already considering capitulation.[2]

How is that for a cyber-attack?