The web loves data. Data about you. Data about who you are, about what you do, what you love doing, what you love eating.
…
I, on the other end, couldn’t care less about your data. I don’t run analytics on this website. I don’t care which articles you read, I don’t care if you read them. I don’t care about
which post is the most read or the most clicked. I don’t A/B test, I don’t try to overthink my content. I just don’t care.
…
Manu speaks my mind. Among the many hacks I’ve made to this site, I actively try not to invade on your privacy by collecting analytics, and I try not to let others to so
either!
My blog is for myself first and foremost (if you enjoy it too, that’s just a bonus). This leads to two conclusions:
If I’m the primary audience, I don’t need analytics (because I know who I am), and
I don’t want to be targeted by invasive analytics (and use browser extensions to block them, e.g. I by-default block all third-party scripts, delete cookies from non-allowlisted
domains 15 seconds after navigating away from sites, etc.); so I’d prefer them not to be on a site for which I’m the primary audience!
I’ve gone into more detail about this on my privacy page and hinted at it on my colophon. But I don’t know if anybody ever reads either
of those pages, of course!
Naturally, I was delighted, not least because it gives me an excuse to use the “deed poll” and “music” tags simultaneously on a post for the first time.
Don’t ask me what my “real” name is,
I’ve already told you what it was,
And I’m planning on burning my birth certificate.
The song’s about discovering and asserting self-identity through an assumed, rather than given, name. Which is fucking awesome.
Like virtually all of my sites, including this one, freedeedpoll.org.uk deliberately retains minimal logs and has no analytics tools. As a result, I have very little concept of how
popular it is, how widely it’s used etc., except when people reach out to me.
People do: I get a few emails every month from people who’ve got questions1,
or who are having trouble
getting their homemade deed poll accepted by troublesome banks. I’m happy to help them, but without additional context, I can’t be sure whether these folks represent the entirety of the
site’s users, a tiny fraction, or somewhere in-between.
So it’s obviously going to be a special surprise for me to have my website featured in a song.
I’ve been having a challenging couple of weeks2,
and it was hugely uplifting for me to bump into these appreciative references to my work in the wider Internet.
Footnotes
1 Common questions I receive are about legal gender recognition, about changing the names
of children, about changing one’s name while still a minor without parental consent, or about citizenship requirements. I’ve learned a lot about some fascinating bits of law.
This post is also available as an article. So
if you'd rather read a conventional blog post of this content, you can!
This is a video version of my blog post, Length Extension Attack. In it, I talk through the theory of length extension
attacks and demonstrate an SHA-1 length extension attack against an (imaginary) website.
This post is also available as a video. If you'd
prefer to watch/listen to me talk about this topic, give it a look.
Prefer to watch/listen than read? There’s a vloggy/video version of this post in which I explain all the
key concepts and demonstrate an SHA-1 length extension attack against an imaginary site.
I understood the concept of a length traversal
attack and when/how I needed to mitigate them for a long time before I truly understood why they worked. It took until work provided me an opportunity to play with one in practice (plus reading Ron Bowes’ excellent article on the subject) before I really grokked it.
You can check out the code and run it using the instructions in the repository if you’d like to play along.
Using hashes as message signatures
The site “Images R Us” will let you download images you’ve purchased, but not ones you haven’t. Links to the images are protected by a SHA-1 hash1, generated as follows:
When a “download” link is generated for a legitimate user, the algorithm produces a hash which is appended to the link. When the download link is clicked, the same process is followed
and the calculated hash compared to the provided hash. If they differ, the input must have been tampered with and the request is rejected.
Without knowing the secret key – stored only on the server – it’s not possible for an attacker to generate a valid hash for URL parameters of the attacker’s choice. Or is it?
Actually, it is possible for an attacker to manipulate the parameters. To understand how, you must first understand a little about how SHA-1 and its siblings actually work:
SHA-1‘s inner workings
The message to be hashed (SECRET_KEY + URL_PARAMS) is cut into blocks of a fixed size.2
The final block is padded to bring it up to the full size.3
A series of operations are applied to the first block: the inputs to those operations are (a) the contents of the block itself, including any padding, and (b) an initialisation
vector defined by the algorithm.4
The same series of operations are applied to each subsequent block, but the inputs are (a) the contents of the block itself, as before, and (b) the output of the previous
block. Each block is hashed, and the hash forms part of the input for the next.
The output of running the operations on the final block is the output of the algorithm, i.e. the hash.
In SHA-1, blocks are 512 bits long and the padding is a 1, followed by as many 0s as is necessary,
leaving 64 bits at the end in which to specify how many bits of the block were actually data.
Padding the final block
Looking at the final block in a given message, it’s apparent that there are two pieces of data that could produce exactly the same output for a given function:
The original data, (which gets padded by the algorithm to make it 64 bytes), and
A modified version of the data, which has be modified by padding it in advance with the same bytes the algorithm would; this must then be followed by an
additional block
In the case where we insert our own “fake” padding data, we can provide more message data after the padding and predict the overall hash. We can do this because
we the output of the first block will be the same as the final, valid hash we already saw. That known value becomes one of the two inputs into the function for the block that
follows it (the contents of that block will be the other input). Without knowing exactly what’s contained in the message – we don’t know the “secret key” used to salt it – we’re
still able to add some padding to the end of the message, followed by any data we like, and generate a valid hash.
Therefore, if we can manipulate the input of the message, and we know the length of the message, we can append to it. Bear that in mind as we move on to the other half
of what makes this attack possible.
Parameter overrides
“Images R Us” is implemented in PHP. In common with most server-side scripting languages,
when PHP sees a HTTP query string full of key/value pairs, if
a key is repeated then it overrides any earlier iterations of the same key.
It’d be tempting to simply override the download=free parameter in the query string at “Images R Us”, e.g. making it
download=free&download=valuable! But we can’t: not without breaking the hash, which is calculated based on the entire query string (minus the &key=...
bit).
But with our new knowledge about appending to the input for SHA-1 first a padding string, then an extra block containing our
payload (the variable we want to override and its new value), and then calculating a hash for this new block using the known output of the old final block as the
IV… we’ve got everything we need to put the attack together.
Putting it all together
We have a legitimate link with the query string download=free&key=ee1cce71179386ecd1f3784144c55bc5d763afcc. This tells us that somewhere on the server, this is
what’s happening:
If we pre-pad the string download=free with some special characters to replicate the padding that would otherwise be added to this final8 block, we can add a second block containing
an overriding value of download, specifically &download=valuable. The first value of download=, which will be the word free followed by
a stack of garbage padding characters, will be discarded.
And we can calculate the hash for this new block, and therefore the entire string, by using the known output from the previous block, like this:
Doing it for real
Of course, you’re not going to want to do all this by hand! But an understanding of why it works is important to being able to execute it properly. In the wild, exploitable
implementations are rarely as tidy as this, and a solid comprehension of exactly what’s happening behind the scenes is far more-valuable than simply knowing which tool to run and what
options to pass.
That said: you’ll want to find a tool you can run and know what options to pass to it! There are plenty of choices, but I’ve bundled one called hash_extender into my example, which will do the job pretty nicely:
hash_extender outputs the new signature, which we can put into the key=... parameter, and the new string that replaces download=free, including
the necessary padding to push into the next block and your new payload that follows.
Unfortunately it does over-encode a little: it’s encoded all the& and = (as %26 and %3d respectively), which isn’t what we
wanted, so you need to convert them back. But eventually you end up with the URL:
http://localhost:8818/?download=free%80%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%e8&download=valuable&key=7b315dfdbebc98ebe696a5f62430070a1651631b.
And that’s how you can manipulate a hash-protected string without access to its salt (in some circumstances).
Mitigating the attack
The correct way to fix the problem is by using a HMAC in place
of a simple hash signature. Instead of calling sha1( SECRET_KEY . urldecode( $params ) ), the code should call hash_hmac( 'sha1', urldecode( $params ), SECRET_KEY
). HMACs are theoretically-immune to length extension attacks, so long as the output of the hash function used is
functionally-random9.
Ideally, it should also use hash_equals( $validDownloadKey, $_GET['key'] ) rather than ===, to mitigate the possibility of a timing attack. But that’s another story.
Footnotes
1 This attack isn’t SHA1-specific: it works just as well on many other popular hashing algorithms too.
2 SHA-1‘s blocks are 64 bytes
long; other algorithms vary.
3 For SHA-1, the padding bits
consist of a 1 followed by 0s, except the final 8-bytes are a big-endian number representing the length of the message.
4 SHA-1‘s IV is 67452301 EFCDAB89 98BADCFE 10325476 C3D2E1F0, which you’ll observe is little-endian counting from 0 to
F, then back from F to 0, then alternating between counting from 3 to 0 and C to F. It’s
considered good practice when developing a new cryptographic system to ensure that the hard-coded cryptographic primitives are simple, logical, independently-discoverable numbers like
simple sequences and well-known mathematical constants. This helps to prove that the inventor isn’t “hiding” something in there, e.g. a mathematical weakness that depends on a
specific primitive for which they alone (they hope!) have pre-calculated an exploit. If that sounds paranoid, it’s worth knowing that there’s plenty of evidence that various spy
agencies have deliberately done this, at various points: consider the widespread exposure of the BULLRUN programme and its likely influence on Dual EC DRBG.
5 The padding characters I’ve used aren’t accurate, just representative. But there’s the
right number of them!
6 You shouldn’t do this: you’ll cause yourself many headaches in the long run. But you
could.
7 It’s also not always obvious which inputs are included in hash generation and how
they’re manipulated: if you’re actually using this technique adversarily, be prepared to do a little experimentation.
8 In this example, the hash operates over a single block, but the exact same principle
applies regardless of the number of blocks.
9 Imagining the implementation of a nontrivial hashing algorithm, the predictability of
whose output makes their HMAC vulnerable to a length extension attack, is left as an exercise for the reader.
It opens almost apologetically, like an explanation for the gap in new releases for most of the twenty-teens. But it quickly becomes a poetic exploration of a detached depression of a
man trapped under the weight of the world. It’s sad, and beautiful, and relatable.
This adventure began, in theory at least, on my birthday in January. I’ve long expressed an interest
in taking a dance class together, and so when Ruth pitched me a few options for a birthday gift, I jumped on the opportunity to learn tango. My knowledge of the dance was
basically limited to what I’d seen in films and television, but it had always looked like such an amazing dance: careful, controlled… synchronised, sexy.
After shopping around for a bit, Ruth decided that the best approach was for us to do a “beginners” video course in the comfort of our living room, and then take a weekend
getaway to do an “improvers” class.
After all, we’d definitely have time to complete the beginners’ course and get a lot of practice in before we had to take to the dance floor with a group of other “improvers”,
right?2
Okay, let me try again to enumerate you everything I actually know about tango3:
Essentials. A leader and follower4
hold one another’s upper torso closely enough that, with practice, each can intuit from body position where the other’s feet are without looking. While learning, you will not manage
to do this, and you will tread on one another’s toes.
The embrace. In the embrace, one side – usually the leader’s left – is “open”, with the dancers’ hands held; the other side is “closed”, with the dancers holding one
another’s bodies. Generally, you should be looking at one another or towards the open side. But stop looking at your feet: you should know where your own feet are by proprioception,
and you know where your partners’ feet are by guesswork and prayer.
The walk. You walk together, (usually) with opposite feet moving in-sync so that you can be close and not tread on one another’s toes, typically forward
(from the leader’s perspective) but sometimes sideways or even backwards (though not usually for long, because it increases the already-inevitable chance that you’ll collide
embarrassingly with other couples).
Movement. Through magic and telepathy a good connection with one another, the pair will, under the leader’s direction, open opportunities to perform more
advanced (but still apparently beginner-level) steps and therefore entirely new ways to mess things up. These steps include:
Forward ochos. The follower stepping through a figure-eight (ocho) on the closed side, or possibly the open side, but they probably forget which
way they were supposed to turn when they get there, come out on the wrong foot, and treat on the leader’s toes.
Backwards ochos. The follower moves from side to side or in reverse through a series of ochos, until the leader gets confused which way they’re
supposed to pivot to end the maneuver and both people become completely confused and
unstuck.
The cross. The leader walks alongside the follower, and when the leader steps back the follower chooses to assume that the leader intended for them to cross their
legs, which opens the gateway to many other steps. If the follower guesses incorrectly, they probably fall over during that step. If the follower guesses correctly but forgets
which way around their feet ought to be, they probably fall over on the very next step. Either way, the leader gets confused and does the wrong thing next.
Giros. One or both partners perform a forwards step, then a sideways step, then a backwards step, then another sideways step, starting on the inside leg
and pivoting up to 270° with each step such that the entire move rotates them some portion of a complete circle. In-sync with one another, of course.
Sacadas. Because none of the above are hard enough to get right together, you should start putting your leg out between your partner’s leg and try and
trip them up as they go. They ought to know you’re going to do this, because they’ve got perfect predictive capabilities about where your feet are going to end, remember?
Also remember to use the correct leg, which might not be the one you expect, or you’ll make a mess of the step you’ll be doing in three beats’ time. Good luck!
Barridas and mordidas. What, you finished the beginners’ course? Too smart to get tripped up by your partner’s sacada any more? Well
now it’s time to start kicking your partner’s feet out from directly underneath them. That’ll show ’em.
Style. All of the above should be done gracefully, elegantly, with perfect synchronicity and in time with the music… oh, and did I mention you should be able to
improve the whole thing on the fly, without pre-communication with your partner. 😅
Ultimately, it was entirely our own fault we felt out-of-our-depth up in Edinburgh at the weekend. We tried to run before we could walk, or – to put it another way – to milonga
before we could caminar.
A somewhat-rushed video course and a little practice on carpet in your living room is not a substitute for a more-thorough práctica on a proper-sized dance floor, no matter how
often you and your partner use any excuse of coming together (in the kitchen, in an elevator, etc.) to embrace and walk a couple of steps! Getting a hang of the fluid connections and
movement of tango requires time, and practice, and discipline.
But, not least because of our inexperience, we did learn a lot during our weekend’s deep-dive. We got to watch (and, briefly, partner with) some much better dancers
and learned some advanced lessons that we’ll doubtless reflect back upon when we’re at the point of being ready for them. Because yes: we are continuing! Our next step is a
Zoom-based lesson, and then we’re going to try to find a more-local group.
Also, we enjoyed the benefits of some one-on-one time with Jenny and Ricardo, the amazingly friendly and supportive teachers whose video course got us started and whose
in-person event made us feel out of our depth (again: entirely our own fault).
If you’ve any interest whatsoever in learning to dance tango, I can wholeheartedly recommend Ricardo and Jenny Oria as teachers. They run
courses in Edinburgh and occasionally elsewhere in the UK as well as providing online resources, and they’re the most amazingly
supportive, friendly, and approachable pair imaginable!
Just… learn from my mistake and start with a beginner course if you’re a beginner, okay? 😬
Footnotes
1 I’m exaggerating how little I know for effect. But it might not be as much of an
exaggeration as you’d hope.
4 Tango’s progressive enough that it’s come to reject describing the roles in binary
gendered terms, using “leader” and “follower” in place of what was once described as “man” and “woman”, respectively. This is great for improving access to pairs of dancers who don’t
consist of a man and a woman, as well as those who simply don’t want to take dance roles imposed by their gender.
For my examples below, assume a three-person family. I’m using unrealistic numbers for easy arithmetic.
Alice earns £2,000, Bob earns £1,000, and Chris earns £500, for a total household income of £3,500.
Alice spends £1,450, Bob £800, and Chris £250, for a total household expenditure of £2,500.
Model #1: Straight Split
We’ve never done things this way, but for completeness sake I’ll mention it: the simplest way that households can split their costs is by dividing them between the participants equally:
if the family make a £60 shopping trip, £20 should be paid by each of Alice, Bob, and Chris.
My example above shows exactly why this might not be a smart choice: this model would have each participant contribute £833.33 over the course of the month, which is more than Chris
earned. If this month is representative, then Chris will gradually burn through their savings and go broke, while Alice will put over a grand into her savings account every month!
Model #2: Income-Assessed
We’re a bunch of leftie socialist types, and wanted to reflect our political outlook in our household finances, too. So rather than just splitting our costs equally between us, we
initially implemented a means-assessment system based on the relative differences between our incomes. The thinking was that somebody that earns twice as much should
contribute twice as much towards the costs of running the household.
Using our example family above, here’s how that might look:
Alice earned 57% of the household income, so she should have contributed 57% of the household costs: £1,425. She overpaid by £25.
Bob earned 29% of the household income, so he should have contributed 29% of the household costs: £725. He overpaid by £75.
Chris earned 14% of the household income, so they should have contributed 14% of the household costs: £350. They underpaid by £100.
Therefore, at the end of the month Chris should settle up by giving £25 to Alice and £75 to Bob.
By analogy: The “Income-Assessed” model is functionally equivalent to splitting each and every expense according to the participants income – e.g. if a £100 bill landed
on their doormat, Alice would pay £57, Bob £29, and Chris £14 of it – but has the convenience that everybody just pays for things “as they go along” and then square everything up when
their paycheques come in.
Over time, our expenditures grew and changed and our incomes grew, but they didn’t do so in an entirely simple fashion, and we needed to make some tweaks to our income-assessed model of
household finance contributions. For example:
Gross vs Net Income: For a while, some of our incomes were split into a mixture of employed income (on which income tax was paid as-we-earned) and self-employed
income (for which income tax would be calculated later), making things challenging. We agreed that net income (i.e. take-home pay) was the correct measure for us to use for the
income-based part of the calculation, which also helped keep things fair as some of us began to cross into and out of the higher earner tax bracket.
Personal Threshold: At times, a subset of us earned a disproportionate portion of the household income (there were short periods where one of us earned over 50% of
the household income; at several other times two family members each earned thrice that of the third). Our costs increased too, but this imposed an regressive burden on the
lower-earner(s), for whom those costs represented a greater proportion of their total income. To attempt to mitigate this, we introduced a personal threshold somewhat analogous to the
income tax “personal allowance” (the policy that means that you don’t pay tax on your first £12,570 of income).
Eventually, we came to see that what we were doing was trying to patch a partially-broken system, and tried something new!
Model #3: Same-Residual
In 2022, we transitioned to a same-residual system that attempts to share out out money in an even-more egalitarian way. Instead of each person contributing in accordance
with their income, the model attempts to leave each person with the same average amount of disposable personal income at the end. The difference is most-profound where the
relative incomes are most-diverse.
With the example family above, that would mean:
The household earned £3,500 and spent £2,500, leaving £1,000. Dividing by 3 tells us that each person should have £333.33 after settling up.
Alice earned earned £2,000 and spent £1,450, so she has £550 left. That’s £216.67 too much.
Bob earned earned £1,000 and spent £800, so she has £200 left. That’s £133.33 too little.
Chris earned earned £500 and spent £250, so she has £250 left. That’s £83.33 too little.
Therefore, at the end of the month Alice should settle up by giving £133.33 to Bob and £83.33 to Chris (note there’s a 1p rounding error).
That’s a very different result than the Income-Assessed calculation came up with for the same family! Instead of Chris giving money to Alice and Bob, because those two
contributed to household costs disproportionately highly for their relative incomes, Alice gives money to Bob and Chris, because their incomes (and expenditures) were much lower.
Ignoring any non-household costs, all three would expect to have the same bank balance at the start of the month as at the end, after settlement.
By analogy: The “Same-Residual” model is functionally equivalent to having everybody’s salary paid into a shared bank account, out of which all household expenditures
are paid, and at the end of the month everything that’s left in the bank account gets split equally between the participants.
We’ve made tweaks to this model, too, of course. For example: we’ve set a “target” residual and, where we spend little enough in a month that we would each be eligible for more
than that, we instead sweep the excess into our family savings account. It’s a nice approach to help build up a savings reserve without feeling a pinch.
I’m sure our model will continue to evolve, as it has for the last decade and a half, but for now it seems stable, fair, and reasonable. Maybe it’ll work for your household too (whether
or not you’re also a polyamorous family!): take a look at the spreadsheet in Google Drive and give it a go.
Semi-inspired by a similar project by Kev Quirk,
I’ve got a project I want to run on my blog in 2024.
I want you to be my pen pal for a month. Get in touch by emailing penpals@danq.me or any other way you like and let’s do this!
I don’t know much about the people who read my blog, whether they’re ad-hoc visitors or regular followers1.
So here’s the plan: I’m looking to do is to fill a “dance card” of interesting people each of with whom I’ll “pen pal” for a month.
The following month, I’ll blog about the experience: who I met, what I learned about them, what I learned about myself. Have a look below and see if there’s a slot for you: I’d love to
chat to you about, well – anything!
My goals:
Get inspired to blog about new/different things (and hopefully help inspire others to do the same).
Connect with a dozen folks on a more-interpersonal level than I normally do via my blog.
Maybe even make, or deepen, some friendships!
The “rules”:
Aiming for at least 3 email exchanges over a month. Maybe more.2
There’s no specific agenda: I promise to bring what I’ve been thinking about and working on, and possibly a spicy conversation-starter from LetsLifeChat.com. You bring whatever you like. No topic is explicitly off the table unless somebody says it is (which anybody can do
at any time, for any or no reason).
I’ll blog a summary of my experience the month afterwards, but I won’t share anything without permission. I’ll happily share an unpublished draft with each penpal first so they
can veto any bits they don’t like. I’ll refer to you by whatever name, link etc. suits you best.
If you have a blog/digital garden/social presence of any kind, you’re welcome to blog about it too. Or not: entirely up to you!
You! If you’re reading this, you’re probably somebody I want to meet! But I’d be especially interested in penpalling with people who tick one or more of the following
boxes:
Personal bloggers at the edges of or just outside my usual social circles. Maybe you’re an IndieWeb, RSS Club, or Geminispace explorer?
Regular readers, whether you just skim the post titles and dive in once in a blue moon or read every post and comment on the things you care about.
Automatticians from parts of the company I don’t get to interact with. Let’s build some bridges!
People whose interests overlap with mine in any way, large or small. That overlap might be technology (web standards, accessibility, security, blogging, open
source…), hobbies (GPS sports, board games, magic, murder mysteries, science fiction, getting lost on Wikipedia…),
volunteering (third sector support, tech for good, diversity in tech…), social (queer issues, polyamory, socialism…), or something else entirely.
Missed connections. Did we meet briefly or in-passing (conferences, meetups, friends-of-friends, overlapping volunteering circles) but not develop anything further?
I’d love to pick up where we left off!
Distant- and nearly-friends. Did we drift apart long ago, or never quite move into one another’s orbit in the first place? This could be your excuse to touch bases!
If you read this far and didn’t email penpals@danq.me yet, go do that. I’m looking forward to hearing from you!
Footnotes
1 Not-knowing who reads my blog might come at least in part from the fact that I actively sabotage any plugin that might give me any analytics! One might say I’ve shot myself in the foot, there.
2 If we stay in touch afterwards that’s fine too, but it’s not essential.
3 I’m looking for longer-form, but slower, communication than you get via e.g. instant
messengers and whatnot: a more “penpal” experience.
✅ To-Do:Obsidian, physical notepad [not happy with this; want something more productive]
📆 Calendar: Google Calendar (via Thunderbird on Desktop) [not happy with this; want something not-Google – still waiting on Proton Calendar getting good!]