Basilisk collection

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Basilisk collection

The basilisk collection (also known as the basilisk file or basilisk.txt) is a collection of over 125 million partial hash inversions of the SHA-256 cryptographic hash function. Assuming state-of-the art methods were used to compute the inversions, the entries in the collection collectively represent a proof-of-work far exceeding the computational capacity of the human race.[1][2] The collection was released in parts through BitTorrent beginning in June 2018, although it was not widely reported or discussed until early 2019.[3] On August 4th, 2019 the complete collection of 125,552,089 known hash inversions was compiled and published by CryTor, the cybersecurity lab of the University of Toronto.[4]

The existence of the basilisk collection has had wide reaching consequences in the field of cryptography, and has been blamed for catalyzing the January 2019 Bitcoin crash.[2][5][6]

Electronic Frontier Foundation cryptographer Brian Landlaw has said that “whoever made the basilisk is 30 years ahead of the NSA, and the NSA are 30 years ahead of us, so who is there left to trust?”[35]

This is fucking amazing, on a par with e.g. First on the Moon.

Presented in the style of an alternate-reality Wikipedia article, this piece of what the author calls “unfiction” describes the narratively believable-but-spooky (if theoretically unlikely from a technical standpoint) 2018 disclosure of evidence for a new presumed mathematical weakness in the SHA-2 hash function set. (And if that doesn’t sound like a good premise for a story to you, I don’t know what’s wrong with you! 😂)

Cryptographic weaknesses that make feasible attacks on hashing algorithms are a demonstrably real thing. But even with the benefit of the known vulnerabilities in SHA-2 (meet-in-the-middle attacks that involve up-to-halving the search space by solving from “both ends”, plus deterministic weaknesses that make it easier to find two inputs that produce the same hash so long as you choose the inputs carefully) the “article” correctly states that to produce a long list of hash inversions of the kinds described, that follow a predictable sequence, might be expected to require more computer processing power than humans have ever applied to any problem, ever.

As a piece of alternate history science fiction, this piece not only provides a technically-accurate explanation of its premises… it also does a good job of speculating what the impact on the world would have been of such an event. But my single favourite part of the piece is that it includes what superficially look like genuine examples of what a hypothetical basilisk.txt would contain. To do this, the author wrote a brute force hash finder and ran it for over a year. That’s some serious dedication. For those that were fooled by this seemingly-convincing evidence of the realism of the piece, here’s the actual results of the hash alongside the claimed ones (let this be a reminder to you that it’s not sufficient to skim-read your hash comparisons, people!):


claimed: 00000000000161b9f84a187cc21b1752bf678bdd4d643c17b3b786684d8e9f17
 actual: 0000000000000000000000161b9f84a187cc21b172bf68b3cb3b78684d8e9f17


claimed: 0000000000000000000000cee5fe5df2d3034fff435bb40e8651a18d69e81460
 actual: 0000000000cee5fe5df2d3034fff435bb4232f21c2efce0e8651a18d69e81460


claimed: 000000000000000000000012aabd8d935757db173d5b3e7ae0f25ea4eb775402
 actual: 000000000012aabd8d935757db173d5b3ec6d38330926f7ae0f25ea4eb775402


claimed: 000000000000000000000039d50bb560770d051a3f5a2fe340c99f81e18129d1
 actual: 000000000039d50bb560770d051a3f5a2ffa2281ac3287e340c99f81e18129d1


claimed: 00000000000000000000002ca8fc4b6396dd5b5bcf5fa80ea49967da55a8668b
 actual: 00000000002ca8fc4b6396dd5b5bcf5fa82a867d17ebc40ea49967da55a8668b

Anyway: the whole thing is amazing and you should go read it.

1 reply to Basilisk collection

Leave a Reply

Your email address will not be published. Required fields are marked *