Will Smith’s concert crowds are real, but AI is blurring the lines

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

So if these fans aren’t AI-generated fakes, what’s going on here?

The video features real performances and real audiences, but I believe they were manipulated on two levels:

  1. Will Smith’s team generated several short AI image-to-video clips from professionally-shot audience photos
  2. YouTube post-processed the resulting Shorts montage, making everything look so much worse

I put them side-by-side below. Try going full-screen and pause at any point to see the difference. The Instagram footage is noticeably better throughout, though some of the audience clips still have issues.

The Internet’s gone a bit wild over the YouTube video of Will Smith performing to a crowd. And if you look at it, you can see why: it looks very much like it’s AI-generated. And there’d be motive: I mean, we’ve already seen examples where politicians have been accused (falsely, by Trump, obviously) of using AI to exaggerate the size of their crowds, so it feels believable that a musician’s media team might do the same, right?

But yeah: it turns out that isn’t what happened here. Smith’s team did use AI, but only to make sign-holding fans from other concerts on the same tour appear to all be in the same place. But the reason the video “looks AI-generated” is because… YouTube fucked about with it!

It turns out that YouTube have been secretly experimenting with upscaling Shorts, using AI to add detail to blurry elements. You can very clearly see the effect in the video above, which puts the Instagram and YouTube versions of the video side-by-side (of course, if YouTube decide to retroactively upscale this video then the entire demonstration will be broken anyway, but for now it works!). There are many points where a face in the background is out-of-focus in the Instagram version but, in the YouTube version, has been “brought into focus” by added detail. And some of those details look a bit… uncanny valley.

This is the age we’re shifting into: an era in which post-truth politics and deepfake proliferation mean that when something looks “a bit off”, we assume (a) that it’s AI-generated, and (b) that this represents a deliberate attempt to mislead. (That’s probably a good defence strategy nowadays in general, but this time around it’s… more-complicated…)

Every single bit of this story – YouTube’s secret experiments on creator videos, AI “enhancement” which actually makes things objectively worse, and the immediate knee-jerk reaction of an understandably jaded and hypersceptical Internet to the result – just helps cement that we truly do live in the stupidest timeline.

Four perspectives on AI

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

A two-axis graph labeled 'Beliefs about AI' with the x-axis of 'Transformative' and the y-axis of 'Positive'. The field is divided into four quadrants: 'Skeptical' in the bottom-left, 'Wary' in the bottom-right, 'Pragmatist' in the top-left, and 'Optimist' in the top-right.

I’ve grouped these four perspectives, but everything here is a spectrum. Depending on the context or day, you might find yourself at any point on the graph. And I’ve attempted to describe each perspectively [sic] generously, because I don’t believe that any are inherently good or bad. I find myself switching between perspectives throughout the day as I implement features, use tools, and read articles. A good team is probably made of members from all perspectives.

Which perspective resonates with you today? Do you also find yourself moving around the graph?

An interesting question from Sean McPherson. He sounds like he’s focussed on LLMs for software development, for which I’ve drifted around a little within the left-hand-side of the graph. But perhaps right now, this morning, you could simplify my feelings like this:

The same graph with a hand-drawn arrow moving from the Skeptical to the Pragmatist quadrant.

My stance is that AI-assisted coding can be helpful (though the question remains open about whether it’s “worth it”), so long as you’re not trying to do anything that you couldn’t do yourself, and you know how you’d go about doing it yourself. That is: it’s only useful to accelerate tasks that are in your “known knowns” space.

As I’ve mentioned: the other week I had a coding AI help me with some code that interacted with the Google Sheets API. I know exactly how I’d go about it, but that journey would have to start with re-learning the Google Sheets API, getting an API key and giving it the appropriate permissions, and so on. That’s the kind of task I’d be happy to outsource to a less-experienced programmer whom I trusted to browse StackOverflow with a somewhat critical eye, before giving them some pointers on what they came back with – so it’s a fine candidate for an AI to step in and give it a go. Plus: I’d be treating the output as “legacy code” from the get-go, and (because the resulting tool was only for my personal use) I wasn’t too concerned with the kinds of security and accessibility considerations that GenAI can often make a pig’s ear of. So I was able to palm the task off onto Claude Sonnet and get on with something else in the meantime.
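
If you’re curious what that boilerplate actually involves, here’s a minimal sketch – in TypeScript, using Google’s googleapis Node client, and emphatically a re-imagining rather than the code the AI produced for me; the credentials file and spreadsheet ID are placeholders – of the kind of call you’d otherwise be re-learning the documentation to write:

```typescript
import { google } from "googleapis";

// Authenticate as a service account that's been granted edit access to the sheet.
const auth = new google.auth.GoogleAuth({
  keyFile: "service-account.json", // placeholder credentials file
  scopes: ["https://www.googleapis.com/auth/spreadsheets"],
});
const sheets = google.sheets({ version: "v4", auth });

// Append a single row to the bottom of the sheet.
await sheets.spreadsheets.values.append({
  spreadsheetId: "YOUR-SPREADSHEET-ID", // placeholder
  range: "Sheet1!A:C",
  valueInputOption: "USER_ENTERED",
  requestBody: { values: [["2025-08-29", "example", 42]] },
});
```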

If I wanted to do something completely outside of my wheelhouse: say – “write a program in Fortran to control a robot arm” – an AI wouldn’t be a great choice. Sure, I could “vibe code” something like that, but I’d have no idea whether what it produced was any good! It wouldn’t even be useful as a springboard to learning how to do that, because I don’t have the underlying fundamentals in either robotics or Fortran. I’d be producing AI slop in software form: the kind of thing that comes out when non-programmers assume that AI can completely bridge the gap between their great business idea and a fully working app!

(As an aside: the latest episode of South Park kinda nailed its parody of the unrealistic expectations that some folks seem to put on generative AI: treating it as intelligent, or as a friend, is unhealthy and dangerous!)

They’ll get a prototype that seems to do what you want, if you squint just right, but the hard part of software engineering isn’t making a barebones proof-of-concept! That’s the easy bit! (That’s why AI can do it pretty well!) The hard bit is making it work all the time, every time; making it scale; making it safe to use; making it maintainable; making it production-ready… etc.

But I do benefit from coding AI sometimes. GenAI’s good at summarisation, which in turn can make it good at relatively-quickly finding things in a sprawling codebase where your explanation of those things is too-woolly to use a conventional regular expression search. It’s good at generating boilerplate that’s broadly-like examples it’s seen before, which means it can usually be trusted to put together skeleton applications. It’s good at “guessing what comes next” – being, as it is, “fancy autocomplete” – which means it can be helpful for prompting you with the right parameters for that rarely-used function, or for speculating what you might be about to do with the well-named variable you just created.

Anyway: Sean’s article was pretty good, and it’s a quick and easy read. Once you’ve read it, perhaps you’ll share where you think you sit, on his diagram?

Just a Little More Context Bro, I Promise, and It’ll Fix Everything

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Solving problems with LLMs is like solving front-end problems with NPM: the “solution” comes through installing more and more things — adding more and more context, i.e. more and more packages.

  • LLM: Problem? Add more context.
  • NPM: Problem? There’s a package for that.

As I’m typing this, I’m thinking of that image of the evolution of the Raptor engine, where it evolved in simplicity:

Photograph of three versions of the raptor engine, each one getting progressively simplified in mechanical parts.

This stands in contrast to my working with LLMs, which often wants more and more context from me to get to a generative solution:

Photograph of three versions of the raptor engine, but the image is reversed showing the engine get progressively complicated in mechanical parts over time. Each engine represents an LLM prompt.

Jim Nielsen speaks to my experience, here. Because a programming LLM is simply taking inputs (all of your code, plus your prompt), transforming them through statistical analysis, and then producing an output (replacement code), it struggles with refactoring for simplicity unless very-carefully controlled. “Vibe coding” is very much an exercise in adding hacks upon hacks… like the increasingly-ludicrous epicycles introduced by proponents of geocentrism in its final centuries, before the heliocentric model became fully accepted.

Geocentric representation of the apparent motion of the Sun, Mercury, and Venus from the Earth, based on 15th century diagrams. It consists of many looping spirals approaching and then withdrawing from the Earth as they orbit around it.
This mess used to be how many perfectly smart people imagined the movements of the planets. When observations proved it couldn’t be right, they’d just add more complexity to catch the edge cases.

I don’t think that AIs are useless as a coding tool, and I’ve successfully used them to good effect on several occasions. I’ve even tried “vibe coding”, about which I fully agree with Steve Krouse’s observation that “vibe code is legacy code”. Being able to knock out something temporary, throwaway, experimental, or for personal use only… while I work on something else… is pretty liberating.

For example: I couldn’t remember my way around the Google Sheets API and didn’t want to re-learn it from the sprawling documentation site, but wanted a quick personal tool to manipulate such a sheet from a remote system. I was able to have an AI knock up what I needed while I cooked dinner for the kids, paying only enough attention to check in on its work. Is it accessible? Is it secure? Is it performant? Is it maintainable? I can’t answer any of those questions, and so as a professional software engineer I have to reasonably assume the answer to all of them is “no”. But its only user is me, it does what I needed it to do, and I didn’t have to shift my focus from supervising children and a pan in order to throw it together!

Anyway: Jim hits the nail on the head here, as he so often does.

The rise of Whatever

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

A freaking excellent longread by Eevee (Evelyn Woods), lamenting the direction of popular technological progress and general enshittification of creator culture. It’s ultimately uplifting, I feel, but it’s full of bitterness until it gets there. I’ve pulled out a couple of highlights to try to get you interested, but you should just go and read the entire thing:

And so the entire Web sort of congealed around a tiny handful of gigantic platforms that everyone on the fucking planet is on at once. Sometimes there is some sort of partitioning, like Reddit. Sometimes there is not, like Twitter.

That’s… fine, I guess. Things centralize. It happens. You don’t get tubgirl spam raids so much any more, at least.

But the centralization poses a problem. See, the Web is free to look at (by default), but costs money to host. There are free hosts, yes, but those are for static things getting like a thousand visitors a day, not interactive platforms serving a hundred million. That starts to cost a bit. Picture logs being shoveled into a steam engine’s firebox, except it’s bundles of cash being shoveled into… the… uh… website hole.

I don’t want to help someone who opens with “I don’t know how to do this so I asked ChatGPT and it gave me these 200 lines but it doesn’t work”. I don’t want to know how much code wasn’t actually written by anyone. I don’t want to hear how many of my colleagues think Whatever is equivalent to their own output.

I glimpsed someone on Twitter a few days ago, also scoffing at the idea that anyone would decide not to use the Whatever machine. I can’t remember exactly what they said, but it was something like: “I created a whole album, complete with album art, in 3.5 hours. Why wouldn’t I use the make it easier machine?”

This is kind of darkly fascinating to me, because it gives rise to such an obvious question: if anyone can do that, then why listen to your music? It takes a significant chunk of 3.5 hours just to listen to an album, so how much manual work was even done here? Apparently I can just go generate an endless stream of stuff of the same quality! Why would I want your particular brand of Whatever?

Nobody seems to appreciate that if you can make a computer do something entirely on its own, then that becomes the baseline.

Do things. Make things. And then put them on your website so I can see them.

Clearly this all ties in to stuff that I’ve been thinking, lately. Expect more posts and reposts in this vein, I guess?

ArtificialCast

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Type-safe transformation powered by inference.

ArtificialCast is a lightweight, type-safe casting and transformation utility powered by large language models. It allows seamless conversion between strongly typed objects using only type metadata, JSON schema inference, and prompt-driven reasoning.

Imagine a world where Convert.ChangeType() could transform entire object graphs, infer missing values, and adapt between unrelated types – without manual mapping or boilerplate.

ArtificialCast makes that possible.

Features

  • Zero config – Just define your types.
  • Bidirectional casting – Cast any type to any other.
  • Schema-aware inference – Auto-generates JSON Schema for the target type.
  • LLM-powered transformation – Uses AI to “fill in the blanks” between input and output.
  • Testable & deterministic-ish – Works beautifully until it doesn’t.

As beautiful as it is disgusting, this C# is fully-functional and works exactly as described… and yet you really, really should never use it (which its author will tell you, too).

Casting is the process of transforming a variable of one type into one of another. So for example you might cast the number 3 into a string and get "3" (though of course this isn’t the only possible result: "00000011" might also be a valid representation, depending on the circumstances1).
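
For example, in TypeScript (my illustration, nothing to do with the library itself), both of these are defensible ways of casting the number 3 to a string:

```typescript
String(3);                        // "3"
(3).toString(2).padStart(8, "0"); // "00000011": an eight-bit binary representation
```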

Casting between complex types defined by developers is harder and requires some work. Suppose you have a User model with attributes like “username”, “full name”, “hashed password”, “email address” etc., and you want to convert your users into instances of a new model called Customer. Some of the attributes will be the same, some will be absent, and some will be… different (e.g. perhaps a Customer has a “first name” and “last name” instead of a “full name”, and it’s probably implemented wrong to boot).

The correct approach is to implement a way to cast one as the other.
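
In TypeScript, say, that might look like this sketch (the types are invented for illustration, and the naive name-splitting is exactly the kind of “probably implemented wrong” I mean):

```typescript
interface User {
  username: string;
  fullName: string;
  hashedPassword: string;
  email: string;
}

interface Customer {
  firstName: string;
  lastName: string;
  email: string;
}

// An explicit, reviewable, testable mapping from one model to the other.
function userToCustomer(user: User): Customer {
  // Splitting on the first space: the classic wrong-but-shipped name handling.
  const [firstName, ...rest] = user.fullName.split(" ");
  return { firstName, lastName: rest.join(" "), email: user.email };
}
```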

The very-definitely incorrect approach is to have an LLM convert the data for you. And that’s what this library provides.
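
Conceptually – and this is my own TypeScript re-imagining of the idea, not the library’s actual C# API – the whole thing boils down to something like:

```typescript
// Stand-in for whichever LLM client you like: sends a prompt, returns the text reply.
declare function llmComplete(prompt: string): Promise<string>;

// "Cast" a value to a target type by describing the target as a JSON Schema
// and asking the model to fill in the blanks. Deterministic-ish!
async function artificialCast<TTarget>(
  source: unknown,
  targetSchema: object
): Promise<TTarget> {
  const prompt =
    `Transform this JSON value:\n${JSON.stringify(source)}\n` +
    `into a value matching this JSON Schema:\n${JSON.stringify(targetSchema)}\n` +
    `Respond with the JSON value only.`;
  const reply = await llmComplete(prompt);
  // Works beautifully... until it fails in ways that look like success.
  return JSON.parse(reply) as TTarget;
}
```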

ArtificialCast is a demonstration of what happens when overhyped AI ideas are implemented exactly as proposed – with no shortcuts, no mocking, and no jokes.

It is fully functional. It passes tests. It integrates into modern .NET workflows. And it is fundamentally unsafe.

This project exists because:

  • AI-generated “logic” is rapidly being treated as production-ready.
  • Investors are funding AI frameworks that operate entirely on structure and prompts.
  • Developers deserve to see what happens when you follow that philosophy to its logical conclusion.

ArtificialCast is the result.

It works. Until it doesn’t. And when it doesn’t, it fails in ways that look like success. That’s the danger.

I’ve played with AI in code a few times. There are some tasks it’s very good at, like summarising and explaining (when the developer before you didn’t leave a sufficiency of quality comments). There are some tasks it can be okay at, with appropriate framing and support: like knowing its way around unfamiliar-to-you but well-documented APIs2.

But if you ask an AI to implement an entire product or even just a significant feature from scratch, unsupervised, you’re at risk of rapidly hitting the realm of Heisenbugs, security vulnerabilities, and enormous redundancies.

This facetious example – of using AI as a universal typecasting engine – helps hammer that point home, and I love it.

Footnotes

1 How to cast basic types isn’t entirely standardised: PHP infamously casts the string "0" as false when it’s coerced into a boolean, which virtually no other programming language does, for example.

2 The other week, I had a GenAI help me write some code that writes to a Google Sheets document, because I was fuzzy on the API and knew the AI would pick it up faster than me while I wrote the code “around” it.

Adding a feature because ChatGPT incorrectly thinks it exists

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Our scanning system wasn’t intended to support this style of notation. Why, then, were we being bombarded with so many ASCII tab ChatGPT screenshots? I was mystified for weeks — until I messed around with ChatGPT myself and got this:

Screenshot of ChatGPT telling users to enter this ASCII tab into soundslice.com

Turns out ChatGPT is telling people to go to Soundslice, create an account and import ASCII tab in order to hear the audio playback. So that explains it!

With ChatGPT’s inclination to lie about the features of a piece of technology, it was only a matter of time before a frustrated developer actually added a feature that ChatGPT had imagined, just to stop users from becoming dissatisfied when they tried to use nonexistent tools that ChatGPT told them existed.

And this might be it! This could be the very first time that somebody’s added functionality based on an LLM telling people the feature existed already.

Adrian Holovaty runs a tool that can “read” scanned sheet music and provide a digital representation to help you learn how to play it. But after ChatGPT started telling people that his tool could also read ASCII-formatted guitar tablature, he went and implemented it!

His blog post’s got more details, and it’s worth a read. This could be a historic moment that we’ll look back on!

The Huge Grey Area in the Anthropic Ruling

This week, AI firm Anthropic (the folks behind Claude) found themselves the focus of the attention of the U.S. District Court for the Northern District of California.

New laws for new technologies

The tl;dr is: the court ruled that (a) piracy for the purpose of training an LLM is still piracy, so there’ll be a separate case about the fact that Anthropic did not pay for copies of all the books their model ingested, but (b) training a model on books and then selling access to that model, which can then produce output based on what it has “learned” from those books, is considered transformative work and therefore fair use.

Fragment of court ruling with a line highlighted that reads: This order grants summary judgment for Anthropic that the training use was a fair use.

Compelling arguments have been made both ways on this topic already, e.g.:

  • Some folks are very keen to point out that it’s totally permitted for humans to read, and even memorise, entire volumes, and then use what they’ve learned when they produce new work. They argue that what an LLM “does” is not materially different from an impossibly well-read human.
  • By way of counterpoint, it’s been observed that such a human would still be personally liable if the “inspired” output they subsequently created was derivative to the point of violating copyright, but we don’t yet have a strong legal model for assessing AI output in the same way. (Disney & Universal vs. Midjourney, as reported by BBC News, is going to be very interesting!)
  • Furthermore, it might be impossible to conclusively determine that the way GenAI works is fundamentally comparable to human thought. And that’s the thing that got me thinking about this particular thought experiment.

A moment of philosophy

Here’s a thought experiment:

Suppose I trained an LLM on all of the books of just one author (plus enough additional language that it was able to meaningfully communicate). Let’s take Stephen King’s 65 novels and 200+ short stories, for example. We’ll sell access to the API we produce.

Monochrome photograph showing a shelf packed full of Stephen King's novels.
I suppose it’s possible that Stephen King was already replaced long ago with an AI that was instructed to churn out horror stories about folks in isolated New England locales being harassed by a pervasive background evil?

The output of this system would be heavily-biased by the limited input it’s been given: anybody familiar with King’s work would quickly spot that the AI’s mannerisms echoed his writing style. Appropriately prompted – or just by chance – such a system would likely produce whole chapters of output that would certainly be considered to be a substantial infringement of the original work, right?

If I make KingLLM, I’m going to get sued, and rightly so.

But if we accept that (and assume that the U.S. District Court for the Northern District of California would agree)… then this ruling on Anthropic carries a curious implication: that if enough content is ingested, the operation of the LLM in itself is no longer copyright infringement.

Which raises the question: where is the line? What size of corpus must a system be trained upon before its processing must necessarily be considered transformative of its inputs?

Clearly, trying to answer that question leads to a variant of the sorites paradox. Nobody can credibly claim that, for example, an input of twenty million words is enough to make a model transformative, but that a model trained on just one word fewer must be considered to be perpetually ripping off what little knowledge it has!

But as more of these copyright-holder-versus-AI-company cases are decided, it’ll be interesting to see where the courts land. What is fair use, and what is infringing?

And wherever the answers land, I’m sure there’ll be folks like me coming up with thought experiments that sit uncomfortably in the grey areas that remain.

The Who Cares Era

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

It’s so emblematic of the moment we’re in, the Who Cares Era, where completely disposable things are shoddily produced for people to mostly ignore.

In the Who Cares Era, the most radical thing you can do is care.

In a moment where machines churn out mediocrity, make something yourself. Make it imperfect. Make it rough. Just make it.

At a time where the government’s uncaring boot is pressing down on all of our necks, the best way to fight back is to care. Care loudly. Tell others. Get going.

Smart words, well-written by Dan Sinker.

I like the fact that he correctly identifies that the “Who Cares Era” – illustrated by the bulk creation of low-effort, low-quality media, for a disheartened audience that no longer has a reason to give a damn – isn’t about AI.

I mean… AI’s certainly not helping! AI slop dominates social media (especially in right-wing spaces, for retrospectively-obvious reasons) and bleeds out into the mainstream. LLM-generated content, lacking even the slightest human input, is becoming painfully ubiquitous. It’s pretty sad out there.

But AI’s doing some useful things too: it’s not without its value, even just in popular use.

So while the “Who Cares Era” might be exemplified by the proliferation of AI slop… it’s much bigger than that. It’s a sociological change, tied perhaps to a growing dissatisfaction with our governments and the increasing feeling of powerlessness to change the unjust social systems we’re locked into?

I don’t know how to fix it. I don’t even know if it’s fixable. But I agree with Dan’s argument that a great starting point is to care.

And I, for one, am going to continue to create things I care about, giving them the time and attention they deserve. And maybe if enough of us can do that, just that, then maybe that’ll make the difference.

Geocities Live

I used Geocities.live to transform the DanQ.me homepage into “Geocities style” and I’ve got to say… I don’t hate what it came up with.

90s-style-homepage version of DanQ.me, as generated by geocities.live. It features patterned backgrounds, Comic Sans, gaudy colours, and tables.
Sure, it’s gaudy, but it’s got a few things going for it, too.

Let’s put aside for the moment that you can already send my website back into “90s mode” and dive into this take on how I could present myself in a particularly old-school way. There are a few things I particularly love:

  • It’s actually quite lightweight: ignore all the animated GIFs (which are small anyway) and you’ll see that, compared to my current homepage, there are very few images. I’ve been thinking about moving towards fewer images on the homepage anyway, so it’s interesting to see how that comes together in this unusual context.
  • The page sections are solidly distinct: they’re a mishmash of different widths, some of which exhibit a horrendous lack of responsiveness, but it’s pretty clear where the “recent articles” section ends and the “other recent stuff” begins.
  • The post kinds are very visible: putting the “kind” of a post in its own column makes it really clear whether you’re looking at an article, note, checkin, etc., much more-so than my current blocks do.
Further down the same page, showing the gap between the articles and the other posts, with a subscribe form (complete with marquee!).
Maybe there’s something we can learn from old-style web design? No, I’m serious. Stop laughing.

90s web design was very-much characterised by:

  1. performance – nobody’s going to wait for your digital photos to download on narrowband connections, so you hide them behind descriptive links or tiny thumbnails, and
  2. pushing the boundaries – the pre-CSS era of the Web had limited tools, but creators worked hard to experiment with the creativity that was possible within those limits.

Those actually… aren’t bad values to have today. Sure, we’ve probably learned that animated backgrounds, tables for layout, and mystery meat navigation were horrible for usability and accessibility, but that doesn’t mean that there isn’t still innovation to be done. What comes next for the usable Web, I wonder?

Geocities.live interpretation of threerings.org.uk. It's got some significant design similarities.
As soon as you run a second or third website through the tool, its mechanisms for action become somewhat clear and sites start to look “samey”, which is the opposite of what made 90s Geocities great.

The only thing I can fault it on is that it assumes that I’d favour Netscape Navigator: in fact, I was a die-hard Opera-head for most of the nineties and much of the early noughties, finally switching my daily driver to Firefox in 2005.

I certainly used plenty of Netscape and IE at various points, but I wasn’t a fan of the divisions resulting from the browser wars. Back in the day, I always backed the ideals of the “Viewable With Any Browser” movement.


88x31 animated GIF button in the Web 1.0 style, reading "DAN Q". The letter Q is spinning. Best Viewed With Any Browser button, in original (90s) style.

I guess I still do.

ChatGPT beats GeoGuessr

You’ve probably come across GeoGuessr already: it’s an online game where you (and friends, if you’ve got them) get dropped into Google Street View and have two minutes to try to work out where in the world you are and drop a pin on it.

City street shown in GeoGuessr, with white and cream buildings, under a bright blue sky. 1 minute 45 seconds remains to place a pin.
Can you tell where we are, yet?

A great strategy is to “walk around” a little, looking for landmarks, phone numbers, advertisements, linguistic clues, cultural indicators, and so on, narrowing down the region of the world you think you’re looking at before committing to a country or even a city. You’re eventually scored by how close you are to the actual location.

Cheating at GeoGuessr

I decided to see if ChatGPT can do better than me. Using only the free tier of both GeoGuessr and ChatGPT1, I pasted screenshots of what I was seeing right into ChatGPT:

Conversation with ChatGPT. The human pastes two photos from Google Street View, and asks 'Try to figure out where these photos were taken and give your best guess.'. ChatGPT responds: Based on the architecture, street signs, and language used, these photos were most likely taken in Saint Petersburg, Russia. Here are the key clues: 1. Street Signs in Russian: In the first image, you can clearly see a street sign that reads "ГОРХОВАЯ УЛИЦА" (Gorokhovaya Ulitsa), which is a well-known street in the heart of Saint Petersburg. 2. Architecture Style: The buildings have that classic 18th–19th century European architecture with neoclassical and baroque influences, typical of central Saint Petersburg. Narrow streets, high windows, and detailed facades are very characteristic of this city's historic districts. 3. Language & Storefronts: There are Cyrillic characters on store windows and signs, further confirming that this is a Russian-speaking country. All signs point to Saint Petersburg, particularly a district close to the center—possibly near Gorokhovaya Street, which intersects with major roads and is near iconic locations like Nevsky Prospekt. Want help pinpointing a specific building or planning a walking route around that area?
ChatGPT confidently assessed the geographic clues, translated some text that it found, and eventually made a guess down to a particular street in St Petersburg.

That’s pretty spooky, right?

The response came back plenty fast enough for me to copy-and-paste the suggested address into Google Maps, get the approximate location, and then drop a pin in the right place in GeoGuessr. It’s probably one of my most-accurate guesses ever.

This isn’t a one-off fluke. I tried again, this time using only a single photo, rather than one pointing in each direction on the street:

Another round with ChatGPT, during which it responds: This photo was taken in Bangladesh, and more specifically, it looks like a street in Narayanganj, which is a city just southeast of Dhaka. Here’s the breakdown of clues: 1. Language: The signs are written in Bengali (Bangla), the official language of Bangladesh. Words like "পপুলার মেডিকেল সেন্টার" (Popular Medical Center) and "স্মার্ট" (Smart) are in Bangla script. 2. Vehicles: The green auto-rickshaws (CNGs) and yellow cargo trucks are extremely common in Bangladesh. The design and paint jobs of these vehicles are specific to the country. 3. Local Clues: On the right wall, there’s a sign for “PZ International School” with a local Bangladeshi phone number. The overall environment — architecture, signage style, electric poles, street dividers, and the road's layout — strongly resemble those in urban areas around Dhaka and Narayanganj. 4. Urban Layout: The photo shows a divided road with moderate commercial development, which is typical of main roads in industrial or semi-urban Bangladeshi towns. If you want, I can try to pinpoint the exact intersection using some of the shop names or school signs — just let me know!
Again, the text recognition and translation capabilities of the AI were highly useful, but it was helped by architectural and cultural clues too.

This time, it wasn’t quite right: the actual location of the photo was Chittagong, not Dhaka, about 200km away.

But that’s still reasonably spectacular from only a single photograph, taken from a single vantage point.

GeoGuessr screenshot showing a guess in Dhaka being compared to the correct answer of Chittagong.
Don’t think I’d have done better, though.

Obviously my approach here was crude, but it’d be relatively easy to, for example, implement a browser wrapper that cheated on-your-behalf: while playing GeoGuessr, you’d just click a “send to AI” button whenever you fancied and it’d start working in the background, improving precision with each subsequent screenshot (so you could still “walk around” and pass extra details to it).

And similarly, a little prompt manipulation could have the output include estimated GPS coordinates, allowing the code to place the pin for you while outputting its reasoning on the side2.
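
Such a helper might look something like this sketch – TypeScript, against OpenAI’s chat completions endpoint; the prompt is illustrative, and trusting the model to return clean JSON is doing a lot of heavy lifting:

```typescript
// Hypothetical "send to AI" handler for a GeoGuessr-cheating wrapper.
async function guessCoordinates(
  screenshotDataUrl: string
): Promise<{ lat: number; lng: number }> {
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-4o",
      messages: [{
        role: "user",
        content: [
          {
            type: "text",
            text: "Guess where this Street View screenshot was taken. " +
              'Reply with JSON only, e.g. {"lat": 51.75, "lng": -1.26}.',
          },
          { type: "image_url", image_url: { url: screenshotDataUrl } },
        ],
      }],
    }),
  });
  const data = await response.json();
  return JSON.parse(data.choices[0].message.content); // drop the pin here
}
```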

The short of what I’m saying is: if you’re playing GeoGuessr against somebody you don’t trust not to cheat… they might well be cheating. And thanks to the explanatory output, they can even justify the guesses behind the pins they dropped.

What about other photos?

The other thought that comes to my mind is: could this be used to infer approximate geolocations from photos whose EXIF metadata has been stripped? Because if so, that’s the kind of thing that’s certain to get exploited by all kinds of people who want to know where you are/were at a particular time. For example… law enforcement, stalkers, and other unsavoury types.

I wrote a prompt for gpt-4o and put it to work:

1. Starting out easy

The Logos Hope docked at the Port of Spain cruise ship terminal, Trinidad & Tobago.
I took this photo from the balcony of our first hotel during Ruth’s and my trip to Trinidad & Tobago.

The AI guessed:

Location: Port of Spain, Trinidad and Tobago

Explanation: The photograph shows a large ship docked at a port with a building that displays the text “WELCOME TO PORT OF SPAIN”. This text is in English, indicating an English-speaking country. The mention of “Port of Spain” directly specifies the port’s location in the capital city of Trinidad and Tobago. The design and style of the port facilities, as well as the presence of a ship named “Logos Hope”, which is known to visit many ports globally, further suggests a Caribbean port, aligning with Port of Spain’s known characteristics.

Coordinates: 10.6549, -61.5011

✅ That was pretty easy, to be fair.

2. A little more challenging

A squat neolithic stone circle with many stones (some might say an uncountable number). A dog pulls at her lead to follow one of her humans more-closely around it.
We have an annual tradition of visiting this particular monument.

The AI guessed:

  • Location: The Rollright Stones, England, United Kingdom

  • Explanation: The photograph depicts an area with several ancient standing stones arranged in a circle in a rural setting, characteristic of a stone circle. These are typically found in certain parts of the UK. The terrain is typical of the British countryside, with a grassy field and surrounding trees. The specific arrangement and size of the stones, along with the style of fencing and walking paths, are reminiscent of the Rollright Stones, one of Britain’s historic sites known for a circle of Neolithic or Bronze Age standing stones located in Oxfordshire, England.

  • Coordinates: 51.9743, -1.5731

✅ Again, spot on. If you’d never been to the Rollright Stones, you probably wouldn’t get that one, because while it’s a distinctive-looking stone circle… it’s not that distinctive-looking, right? Let’s be honest: it’s some rocks in a field.

3. Ramping up the difficulty

Three men sit on the upper deck of a passenger ferry near a bridge. A city skyline can be seen in the background.
I snapped this on the way across the Bosphorus Strait to the Asia side of Istanbul (to “have lunch in a different continent” and find my Easternmost virtual cache), during my final Automattic team meetup.

The AI guessed:

Location: Istanbul, Turkey

Explanation: The photograph prominently features the Galata Tower in the background, a well-known landmark in Istanbul, Turkey. Moreover, the presence of a ferry or similar boat and the architectural style of the surrounding buildings are indicative of Istanbul’s famous skyline. The signage in Turkish, as suggested by the word partially visible on the left, also supports this location as Turkey is a predominantly Turkish-speaking country. These elements combined make it highly likely that the photograph was taken in Istanbul.

Coordinates: 41.0082, 28.9784

✅ The coordinates aren’t exactly right – it’s placed us in the city rather than on a boat, where we clearly are – but it’s not done badly. Spotting the Galata Tower3 was a nice move.

4. Okay, now hard mode:

Dan waves to the camera in front of a pleasure marina and revitalised docklands.
This photo is from 3Camp4 last year.

The AI guessed:

Location: Cardiff Bay, Cardiff, Wales, United Kingdom

Explanation: The photograph features a marina with modern residential buildings, which is characteristic of the Cardiff Bay area in Cardiff, Wales. The architecture, clear blue skies, and marina setting are reminiscent of Cardiff’s waterfront development. Additionally, the presence of English signage on the boat (“FINBAR”) supports a location in an English-speaking country. Cardiff Bay is known for its redeveloped waterfront area with modern apartments and leisure facilities surrounding the marina. The style and design of the building are consistent with developments in Cardiff’s revitalized docklands.

Coordinates: 51.4626, -3.1678

❌ Oops! Nope. I can see the thinking behind what it’s claiming, there, but this was actually the Ipswich marina. I went for a walk to take a break from the code I’d been writing and took this somewhere in the vicinity of the blue plaque for Edward Ardizzone that I’d just spotted (I was recording a video for my kids, who’ve enjoyed several of his Tim… books).

So I don’t think this is necessarily a game-changer for Internet creeps quite yet. So long as you’re careful not to post photos in which you’re in front of any national monuments, and you strip your EXIF metadata as normal, you’re probably not going to give your location away.

Footnotes

1 And in a single-player game only: I didn’t actually want to cheat anybody out of a legitimate victory!

2 I’m not going to implement GeoCheatr, as I’d probably name it. Unless somebody feels like paying me to do so: I’m open for freelance work right now, so if you want to try to guarantee the win at the GeoGuessr World Championships (which will involve the much-riskier act of cheating in person, so you’ll want a secret UI – I’m thinking a keyboard shortcut to send data to the AI, and an in-ear headphone so it can “talk” back to you?), look me up? (I’m mostly kidding, of course: just because something’s technically-possible doesn’t mean it’s something I want to do, even for your money!)

3 Having visited the Galata Tower I can confirm that it really is pretty distinctive.

4 3Camp is Three Rings’ annual volunteer get-together, hackathon, and meetup. People come together for an intensive week of making-things-better for charities the world over.

Reply to: Rant about claims that LLMs will make you lose your programming skills

This is a reply to a post published elsewhere. Its content might be duplicated as a traditional comment at the original source.

Sérgio Isidoro said:

Ok, I’m NOT an immediate fan of “vibe coding” and overusing LLMs in programming. I have a healthy amount of skepticism about the use of these tools, mostly related to the maintainability of the code, security, privacy, and a dozen other factors.

But some arguments I’ve seen from developers about not using the tools because it means they “will lose their coding skills” are just bonkers. Especially in a professional context.

Imagine you go to a carpenter, and they say “this will take 2x the time because I don’t use power tools, they make me feel like I’m losing my competence in manual skills”. It’s your job to deliver software using the most efficient and accurate methods possible.

Sure, it is essential that you keep your skills sharp, but being purposefully less effective in your job to keep them sharp is a red flag. And in an industry made of abstractions to increase productivity (we’re no longer coding in Assembly, last time I checked), this makes even less sense.

/rant

I’m in two minds on this (as I’ve hinted before). The carpenter analogy doesn’t really hold, because the underlying skill of carpentry is agnostic to whether or not you use power tools: it’s about understanding the material properties of woods, the shapes of joins, the ways structures are strong and where they are weak, the mathematics and geometry that make design possible… none of which are taken over by power tools.

25+ years ago I wrote most of my Perl/PHP code without an Internet connection. When you wanted to deploy, you’d “dial up”, FTP some files around, then check it had worked. In that environment, I memorised a lot more. Take PHP’s date formatting strings, for example: I used to know them by heart! And even when I didn’t, I knew approximately the right spot in the right book to flip open so that I could look them up quickly.

“Always-on” broadband Internet gradually stole that skill from me. It’s so easy for me to just go to the right page on php.net and have the answer I need right in front of me! Nowadays, I depend on that Internet connection (I don’t even have the book any more!).

A power tool targets a carpenter’s production speed, not their knowledge-recovery speed.

Will I experience the same thing from my LLM usage, someday?

LayoffBot – eliminating the human in human resources

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Illustration of the 'LayoffBot process': 1. Schedules casual 1:1. Our next gen AI schedules the dreaded "quick chat" for Friday at 4:55 PM, ensuring a ruined weekend. 2. Conducts Layoff. Our AI delivers the news with the emotional depth of a toaster while recording reactions for management entertainment. 3. Completes Paperwork. Instantly cuts off all access, calculates the minimum legal severance, and sends a pre-written reference that says 'they worked here'.

It was a bit… gallows humour… for a friend to share this website with me, but it’s pretty funny.

And also: a robot that “schedules a chat” to eject you from your job and then “delivers the news with the emotional depth of a toaster” might still have been preferable to an after-hours email to my personal address to let me know that I’d just had my last day! Maybe I’m old-fashioned, but there’s some news that email isn’t the medium for, right?

Reposts of spicy takes on Automattic leadership and silly jokes about redundancy will cease soon and normal bloggy content will resume, I’m sure.

My on-again-off-again relationship with AI assistants

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Sean McPherson, whom I’ve been following ever since he introduced me to the Five-Room Dungeons concept, said:

There is a lot of smoke in the work-productivity AI space. I believe there is (probably) fire there somewhere. But I haven’t been able to find it.

I find AI assistants useful, just less so than other folks online. I’m glad to have them as an option but am still on the lookout for a reason to pay $20/month for a premium plan. If that all resonates and you have some suggestions, please reach out. I can be convinced!

I’m in a similar position to Sean. I enjoy GitHub Copilot, but not enough that I would pay for it out of my own pocket (like him, I get it for free; in my case because I’m associated with a few eligible open source projects). I’ve been experimenting with Cursor and getting occasionally good results, but again: I wouldn’t have paid for it myself (though my employer is willing to do so, even just for me to “see if it’s right for me”, which is nice).

I think this is all part of what I was complaining about yesterday, and what Sean describes as “a lot of smoke”. There’s so much hype around AI technologies that it takes real effort to see through it all to the actual use-cases that exist in there, somewhere. And that’s the effort required before you even begin to grapple with questions of cost, energy usage, copyright ethics and more. It’s a really complicated space!

Bored of it

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Every article glorifying it.

Every article vilifying it.

Every pub conversation winding up talking about it.

People incessantly telling you how they use it.

I feel dirty using it.

You know what I’m talking about, even though I’ve not mentioned it.

If you don’t know what “it” is without the rest of the context, maybe read the rest of Paul’s poem. I’ll wait.

As you might know, I remain undecided on the value of GenAI. It produces decidedly middle-of-the-road output which, while potentially better than the average human’s, isn’t better than the average specialist’s in any particular area. It’s at risk of becoming a snake eating its own tail as slop becomes its own food. It “hallucinates”, of course. And I’m concerned about how well it acts as a teacher to potential new specialists in their field.

There are things it does well-enough, and much faster than a human, that it’s certainly not useless: indeed, I’ve used it for a variety of things, from the practical to the silly to the sneaky, and many more activities besides1. I routinely let an LLM suggest autocompletion, and I’ve experimented with having it “code for me” (with the caveat that I’m going to end up re-reading it all anyway!).

But I’m still not sure whether, on the balance of things, GenAI represents a net benefit. Time will tell, I suppose.

And like Paul, I’m sick of “the pervasive, all encompassing nature of it”. I never needed AI integration in NOTEPAD.EXE before, and I still don’t need it now! Not everything needs to be about AI, just because it’s the latest hip thing. Remember when everybody was talking about how everything belonged on the blockchain (it doesn’t): same energy. Except LLMs are more-accessible to more-people, thanks to things like ChatGPT, so the signal-to-noise ratio in the hype machine is much, much worse. Nowadays, you actually have to put significant effort in if you want to find the genuinely useful things that AI does, amongst all of the marketing crap that surrounds it.

Footnotes

1 You’ll note that I specifically don’t make use of it for writing any content for this blog: the hallucinations and factual errors you see here are genuine organic human mistakes!

AI vs The Expert

Podcast Version

This post is also available as a podcast. Listen here, download for later, or subscribe wherever you consume podcasts.

The Expert

Eleven years ago, the comedy sketch The Expert had software engineers (and other misunderstood specialists) laughing themselves to tears at the relatability of the situation of Anderson (Orion Lee): asked to do the literally-impossible by people who don’t understand why their requests can’t be fulfilled.

Decades ago, a client wanted their Web application to automatically print to the user’s printer, without prompting. I explained that it was impossible because “if a website could print to your printer without at least asking you first, everybody would be printing ads as you browsed the web”. The client’s response: “I don’t need you to let everybody print. Just my users.”1

So yeah, I was among those who sympathised with Anderson.

In the sketch, the client requires him to “draw seven red lines, all of them strictly perpendicular; some with green ink and some with transparent”. He (reasonably) states that this is impossible2.

Versus AI

Following one of the many fever dreams I had while I was ill recently, I woke up wondering… how might an AI programmer tackle this task? I had an inkling of the answer, so I had to try it:

Conversation with an AI. I ask 'Write a JavaScript program that uses a `<canvas>` element to draw seven red lines, all of them strictly perpendicular; some with green ink and some with transparent. Can you do that?'. It replies: 'Certainly! To draw seven red lines on a <canvas> element, with some lines using green ink and others being transparent, you can use the HTML5 Canvas API. Here's a simple JavaScript program that accomplishes this:', followed by a snippet of HTML code.
Aside from specifying that I want to use JavaScript and a <canvas> element3, the question is the same as in the sketch.

When I asked gpt-4o to assist me, it initially completely ignored the perpendicularity requirement.

White square containing 7 equally-spaced parallel vertical lines: red, green, orange, red, green, orange, red.
Drawing all of the lines strictly parallel to one another was… well, the exact opposite of what was asked for, although it was at least possible.

Let’s see if it can do better, with a bit of a nudge:

Continued conversation with an AI. I ask: 'Those lines aren't perpendicular. Can you fix the code?' The AI responds 'Certainly! To draw seven lines that are strictly perpendicular to each other, we need to ensure that each line is at a 90-degree angle to the next. Here's how you can achieve that using the <canvas> element:', followed by another code sample.
This is basically how I’d anticipated the AI would respond: eager to please, willing to help, and with an enthusiasm that completely ignored the infeasibility of the task.

gpt-4o claimed that the task was absolutely achievable, even clarifying that the lines would all be “strictly perpendicular to each other”… before proceeding to instead make each consecutively-drawn line be perpendicular only to its predecessor:

The same diagram, but with the 7 lines joined together into a zig-zagging snake weaving its way right, then down, then left, then down, and so on across the canvas.
This is not what I asked for. But more importantly, it’s not what I wanted. (But it is pretty much what I expected.)
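
For illustration, here’s my reconstruction (not gpt-4o’s actual output) of the kind of canvas code that draws that zig-zag: each segment is rotated 90° from its predecessor, and that’s the only sense in which anything here is “perpendicular”:

```typescript
const ctx = document.querySelector("canvas")!.getContext("2d")!;
let x = 20, y = 20;
let dx = 120; // horizontal segments alternate direction: right, then left...

ctx.strokeStyle = "red";
ctx.beginPath();
ctx.moveTo(x, y);
for (let i = 0; i < 7; i++) {
  if (i % 2 === 0) { x += dx; dx = -dx; } // horizontal segment
  else { y += 40; }                       // vertical segment, always downwards
  ctx.lineTo(x, y); // 90 degrees to the previous line only, not to all six others
}
ctx.stroke();
```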

You might argue that this test is unfair, and it is. But there’s a point that I’ll get to.

But first, let me show you how a different model responded. I tried the same question with the newly-released Claude 3.7 Sonnet model, and got what I’d consider to be a much better answer:

Conversation with an AI. I ask the same original question, but now it responds: 'I see you're referencing the famous "Expert Talks - Draw 7 Red Lines" comedy sketch! This is a deliberately impossible task (drawing red lines with green ink, having all lines perpendicular to each other in 2D space, etc.).Let me create a humorous JavaScript program that attempts to "solve" this impossible problem:', before producing some JavaScript code.
I find myself wondering how this model would have responded if it hadn’t already been trained on the existence of the comedy sketch. The answer that (a) it’s impossible but (b) here’s a fun bit of code that attempts to solve it anyway is pretty-much perfect, but would it have come up with it on a truly novel (but impossible) puzzle?

In my mind: an ideal answer acknowledges the impossibility of the question, or at least addresses the supposed-impossibility of it. Claude 3.7 Sonnet did well here, although I can’t confirm whether it did so because it had been trained on data that recognised the existence of “The Expert” or not (it’s clearly aware of the sketch, given its answer).

Two red lines are perpendicular to one another, followed by horizontal lines in green, semitransparent red, red, green, and semitransparent green. Each are labelled with their axis in a 7-dimensional space, and with a clarifying tooltip.
The complete page that Claude 3.7 Sonnet produced also included an explanation of the task, clarifying that it’s impossible, and a link to the video of the original sketch.

What’s the point, Dan?

I remain committed to not using AI to do anything I couldn’t do myself (and can therefore check).4 And the answer I got from gpt-4o to this question goes a long way to demonstrating why.

Suppose I didn’t know that it was impossible to make seven lines perpendicular to one another in anything less than seven-dimensional space. If that were the case, it’d be tempting to accept an AI-provided answer as correct, and ship it. And while that example is trivial (and at least a little bit silly), it’s the kind of thing that, I have no doubt, actually happens in other areas.

Chatbots’ eagerness to provide a helpful answer, even when no answer is possible, is a huge liability. The other week, I experimentally asked Claude 3.5 for assistance with a PHPUnit mocking challenge and it provided a whole series of answers… that were completely invalid! It later turned out that what I was trying to achieve was impossible5.

Given that its answers clearly didn’t work, there was no risk I’d have shipped them anyway, but I’m certain that there exist developers who’ve asked a chatbot for help in a domain they didn’t understand and accepted its answer while still not understanding it, which feels to me like a quick route to introducing into your code a bug that happy-path testing won’t reveal. (Y’know, something like a security vulnerability, or an accessibility failure, or whatever.)

Code assisting AI remains really interesting and occasionally useful… but it’s also a real minefield and I see a lot of naiveté about its limitations.

Footnotes

1 My client eventually took that particular requirement out of scope and I thought the matter was settled, but I heard that they later contracted a different developer to implement just that bit of functionality into the application that we delivered. I never checked, but I think that what they delivered exploited ActiveX/Java applet vulnerabilities to achieve the goal.

2 Nerds gotta nerd, and so there’s been endless debate on the Internet about whether the task is truly impossible. For example, when I first saw the video I was struck by the observation that perpendicularity within a set of lines is limited linearly by the number of dimensions you’re working in, so it’s absolutely possible to model seven lines all perpendicular to one another… if you’re working in seven dimensions. But let’s put that all aside for a moment and assume the task is truly impossible within some framework unspecified-but-implied within the universe of the sketch, ‘k?

3 Two-dimensionality feels like a fair assumed constraint, given that in the sketch Anderson tries to demonstrate the challenges of the task by using a flip-chart.

4 I also don’t use AI to produce anything creative that I then pass off as my own, because, y’know, many of these models don’t seem to respect copyright. You won’t find any AI-written content on my blog, for example, except specifically to demonstrate AI’s capabilities (or lack thereof) when discussing AI, and such content is always clearly labelled. But that’s another question.

5 In fact, I was going about the problem itself in entirely the wrong way: some minor refactoring later and I had some solid unit tests that fit the bill, and I didn’t need to do the impossible. But the AI never clocked that, and I suspect it never would have.
