Ad Infinitum

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

For 25 years, Google Search was built on a contract. The web provided the content – billions of pages, freely linked, freely crawled. In return, Google sent people back. The link was the unit of exchange. It’s what made the Web thrive as an information system: you publish, Google indexes, users click through, and value flows back to the source. Win-win.

That contract is now broken. Generative UI doesn’t link to your article, necessarily. It absorbs your article, synthesizes it into a widget, and presents it as Google’s own answer. Information agents don’t send users to websites. They deliver “synthesized updates” with maybe a link or two buried at the bottom. The web was the scaffolding Google needed to build its index, to train its models, to accumulate the world’s information, and put ads next to it to get filthy rich. Now that the content is inside the system, the scaffolding is no longer needed. Google is creating its own context.

Google thinks it no longer needs the Web to deliver answers. And it no longer needs ad slots to deliver ads. What it needs is you. Your emails, your files, your calendar, your purchase history, your travel plans – all flowing into Spark, all building the richest possible picture of who you are and what you’re likely to click on. That’s exactly the kind of personal context those auction models need to work. The prediction module in the prominence allocation framework doesn’t run on keywords. It runs on knowing you.

An excellent piece by Matthias Ott, discussing revelations from this year’s Google I/O. In particular, the imminent pivot of Google Search from its lifelong “query in, list of links out” model to a wholesale “query in, LLM output out” one.

This isn’t just about putting AI output at the top of the search results, as I gather they do today, but about getting rid of search “results” entirely, and running everything through the model.

To which Matthias wisely asks: well, how will ads work then? Google’s business model is based on mining your personal data and shoving ads in your face. Where do they go in a search interface that isn’t really a search but a “helpful” AI?

It turns out there’s a few approaches that Google seem to be considering, but what they’ve all got in common is the idea that marketers will be able to “influence” the LLM’s token generation, perhaps by using an LLM of their own to decide whether you (based on everything Google knows about you) are worth marketing to, and how much they’ll pay to do so, and then this input being “weighted” against competing advertisers and actual ingested data in order to feature advertiser-influenced content woven directly into the output of the LLM.
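As a purely illustrative sketch of what “weighting” advertiser input against organic output could look like, here's a toy blend of next-token probabilities. Everything in it is an assumption made for illustration: the function name, the `ad_share` parameter, the brand names and all of the numbers are invented, and none of it is taken from Google's actual proposals.

```python
def blend_next_token(organic, sponsored, bids, ad_share):
    """Blend an organic next-token distribution with advertiser-preferred ones.

    organic:   token -> probability from the model alone
    sponsored: advertiser -> (token -> probability that advertiser would like)
    bids:      advertiser -> bid; a bigger bid buys a bigger slice of ad_share
    ad_share:  fraction of the total probability handed over to advertisers
    """
    total_bid = sum(bids.values())
    if total_bid == 0:
        return dict(organic)  # nobody is paying: the output is untouched
    tokens = set(organic)
    for preferred in sponsored.values():
        tokens |= set(preferred)
    blended = {}
    for token in tokens:
        # Each advertiser's influence is proportional to their share of the bids.
        paid = sum(
            (bids[adv] / total_bid) * sponsored[adv].get(token, 0.0)
            for adv in bids
        )
        blended[token] = (1 - ad_share) * organic.get(token, 0.0) + ad_share * paid
    return blended


# Hypothetical example: two advertisers compete to be the recommended brand.
organic = {"walk": 0.5, "cycle": 0.4, "BrandA": 0.05, "BrandB": 0.05}
sponsored = {"Advertiser A": {"BrandA": 1.0}, "Advertiser B": {"BrandB": 1.0}}
bids = {"Advertiser A": 3.0, "Advertiser B": 1.0}
blended = blend_next_token(organic, sponsored, bids, ad_share=0.4)
```

In this toy, the higher bidder's brand becomes the single most likely word, even though the “organic” model barely rated it. Nothing in the output shows how much of that came from the bid.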

David Cross, as Arrested Development's Tobias Fünke, bites into a burger in a Burger King restaurant, with a Chicken Tendercrisp advertisement prominently displayed in the background.

Superficially, this sounds a little like product placement, like you sometimes see in American-made TV shows and movies. You know, where one character says “I’m going to go get a drink refill. You know you can get unlimited refills on any drink you want… and it’s free?”, and the next says “It’s a wonderful restaurant.”, while they’re sitting in Burger King.

Except this isn’t about saying “hey, people who watch this show are probably high and want a snack, let’s push our fast food their way”. It’s individualised.

It’s more like if the characters, knowing that your Gmail account had a recent email about some test results, and your Google Calendar had an appointment tomorrow at the doctor, started talking about a particular brand of medication to, y’know, put the idea into your head.

Scene from Futurama, showing a display of Lightspeed Briefs with the slogan 'as seen in your dreams'.
The future presented in Futurama was supposed to be a joke, right?

We’re not at the point of completely-customised TV shows – nor the injection of commercials into dreams – yet. But Google’s plans, which blur the already-grey boundaries between organic and advertising content, are pretty insidious.

Assuming you’re in their ecosystem already, and possibly even if you’re not… Google may already be looking at your search terms, your calendar, your emails, your location and schedule, who you communicate with and how often, which web pages you visit, which apps you use, where you spend money, etc. (Seriously: if you somehow haven’t begun de-googling already, what are you waiting for?)… there’s a huge potential for misuse there.

But the arms race between people blocking or learning-to-ignore ads and advertisers trying to foist them upon us continues, and Google thinks this is an acceptable next step in escalating that. Using an insane amount of energy to recycle other people’s work without crediting them, in order to mash up the result with information they know about you in order to deliver you an unverifiable soup of words which might answer your question but with no clue how much or little commercial interest went into producing it, or by whom.

That’s some proper Darkest Timeline shit, right there.

You don’t need to take my or Matthias’s word for it (although you should read his full post because it’s excellent): just look at the concept videos in Google’s blog post on the subject. You’ll also notice that almost-nowhere in their demos do Google even hint at the possibility of linking-out to anybody else’s website: there’s like one “visit site” button that appears at the very end of one of the flows, after the agent has done its thing. Google is building a walled garden where they hope you’ll live, served by their AI butler on behalf of the companies who pay Google to tell you about their products.

Ugh.


AI is not a person

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

You didn’t “have a conversation” with ChatGPT.

It doesn’t “think you should…” It doesn’t think.

It didn’t “tell you that…” It doesn’t speak.

It doesn’t “feel that the best option is…” It doesn’t feel.

AI is a cheap parlor trick. You provide words, and it provides words back that are most likely to occur alongside the words you provided.

A useful reminder for the next time you’re tempted to personify or humanise an LLM.

LLMs are statistical tools. There are some things that the statistics of language can be good at, especially on average: stuff like summarisation, sentiment analysis, pattern identification, and checking for internal consistency.

But they’re just maths. They’re not a person.

It’s not even that they don’t care about you or don’t want to help you. They don’t even go that far: they’re incapable of “caring” or “wanting” in the first place. What they do is take all of the information they’ve ingested, plus their training and prompt, plus the conversation you’ve had with them so far, plus a random number, and produce output which is, after a fashion, a prediction of what comes next.
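If it helps to see “context plus a random number in, prediction out” written down, here's a deliberately tiny sketch. The vocabulary and probabilities in `TOY_MODEL` are made up for illustration. A real LLM computes its probabilities from billions of learned weights rather than a lookup table, but the final step, a weighted random draw, is the same in spirit.

```python
import random

# Hypothetical next-word probabilities, keyed by the previous two words.
# In a real LLM these come from the model's weights and the whole context.
TOY_MODEL = {
    ("the", "cat"): {"sat": 0.6, "slept": 0.3, "exploded": 0.1},
    ("cat", "sat"): {"on": 0.9, "down": 0.1},
    ("sat", "on"): {"the": 1.0},
    ("on", "the"): {"mat": 0.7, "sofa": 0.3},
}


def next_word(context, rng):
    """Pick the next word with a weighted random draw from the prediction."""
    distribution = TOY_MODEL.get(tuple(context[-2:]))
    if distribution is None:
        return None  # the toy model knows nothing about this context
    words = list(distribution)
    weights = [distribution[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def generate(prompt, seed, max_words=10):
    """Extend the prompt one predicted word at a time."""
    rng = random.Random(seed)  # the "random number" part of the recipe
    words = prompt.split()
    for _ in range(max_words):
        word = next_word(words, rng)
        if word is None:
            break
        words.append(word)
    return " ".join(words)


print(generate("the cat", seed=1))
print(generate("the cat", seed=2))
```

The same prompt and seed always give the same output, and changing the seed can change the “answer”. At no point does anything in there care, want, or think.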

As always: that’s not to say it’s useless. (It’s also not to say it’s always useful.) But as a tool, it’s pretty opaque to most normal people.

Unless you’ve really taken a deep-dive into understanding how LLMs work, they must seem like magic (hell; speaking as somebody who has taken such a deep-dive, they sometimes seem like magic!). I’m sure that some of the time, they must seem like they’re a living thing, or at least an approximation of one.

But they’re not. And it’s important to remember that.

So Unbelievable it Sounds Like you Googled It

“To Google”

When it first appeared, Google Search was a breath of fresh air. Simple, powerful search that Just Worked. It’s little wonder that the phrase “to Google” something became synonymous with “to search for” something.

Somewhere, Google lost its way.1 Perhaps the latest example of that is the injection of AI into every search2:

I’ve been to the cinema a few times lately so I’ve seen the Google AI ad that inspired me to make this parody… a lot.
Music by Dead Tubes Foundation.

Apparently the kids these days don’t “Google it”. At least, not in their colloquialisms: they’re still probably using the search engine.

They say that they’ll “search it up”.

And this presents us with an opportunity:

Let’s reclaim the phrase “to Google”

I was inspired by a blog post by Mr Scribs (itself inspired by a Fediverse conversation), discovered via Bubbles:

We should turn the verb use of googling into an insult.

Example: “That’s so unbelievable it sounds like you googled it.”

I love this, and I’m absolutely going to start using it. “To Google” can absolutely transform from meaning “to search for, using a Web search engine” to meaning:

  • to seek knowledge in a lazy and convenient way, without regard for its accuracy
    (“I Googled from a guy at the pub that 5G caused Covid”)
  • to acquire information that can’t accurately be sourced or verified
    (“don’t quote me on that, though: I Googled it”)
  • to prefer an answer to a question that’s mildly more-convenient for the asker, even if getting it was ethically problematic
    (“pass me the jump leads, I’m going to Google one of the hostages”)

DeGoogling is so… 2010s. Let’s make the 2020s the decade where we redefine Google as a verb, in a way that better represents what it means to continue to buy in to the ever-increasingly toxic Google Search ecosystem.

Footnotes

1 Maybe it was when the Search-Chrome-Analytics trifecta positioned the company as both the assistant to, and the adversary of, the users. Maybe it was when they dropped “don’t be evil”. Maybe it was when they stopped listening to users, or when they stopped listening to their own developers. Maybe it was when they helped sterilise the Web. Maybe it was AMP and the way they abused their monopoly to force it down everybody’s throats. Maybe it was when they killed (insert your favourite service here). Maybe it was when they started enshittifying Android. Make your own mind up.

2 Yes, I’m aware that some other search engines include AI summaries in results, too. But they all seem easier to turn off… and I’m yet to see a cinema advertisement about the fact that they do it for anything other than Google Search.

Hackers Simply Asked Meta AI to Give Them Access to High-Profile Instagram Accounts. It Worked

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Hackers say that they used Meta’s AI support chatbot to break into a host of high-profile Instagram profiles by asking the support bot to change the email address associated with the target account. The claims coincide with a series of high-profile Instagram account takeovers, including the Barack Obama White House account, the Chief Master Sergeant of Space Force’s account, and Sephora’s account.

Well this is unsurprising and unshocking. Turns out that if you give your chatbot help interface unrestricted access to your backend systems – rather than, say, the access level of the human talking to it – then obviously hackers are going to try to jailbreak it in ways that you can’t possibly predict or guardrail against and, if/when they succeed, they’ll break into all the systems to which you’ve given the chatbot access.
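To illustrate the difference, here's a minimal sketch of a chatbot “tool” that acts with the access level of the human in the conversation rather than with its own all-powerful credentials. All of the names here (`Session`, `change_email`, the `ACCOUNTS` store) are hypothetical and have nothing to do with Meta's actual systems. The point is only where the permission check lives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Who the chatbot is talking to, as established by a normal login."""
    user_id: str


class NotAuthorised(Exception):
    """Raised when the logged-in human isn't allowed to do what was asked."""


# Hypothetical account store: account ID -> contact email address.
ACCOUNTS = {"alice": "alice@example.com", "bob": "bob@example.com"}


def change_email(session, account_id, new_email):
    """Tool exposed to the chatbot for changing an account's email address.

    The check uses the session of the human in the conversation, not the
    model's say-so: however cleverly the model is talked into requesting a
    change, it can't do more than that human could do themselves.
    """
    if session.user_id != account_id:
        raise NotAuthorised(f"{session.user_id} may not modify {account_id}")
    ACCOUNTS[account_id] = new_email
```

With this shape, a successful jailbreak gets the attacker exactly what they could already do through the regular settings page: change their own email address.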

This shouldn’t even have to be said. Meta’s mistake here is so self-evident that they should be embarrassed.

Disabling AI in WordPress 7.0

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Because I have access to wp-config.php, I added the following to my file:

define( 'WP_AI_SUPPORT', false );

A useful tip.

Personally, I’ve got what feels like an even-better approach (for me, at least): I switched to ClassicPress a year and a bit ago, and haven’t looked back! It’s a stripped-down fork of WordPress with no Gutenberg, lighter JavaScript, and a handful of other features… plus ClassicPress is already AI-free and staying that way.

This isn’t to say that you can’t use AI with ClassicPress. Just that you’re not having to install the feature if you’re never going to use it. With WordPress’s good plugin architecture it seems strange to me that such divisive features would become part of the core product, but that just seems to be the direction that the project’s been going in for a while now.

Is AI Profitable Yet?

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Screenshot of a table and graph that shows all AI companies spending significantly more money than they make... except for NVidia, who're making bank.

No surprises here, but it’s interesting/staggering to see quite how large the disparity between spending and profit is for some of these companies.

I enjoy the fact that there’s a real-time ticker on the site so you can watch Amazon (for example) burn five thousand dollars a second.

When I tell people that generative AI, as it’s currently used, is unsustainable, this is what I’m talking about. Unless there’s a quantum leap in AI efficiency (for which I’ve seen no evidence of the feasibility) or a dramatic increase in the charged cost of LLM services (on the order of a tenfold increase assuming the increased cost does not drive any customers away; more if it does), this whole thing looks like a house of cards.
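The “more if it does” part is easy to put numbers on. Here's a back-of-envelope sketch, assuming (as a simplification of my own) that costs stay fixed when customers leave. The tenfold baseline is the figure above, and the retention rates are hypothetical.

```python
def required_price_multiplier(base_multiplier, retention):
    """How much prices must rise to bring in the same revenue as a
    `base_multiplier` increase would if nobody left, when only a
    `retention` fraction of customers stick around to pay it."""
    if not 0 < retention <= 1:
        raise ValueError("retention must be in the range (0, 1]")
    return base_multiplier / retention


# Hypothetical retention rates: everybody stays, 80% stay, half stay.
for retention in (1.0, 0.8, 0.5):
    multiplier = required_price_multiplier(10, retention)
    print(f"{retention:.0%} of customers stay: prices rise {multiplier:.1f}x")
```

Lose half your customers and the tenfold increase becomes twentyfold, which would presumably drive away yet more customers.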

Bloomscrolling & Agentic Intelligence

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

A lot of the AI bubble – and that’s what it is, for all there are useful things inside there – is based on “Invest now, because when it works it’ll be fantastic!” rhetoric that’s like investing in a mainframe company in the late 60s on the basis that smartphones will take over the world. We’re moving a lot faster than mainframes went to PCs, but it’s important to invest in the things you can do with the system that work *now*.

There isn’t a good consumer use for AI right now. ChatGPT is a terrible source of information, confidently wrong in a way that sounds human enough to cause delusion and psychosis.

Things that AI/LLM tech is good for right now – pattern matching, repetitive tasks, logic flow – have some great business cases (It’s made some amazing breakthroughs in satellite and medical imagery, it’s got a bright future in automated transcription), and I think there’s a good case for it in content moderation (Yeah, it’s not great at it, but given the sick shit content mods on Facebook have had to deal with has given them cPTSD, I strongly believe it should be a machine job). Its use for writing, music, translation, or art is still at the very least questionable and at the most utterly immoral.

Well-said, Aquarion!

The current generation of Generative AI isn’t useless. But its uses are quite specific and it certainly does more-harm-than-good that it’s promoted as an “everything” solution to every problem. I’ve used some form of agentic coding for several years, mostly of the “spicy autocomplete” variety1, and I mostly agree with Aquarion’s observations.

The whole post is an enjoyable tale.

Footnotes

1 My experiments with “vibe coding” have shown me that AI working alone can produce usually-functional code to specification, but that code is often of low quality and rarely maintainable, even by the AI.

Built In Obsolescence

This post contains and links to (clearly-identified) AI-generated content. As remains the case, none of my writing on this blog was generated by AI.

Imagine my excitement to learn that Pagan Wanderer Lu just dropped a new EP, Built In Obsolescence. And then imagine my horror to discover that it’s actually produced by P-AI-gan Wanderer Lu; an AI that’s been given PWL lyrics and some artistic direction.

Wot.

AI-generated EP cover of Built In Obsolescence by PAIgan Wanderer Lu, showing a neon digital outline of Andy.
The album art’s clearly also AI-generated, and that’s… well… you know. At least this robot hand has got the correct number of fingers.

Nothingness is what silicon dreams

My younger child’s been getting into PWL in a big way lately. As a result of this, I ended up making time for a careful re-listen to a lot of the back catalogue. This in turn inspired a blog post last year in which I mentioned that Checker Charlie’s observations about humans replacing their work with machine effort feel increasingly prophetic in the age of generative AI. That’s something I didn’t see in it when I first reviewed it 13 years prior.

I’ve played with AI-generated music a couple of times myself, of course, mostly as an academic exercise. And it’s becoming more and more apparent that it’s hard to avoid bumping into it in the “real world”.

Early efforts at AI music were pretty unconvincing, always sounding a bit auto-tuney, frequently struggling to stress lines in the right places, and tripping over themselves when they try to do anything even remotely more-interesting than a simple repeating melody atop a predictable chord sequence. But they’re getting… shall we say… “better”, and there have been times nowadays when I’ve gotten some way through a track before realising that I’m listening to AI.

At least PWL’s being honest about it and declaring at the outset that this is AI-generated art. There’s plenty of folks using AI to generate content online and not declaring it, which is pretty awful1. Anyway: in this EP the AI’s moderately well-concealed and listening casually to most of the tracks I wouldn’t have noticed it if I hadn’t been told2.

Is there life enough in these chords?

So I listened to the EP. Three times.

The cover of Checker Charlie, I’m sad to admit, works. It’s got the feel of early-nineties pop, full of synths and saccharine, but instead of insipid lyrics about love it benefits a lot from Andy’s lyrical prowess. It’s a bouncy bop that would be forgettable if it weren’t for the excellent story told by the words is, I suppose, what I mean to say. And, of course, it’s the song that would have made me think about this. Anyway: I enjoyed it and would absolutely listen to it again, and I don’t know what that says about me, about the song, or anything else.

Uncanny Valley doesn’t work as well. Musically, it feels like a new artist in 2012 drew inspiration from their dad’s new wave albums but wanted to make it sound more like Carly Rae Jepsen was collabing with Daft Punk. And the result is kind-of…flat? Could I even say… soulless? It feels like it might have been the B-side of their cover of Chemicals Like You, which rolls out next in the same vein. Twice was probably enough for these two.

Repetition 4 is among my favourite – let’s say top 15? – Pagan Wanderer Lu songs and the AI’s cover of it starts so strong. It finishes pretty strong too. The voice it’s chosen shows only a hint of uncanny-valley-autotune and it wails plaintively. The most human-made bits – the lyrical themes of fighting for creativity against your own struggles as a vulnerable and flawed human “machine” – remain solid. I really expected to love this one! But by the time we were half way through the song it felt… musically-repetitive. You know when you get a pop cover of a classic song sometimes3 and you feel like the cover artist… missed the point somehow? That’s what this feels like to me.

The repetitions of “we are all machines… for dancing” in the original felt meaningful and real; a human’s cathartic resignation to pleasure in the simple things we all enjoy, despite the challenges of life… but the AI cover adds this kind of doo-woppy backing vocals that subtract, rather than adding to, the meaning. I’m not saying it ruins it – it’s still a fun and bouncy version of a great song… but it’s one of those covers that leaves you longing for the original.

And then there’s the “unaligned version” of Uncanny Valley. I’m not sure if the introduced distortions in this version are AI-generated or not. They don’t feel like the kinds of “creative” choices that any AI I’ve played with would make, so I suspect this represents a closer human intervention in the AI’s process: humans imitating machines imitating humans, perhaps? Anyway: the change doesn’t add anything for me.

Had this been produced entirely by a human, I’d say that the EP consists of one track I’d add to my everyday playlist (the cover of Checker Charlie), maybe one or two tracks that I “wouldn’t necessarily skip” if they came up on a random shuffle while I was driving… and the rest just feels too much like “bad cover” vibes.

And that’s as much of a review as I’m willing to give, for the reasons touched-upon below.

Building the engines of our own defeat

I continue to have several issues with the widespread use of generative AI, and in particular I have problems with it being used in the production of art. Those are partially mitigated by it being used by an artist to remix their own work, and partially mitigated by the transparent declaration of the use of AI by the publisher, both of which are true in this case. But many issues (ethical, environmental, etc.) still remain.

Perhaps the biggest of which in this case is my concern that we’re using automation wrong.

As a child, I was optimistic about a future in which machines would take away the boring and repetitive work that humans do, leaving us free to pivot to experimental and experiential roles: the joy of working hard in the quest of discovery and of creativity. But instead, the predominant popular use of generative AI is to replace exactly those things, leaving humans only with an increasing amount of drudgery, review, and fact-checking. Where did we go wrong?

Don’t get me wrong: I love that Pagan Wanderer Lu has created this EP. Taking art that he’s created, whose concept touches on the concepts of AI… and feeding them into an actual AI for reinterpretation is transformative. It’s worthy of discussion as a piece of art in its own right. And the result is… well, some of it’s good, and other bits are okay.

What I don’t like is what it represents: the wider societal issue of the mainstream use of these technologies that have enormous unsolved problems.

So I guess… I appreciate the cognitive dissonance of enjoying a piece of music and disliking what it means?

Footnotes

1 Whether or not the side-effect of undisclosed AI-generated content “poisoning the well” for future AI training is a good or bad thing remains an open question, in my mind, but it’s certainly a real phenomenon. You know how we salvage the wrecks of ships sunk before the atomic age because they’re untainted by man-made radioactivity, which makes them useful for special purposes? It feels like the Internet before the explosion in generative AI may provide a similar cultural resource for future AI training, if you see what I mean.

2 And assuming I wasn’t already familiar with the artist, who doesn’t usually sound like an auto-tuned female singer.

3 I don’t have a specific example so I hope this is a universal experience!

Coding Is When We’re Least Productive

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

I potentially saved my client a bunch of money and embarrassment with that 3-line change.

Now, I consider that a productive day.

But had I been measured on my contribution by lines of code, or commits, or features finished, it would have been seen as a very unproductive day by my manager.

A great anecdote and some wise words from Jason Gorman on the nature of productivity and code.

This matches my feeling on AI. It’s good at making lots of code. Sometimes it even writes the right code. But something it rarely demonstrates skill at is comprehending the bigger issue. I’m sure we’re already seeing developers who “game” their employers’ productivity metrics, to the detriment of the end users, by having AI make “more” code without having to engage their brain and actually understand the problem.

(And, of course, there are employers who, whether intentionally or not, promote this kind of behaviour through their policies and success metrics.)

NHS England rushes to hide software over AI hacking fears

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

NHS England has issued new guidance to staff, which has been shared with New Scientist, that demands existing and future software be pulled from public view and kept behind closed doors. “All source code repositories must be private by default. Repositories must not be public unless there is an explicit and exceptional need, and public access has been formally approved,” says the new guidance. The deadline for making code private is 11 May.

Last month, an AI created by Anthropic called Mythos was widely reported to be capable of discovering flaws in virtually any software, potentially allowing hackers to break into systems running it.

NHS England’s guidance specifically points to Mythos as the cause for the new measures.

Yet again, “AI” is the reason why we can’t have nice things on an open and transparent Web.

This is bad, of course. But the worst part is the illusion it helps feed that closed-source software is necessarily more-secure than open-source software. Obviously it’s all much more-complex than that. Indeed, the article goes on to quote Terence Eden thoroughly debunking the entire line of thought:

“Is it possible that Mythos will scan a repository and find a bug? Yes, 100 per cent likely. Is that going to be a bug that causes a security issue in a live NHS service somewhere? Almost certainly not,” says Eden. “I think it’s someone in NHS England buying into the hype that Mythos is going to cause the end of security as we know it and getting a bit panicked.”

He’s right. This policy change is unlikely to improve the security of any of the affected pieces of NHS software (for much of which, the code is already out-there and archived, and so removing it from the Internet now is pretty pointless). If it’s going to be attacked, it’ll be attacked, and the resources that the bad guys have for probing a whole database worth of CVEs or fuzz-testing the extremities makes the availability of vulnerability-scanning AI pretty-close to irrelevant.

At least if it were open source then the good guys would have a chance of helping out… as well as we, the taxpayers who made the software possible, being able to see where our money was going!

Altogether a bad move by the NHS, here.

rejecting convenience

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

why bother going to the brick-and-mortar store? amazon is more “convenient”. why bother cooking a nice meal for yourself? doordash and uber eats are more “convenient”. why go out and socialize with people? facebook is more “convenient”. why use a digital camera, camcorder, or polaroid? your smartphone is more “convenient”. why bother going to the theater or concerts? netflix and spotify are more “convenient”. why bother making art? asking an AI to generate it for you is more “convenient”.

well, i say nuts to that. from now on, i’m going to make my life as inconvenient as possible. i’m going to go to the store and buy stuff in person. i’m going to make my own food with my own hands. i’m going to socialize with people face-to-face. i’m going to use a true camera instead of my phone’s camera. i’m going to buy blu-rays, DVDs, and CDs instead of streaming. i’m going to take my time when creating, watching, playing, and reading a work of art.

I’m seeing a growing movement in indieweb, revivalist, and adjacent circles that expresses RNotté’s sentiment: that the endless (and highly-marketable) quest for increased convenience in our lives has gained us free time, but we’ve lost something along the way.

What we’ve lost varies from case to case, but includes freedom (from lock-in to subscription services), creative satisfaction (from convenient “artistic” expression), privacy (from becoming the product, packaged-up by big-data advertising-funded tools), and social interactions (from so much of “social” media).

But reading RNotté share their thoughts on the matter today was the first time that it’s reminded me of The Matrix.

Framegrab from The Matrix. In the foreground is the silhouette of Morpheus, who is about to be interrogated by Agent Smith, a man in a suit at the windowed far end of an office.
The connection was probably helped by the fact that I rewatched the film pretty recently.

There’s a bit where Agent Smith says, to his captive the rebel captain Morpheus:

Did you know that the first Matrix was designed to be a perfect human world? Where none suffered, where everyone would be happy. It was a disaster. No one would accept the program. Entire crops were lost. Some believed we lacked the programming language to describe your perfect world.

Smith goes on to elucidate that his personal explanation for this fault was that humans depend upon suffering and misery, while acknowledging that there are other explanations. And perhaps we’ve touched upon one.

Perhaps humans – all humans – have a limit for how much they’re willing to accept convenience as compensation. Connected humans in The Matrix gain a convenient life, superficially superior to the struggle for survival experienced by humans living in the real world, short on food and hunted by machines. But to get that, they trade away their individual ability to become aware of the truth and, collectively, the ability for humanity to shape its own destiny. But there’s something about the imbalance of power in the arrangement that niggles in human minds, and some rebel against the established order… and are joined by others who are shown that an alternative is available.

Clearly – as RNotté and others show – faceless technological forces need not go quite so far as enslaving an entire species before “convenience” stops being a tolerable mitigation!

I’m not convinced that seeking out inconvenience is in itself a good. But questioning what your conveniences are worth and what you’re paying for them… that’s definitely worthwhile.


Once You’re Asking the Right Question, You Don’t Need To Ask!

Folks at work have been encouraging me to make more use of generative AI in my workflow1; going beyond my current “fancy autocomplete” use and giving my agents more autonomy. My experience of such “vibe coding” so far has been… mixed2, but I promised I’d revisit it.

One thing that these models are usually effective at is summarisation3. This is valuable if you’re faced with a large and unfamiliar codebase and you’re looking to trace a particular thing but you’re not certain where it is or what it’ll be called. While they’re not always fast, these tools can at least work in the background, which allows the developer to get on with something else while the agent trawls logs, code, and configuration to find and explain a fuzzily-defined thing.

Recently, I had a moment which I thought might be such an instance… but it didn’t turn out quite the way I expected. Here’s the story4:

The broken dev env

I’d been drafted into an established and ongoing project to provide more hands, following a coworker’s departure last week. This project touches parts of our (sprawling, microservices-based) infrastructure that I hadn’t looked at before, so there was a lot I didn’t yet know.

I picked an issue that had belonged to my former colleague that QA had rejected and set out to retrace their steps: to replicate the problem that the QA engineers had identified and in doing so learn more about the underlying process. I spun up my development environment and tried to follow the steps.

A popup error message saying "Oops, something went wrong. Please try again."
The process failed… but much earlier than QA had said it would. Clearly my development environment was at fault, or at least not representative of their setup.

But I couldn’t even get as far as their problem before my frontend barfed out an error message. Sigh! Probably there’s some configuration I’ve missed somewhere in the myriad microservices, or else the data I’m testing with isn’t a fair reflection on what they’re doing as-standard.

Following some staff changes, I have no teammates on this side of the Atlantic who could help me decipher this: a “quick question on Slack” wouldn’t solve this one until hours from now. It was time to start debugging!

But… maybe Claude could help? It’s got access to almost all the same code, logs, tools and browser windows I do. I started typing:

✨ What’s up next, Dan?

In my development environment for https://service.dev/asset/new, when I click “Save”, I see the error “Oops, something went wrong.” Why?

Context is key

It’s quite possible that Claude would have gone away, had a “think”, done some tests, and then come back to me with a believable answer. It might even have been correct, and I’d have been able to short-cut my way back to productivity (and I’d have time to make a mug of coffee and finish reading my emails while it did so). Then, I’d just have to check that it was right, make the change, and get on with things.

But I realised that it’d probably work faster (and cheaper, and using less energy) if it had slightly more context from the get-go, so I elaborated. The first thing I’d want to know if I were debugging this is what was actually happening behind the scenes. I dipped into my browser’s Network debugger and extracted the relevant output, adding it to my prompt:

✨ What’s up next, Dan?

In my development environment for https://service.dev/asset/new, when I click “Save”, I see the error “Oops, something went wrong.” Why? The payload POSTed to the server is { content: 'test1', audience: [ 'one' ], status: 'draft' } and the response is a HTTP 500 with the following stack trace: [pasted 94 lines]

That’s more like it, now I could let it get on with its work. But wait…

Rubberducking

There’s a concept in computer programming called “rubberducking”. The name comes from an anecdote in The Pragmatic Programmer about a developer who, when stuck on a problem, would explain the code line-by-line to a rubber duck. The thinking is that talking-through a problem, even to someone (or something) who doesn’t understand it, can lead the speaker to insights they were otherwise missing.

I’ve done it myself many, many times: recruiting a convenient colleague or friend and talking them through the technical problem I was faced with, and inviting them to ask me to go into greater detail if I seemed to be skimming over anything, and I can promise that it can work.

A witch is happy and proud of her invention - a rubber duck. She explains to her friend: I just figured that formulating my questions out loud helps me to solve them, and finally that's all I needed.
I discovered Mini Fantasy Theater recently and loved this episode from its backlog.

The panel above is part of a series in which a sorceress called Cepper is coerced by her university into using Avian Intelligence (“AI”) – a robotic parrot5 that her headmaster insists is the future of magic. She experiments with it, finds it occasionally useful but more-often frustrating, attempts to implement her own local version but finds that troublesome in different ways, and eventually settles on using an inanimate rubber duck instead. I get it, Cepper!

Let’s put that distraction aside for a moment and get back to the story of my broken development environment.

Clues in the stack trace

The top entry in the stack trace was an unsuccessful call to a different microservice, so I figured I’d pull its logs too, in order to further help point the AI in the right direction6:

✨ What’s up next, Dan?

In my development environment for https://service.dev/asset/new, when I click “Save”, I see the error “Oops, something went wrong.” Why? The payload POSTed to the server is { content: 'test1', audience: [ 'one' ], status: 'draft' } and the response is a HTTP 500 with the following stack trace: [pasted 94 lines] The stack trace suggests that a call is being made to the dojo backend service, where the following error log looks relevant: [pasted 9 lines]

I haven’t tried it, but I’m pretty confident that the LLM, after much number-crunching and a little warming-up of some datacentre somewhere, would get to the answer. But again, I found something niggling inside me: the second-from-top line in the dojo logs suggested that a connection was being made to a further, deeper microservice.

I should pull its logs too, I figured.

The final puzzle piece

As an aide mémoire – in a way I’ve taken to doing when taking notes or when talking to AI – I first typed what I was going to provide. This is useful if, for example, somebody distracts me at a key moment: it means I’ve got a jumping-off point predefined by my past self:

✨ What’s up next, Dan?

In my development environment for https://service.dev/asset/new, when I click “Save”, I see the error “Oops, something went wrong.” Why? The payload POSTed to the server is { content: 'test1', audience: [ 'one' ], status: 'draft' } and the response is a HTTP 500 with the following stack trace: [pasted 94 lines] The stack trace suggests that a call is being made to the dojo backend service, where the following error log looks relevant: [pasted 9 lines]. It’s calling osiris, which says:

I dipped into the directory for osiris, and before I even got to the logs I spotted a problem: that microservice was on an old feature branch. How odd! I switched to the main branch and… everything started working.
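
The failure mode here (one checkout quietly sitting on a stale branch) is the kind of thing a small script can surface before any debugging starts. What follows is only a minimal sketch, assuming each microservice is an ordinary Git checkout in its own subdirectory of one parent folder, and that main is the branch they should all be on. The function names are hypothetical, and it reads .git/HEAD directly rather than shelling out to git, so it will skip worktrees and submodules (where .git is a file rather than a directory):

```python
from __future__ import annotations

from pathlib import Path

# In a normal checkout, .git/HEAD holds "ref: refs/heads/<branch>".
# With a detached HEAD it holds a bare commit hash instead.
REF_PREFIX = "ref: refs/heads/"


def current_branch(repo: Path) -> str | None:
    """Return the checked-out branch name, or None if it can't be determined.

    None covers both "not an ordinary Git checkout" and "detached HEAD".
    """
    head = repo / ".git" / "HEAD"
    if not head.is_file():
        return None
    content = head.read_text().strip()
    if content.startswith(REF_PREFIX):
        return content[len(REF_PREFIX):]
    return None


def services_off_branch(parent: Path, expected: str = "main") -> dict[str, str]:
    """Map directory name -> branch for every checkout not on `expected`."""
    off_branch: dict[str, str] = {}
    for repo in sorted(p for p in parent.iterdir() if p.is_dir()):
        if not (repo / ".git" / "HEAD").is_file():
            continue  # not an ordinary Git checkout; ignore it
        branch = current_branch(repo)
        label = branch if branch is not None else "(detached HEAD)"
        if label != expected:
            off_branch[repo.name] = label
    return off_branch
```

Calling services_off_branch(Path("~/code").expanduser()) (the path is a placeholder) returns a dictionary of every checkout that isn't on main, so a service left on a feature branch shows up by name alongside the branch it is actually on.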

The entire event took only a few minutes. I’d find some information, type it into Claude’s input field, realise that more information could be valuable, and repeat.

By the time I’d finished describing the problem, I’d discovered the solution. That’s the essence of successful rubberducking. I didn’t need the AI at all. All I needed was the illusion of something that might be able to help if I just talked through what I was thinking.

I don’t know what the moral is, here.

I wonder if I’d have been as effective had I just typed into my text editor. I suppose I would have, but I wonder if I’d have been motivated to do so in the first place? I’ve tried rubberducking before by talking to an imaginary person, but I’ve never tried typing to one7; maybe I should start?

Footnotes

1 I’m pretty sure every engineering department nowadays has its rabid fanboys, but I’m pleased that for the most part my colleagues take a more-pragmatic and realistic outlook: balancing the potential benefits of LLM-assisted coding with its many shortfalls, downsides, and risks.

2 My experience of vibe-coding in a nutshell: LLMs are great at knocking out the easy 80% of any engineering problem, but often in a way that makes the remaining 20% – already the hard part – harder than it would have been if a human had done the first 80% (especially if it’s the same human and they can bring their learnings with them)… and I’m definitely not the only one who’s found that. I also suspect that the unsatisfying and unimproving task of shepherding a flock of agents to write code and then casually reviewing it is not significantly more-productive (which research backs up) and results in a significantly increased regression rate… but I’m ready to be proven wrong when more studies come out. In short: I continue to think that GenAI isn’t useless, but neither is it necessarily always worthwhile.

3 So long as what you’ve got them summarising is something you can later verify!

4 I’ve taken huge liberties with the strict factual accuracy to make this more-readable as well as to not-expose things I probably oughtn’t. So before you swoop in to criticise my prompt-fu (not that I asked you, but I know there’s somebody out there who’s thinking about doing this right now), please note that none of the text on this page is what I actually wrote to the AI; it’s a figurative example.

5 A literal stochastic parrot, one might say!

6 I’d had an experience just the previous week in which it’d gone off on completely the wrong track, attempting to change code in order to “fix” what was ultimately a configuration or data problem, and so I thought it might be useful to give it some rails to follow, to start with.

7 Except insofar as this AI agent is an “imaginary person”, which is possibly already a step-too-far in implying personhood for my liking!


The machines are fine. I’m worried about us.

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Unlike Alice, who spent the year reading papers with a pencil in hand, scribbling notes in the margins, getting confused, re-reading, looking things up, and slowly assembling a working understanding of her corner of the field, Bob has been using an AI agent. When his supervisor sent him a paper to read, Bob asked the agent to summarize it. When he needed to understand a new statistical method, he asked the agent to explain it. When his Python code broke, the agent debugged it. When the agent’s fix introduced a new bug, it debugged that too. When it came time to write the paper, the agent wrote it. Bob’s weekly updates to his supervisor were indistinguishable from Alice’s. The questions were similar. The progress was similar. The trajectory, from the outside, was identical.

Here’s where it gets interesting. If you are an administrator, a funding body, a hiring committee, or a metrics-obsessed department head, Alice and Bob had the same year. One paper each. One set of minor revisions each. One solid contribution to the literature each. By every quantitative measure that the modern academy uses to assess the worth of a scientist, they are interchangeable. We have built an entire evaluation system around counting things that can be counted, and it turns out that what actually matters is the one thing that can’t be.

The strange thing is that we already know this. We have always known this. Every physics textbook ever written comes with exercises at the end of each chapter, and every physics professor who has ever stood in front of a lecture hall has said the same thing: you cannot learn physics by watching someone else do it. You have to pick up the pencil. You have to attempt the problem. You have to get it wrong, sit with the wrongness, and figure out where your reasoning broke. Reading the solution manual and nodding along feels like understanding. It is not understanding. Every student who has tried to coast through a problem set by reading the solutions and then bombed the exam knows this in their bones. We have centuries of accumulated pedagogical wisdom telling us that the attempt, including the failed attempt, is where the learning lives. And yet, somehow, when it comes to AI agents, we’ve collectively decided that maybe this time it’s different. That maybe nodding at Claude’s output is a substitute for doing the calculation yourself. It isn’t. We knew that before LLMs existed. We seem to have forgotten it the moment they became convenient.

Centuries of pedagogy, defeated by a chat window.

This piece by Minas Karamanis is excellent throughout, and if you’ve got the time to read it then you should. He’s a physics postdoc, and this post comes from his experience in his own field, but I feel that the concerns he raises are more-widely valid, too.

In my field – of software engineering – I have similar concerns.

Let’s accept for a moment that an LLM significantly improves the useful output of a senior software engineer (which is very-definitely disputed, especially for the “10x” level of claims we often hear, but let’s just take it as-read for now). I’ve experimented with LLM-supported development for years, in various capacities, and it certainly sometimes feels like they do (although it sometimes also feels like they have the opposite effect!). But if it’s true, then yes: an experienced senior software engineer could conceivably increase their work performance by shepherding a flock of agents through a variety of development tasks, “supervising” them and checking their work, getting them back on-course when they make mistakes, approving or rejecting their output, and stepping in to manually fix things where the machines fail.

In this role, the engineer acts more like an engineering team lead, bringing their broad domain experience to maximise the output of those they manage. Except who they manage is… AI.

Again, let’s just accept all of the above for the sake of argument. If that’s all true… how do we make new senior developers?

Junior developers can use LLMs too. And those LLMs will make mistakes that the junior developer won’t catch, because the kinds of mistakes LLMs make are often hard to spot and require significant experience to identify. But if they’re encouraged to use LLMs rather than making mistakes by hand and learning from them – to keep up, for example, or to meet corporate policies – then these juniors will never gain the essential experience they’ll one day need. They’ll be disenfranchised of the opportunity to grow and learn.

It’s yet to be proven that more-sophisticated models will “solve” this problem, but my understanding is that issues like hallucination are fundamentally unsolvable: you might get fewer hallucinations in a better model, but that just means that those hallucinations that slip through will be better-concealed and even harder to identify in code review or happy-path testing.

Maybe – maybe – the trajectory of GPTs is infinite, and they’ll keep getting “smarter” to the point at which this doesn’t matter: programming genuinely will become a natural language exercise, and nobody will need to write or understand code at all. In this possible reality, the LLMs will eventually develop entire new programming languages to best support their work, and humans will simply express ideas and provide feedback on the outputs. But I’m very sceptical of that prediction: it’s my belief that the mechanisms by which LLMs work have a fundamental ceiling – a capped level of sophistication that can be approached but never exceeded. And sure, maybe some other, different approach to AI might not have this limitation, but if so then we haven’t invented it yet.

Which suggests that we will always need experienced engineers to shepherd our AIs. Which brings us back to the fundamental question: if everybody uses AI to code, how do we make new senior developers?

I have other concerns about AI too, of course, some of which I’ve written about. But this one’s top-of-mind today, thanks to Minas’ excellent article. Go read it to learn more about how physics research faces a similar threat… and, perhaps, consider how your own field might need to face this particular challenge.

People are not friction

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

The Gell-Mann Amnesia Effect of AI is a pretty well documented phenomenon:

The Gell-Mann amnesia effect is a cognitive bias describing the tendency of individuals to critically assess media reports in a domain they are knowledgeable about, yet continue to trust reporting in other areas despite recognizing similar potential inaccuracies.

Summarizing, AI sounds like an incredible genius synthesizing the world’s knowledge right up until you ask it about the thing you know about, then it’s an idiot. Even knowing about this phenomenon and having experienced it countless times, LLMs have an intoxicating quality to them.

I remember one time, maybe in the mid-1990s, when I saw a shopping channel (remember those? oh god, they’re still a thing, aren’t they?) where the host was trying to sell a personal computer. And… clearly, they knew absolutely nothing about it. They kept hitting on the same two or three talking points they’d been given (“mention the quad-speed CD-ROM drive!”) and fumbling their way through, and it gave me a revelation:

I knew enough about computers that I could see that the presenter was bullshitting their way through the segment. But there are plenty of things that I don’t know much about, which are also sold on this same show. Duvets, jewellery, glassware… I’m nowhere near as much an expert on these as I was on PC featuresets. Is there something inherently incomprehensible about computers? No. So it’s reasonable to assume that these salespeople probably know equally-little about everything they sell, it’s just that I don’t have the knowledge base to be able to see that.

That’s what GenAI often feels like, to me. Having collated all of the publicly-available knowledge it could find into its model doesn’t make it smarter than the smartest humans, it brings it towards probably something slightly-above-the-average in any given subject, depending on the topic. If I ask an LLM about something that I don’t understand well, it often produces highly-believable answers, but if I ask it about something that I’m an expert in, it can come off as a fool.

I’m very interested in how we teach information literacy in this new world of rapidly-generated highly-believable nonsense.

Anyway: Dave’s post doesn’t go in that direction – instead, he’s got some clever thoughts about how the “convenience” of a “good enough” AI-driven solution to any given problem risks us seeing humans as the friction point, which ultimately works against those very humans who are looking to benefit from the technology:

We need experts to share what they know and improve the quality of our work, generated or otherwise. We even need idiots to make sure we can break ideas down into their simplest form that everyone, agents or human, understand. People can have bad attitudes, be shitty, and have wrong opinions… but people are not friction. An LLM may be able to autocorrect its way into a plausible human response, but it’s not people. It doesn’t care if it’s right or wrong.

It’s an easy and worthwhile read.

Reply to: I’m OK being left behind, thanks!

This is a reply to a post published elsewhere. Its content might be duplicated as a traditional comment at the original source.

Terence Eden said:

Many years ago, someone tried to get me into cryptocurrencies. “They’re the future of money!” they said. I replied saying that I’d rather wait until they were more useful, less volatile, easier to use, and utterly reliable.

“You don’t want to get left behind, do you?” They countered.

That struck me as a bizarre sentiment. What is there to be left behind from? If BitCoin (or whatever) is going to liberate us all from economic drudgery, what’s the point of “getting in early”? It’ll still be there tomorrow and I can join the journey whenever it is sensible for me.

100%. If I “get in early” on something, it’s because that thing interests me, not because I’m betting on its future. With a hundred new ideas a day and only one of them “making it”, it’s a fools’ game to try to jump on board every bandwagon that comes along.

With cryptocurrencies, though, I’m fortunate enough to have an even better comeback at the cryptobros that try to shill me whatever made-up currency they’re “investing” in today: I’ve already done better than they ever will, at them.

When Bitcoin first appeared, I took a technical interest in it. I genuinely never anticipated it’d take off (I made the same incorrect guess with MP3s, too!), but I thought it was a fun concept to play about with. The only Bitcoins I ever paid for must’ve been worth an average of 50p each, or so.

I sold my entire wallet of Bitcoins when they hit around £750 each. I know a tulip economy when I see one, I thought. Plus: I was no longer interested in blockchains now I was seeing how they were actually being used: my interest had been entirely in the technology and its applications, not in the actual idea of a currency!

Sure, I kick myself occasionally, given that I later saw the value rise to tens of thousands of pounds each. But hey, I was never in it for the money anyway.

So yeah, I tell cryptobros: I already made a roughly 1,500-fold return on cryptocurrency. And no, I’m not buying any cryptocurrencies any more. Whatever they think “getting in early” was, they’re wrong, because I was there years ahead of them and I wasn’t even doing it to “get in early”; I did it because it was interesting. And honestly, isn’t that a better story to be able to tell?

I feel the same way about the current crop of AI tools. I’ve tried a bunch of them. Some are good. Most are a bit shit. Few are useful to me as they are now.

If this tech is as amazing as you say it is, I’ll be able to pick it up and become productive on a timescale of my choosing not yours.

Yup, that’s the attitude I’m taking.

I play with new AI technologies, sometimes. I don’t do it because I’m afraid of being left behind because – as you say – if a technology is transformative, we’ll all get to catch up eventually.

Do you think that people who had smartphones first are benefitting today because they “got in early” on something that later became mainstream?

Of course they’re not. Their experience is eventually exactly the same as everybody else’s, just like it was for everybody who “got in early” on hype trains whose final station came early, like Compuserve GO-words, WAP, Beenz.com, WebTV, the CueCat, m-Commerce, HD-DVD, the JooJoo, or Google+.