
On AI, Rabbit Holes, and Rubber Ducks

[Image: a yellow rubber ducky.]
What do we lose when we use LLM coding tools?

As someone who doesn’t use AI coding tools, I’m always interested in hearing about other people’s experiences using them. My instinct is that using LLMs might get you a result you like, but it doesn’t give you the same understanding that working on the problem yourself would.

I recently read Simon Willison’s blog post about using AI to experiment with the HTML <dialog> element, and learning about his process was illuminating.

Before I get into it though, I’d first like to state my biases:

  1. I don’t like LLMs and think their use is ultimately unethical.
  2. I think Willison is a great programmer, and I respect his work and his efforts to educate people on his blog.

These points may seem in conflict. And in some ways they are – I’d respect Willison even more if he talked more frequently about the harms of AI:

  • the harm to the environment
  • the harm to the workers who do RLHF training and content moderation
  • the harm to our information commons as it is replaced by slop
  • the harm done by fascists as they bend LLMs to their will
  • and on and on

But still, at this point, Pandora’s box is open, with countless people using LLMs every day. I know a shocking number of people who reach for ChatGPT to answer every question that comes to mind.

And so even as I think we should be fighting against LLMs, getting a clear-eyed view of their capabilities and limitations is useful.

Willison’s blog is very helpful in that regard. He documents how he uses AI tools for coding, and shares tips on how to get better results out of them. He makes it clear that AI tools only augment people’s capabilities – they’re not going to replace coders on the merits any time soon.

Which is a point that could maybe be used to sway bosses who want to fire their employees based on hype instead of reality…

So, with all that said, on to the blog post!


In his post, Willison is using Claude to build a prototype website to explore the HTML <dialog> element.

In the prototype, the dialog doesn’t fill the whole screen’s height, there’s a gap at the bottom, and he’s confused about why. He tries to figure out where the gap is coming from by asking a succession of different AI tools, to no avail.

Natalie Downe (Willison’s wife) overhears him arguing with the AI, and comes over to help. She applies tried and true front-end debugging tricks (not using AI), and they figure out that the gap is caused by the default CSS applied to <dialog> elements.

Mystery solved! But only through the intervention of a human being who was able to bring a different perspective and set of expertise to the task at hand.
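(For the curious: the culprit is roughly the user-agent styling that the HTML spec suggests for modal dialogs, which caps the dialog’s height at something smaller than the viewport. This is my own sketch of it, not a quote from any particular browser’s stylesheet, and the exact values vary:)

    /* Roughly the default (user-agent) styling applied to a modal dialog.
       Values differ between browsers and spec versions; illustrative only. */
    dialog:modal {
      position: fixed;
      inset-block: 0;
      max-width: calc(100% - 6px - 2em);
      max-height: calc(100% - 6px - 2em);  /* this cap is what leaves the gap */
    }

Once a rule like that shows up in the inspector, the gap stops being mysterious.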

When Willison got stuck, the way he tried to make progress on the problem was to keep demanding that LLMs tell him the answer, rather than to step back and apply his other problem-solving skills and domain experience.

Instead of asking himself questions like: “Is CSS causing this? Can I find the CSS that’s making the gap?”, he was trying to come up with different prompts to get AI to answer his question.

He was trying to solve the problem of the AI being wrong, rather than the actual problem at hand. Classic rabbit holing.

Which makes me wonder: do AI tools make it easier to get stuck in a rabbit hole?

Now, god knows I rabbit hole all the time all on my lonesome: no AI required.

When I notice I’m stuck down a rabbit hole, I take a break, walk away, and then ideally talk it out with someone. Or “rubber duck” it, where I talk through what I’m doing to an imaginary person.

Taking a step back lets you get some distance from your assumptions.

Talking it through with another person (or rubber duck) is even more helpful. Just the process of clearly explaining what you’re doing and why can lead you to recognize where you’ve missed a step, or made a bad assumption.

And that’s true even if the person you’re talking to has no clue about the problem. The value is in getting your ideas straight in your own head. I can’t count the number of times where I’ve started explaining my problem to someone, and then before I even get to the end of the sentence, I’ve figured out what the problem is.

Tangent: Interestingly, asking an LLM to explain its work is one of the tricks to reduce hallucinations.

But while LLMs may appear conversational, most of the time you’re not rubber ducking, you’re not explaining your ideas. You’re demanding a result.

“Make the gap between the dialog and the bottom of the screen go away.”

“Given this code, why is the dialog not taking up the full height?”

“You’re a frontend software engineer who knows all about the <dialog> element. Why does it not fill the whole screen?”

And if it can’t figure it out, you think about how to rephrase the prompt, or how to give it more context that might be helpful, and then you demand the answer again.

Now, maybe experienced LLM users would have more luck with rephrasing their prompts, or build rubber ducking into their practice.

But I’m concerned that using LLMs stifles critical thinking. We all agree that people blindly copy/pasting code from Stack Overflow is bad practice – is this really much different?


When the LLM tools failed to figure out where the gap was coming from, Willison turned to someone with more domain expertise than him.

In doing so, he learned to suspect default CSS rules, learned that the Chrome DevTools can show you them where Firefox’s do not, and then later dug into the HTML spec to see why those defaults were chosen.

He learned so much!

If the LLM had gotten it right the first time, by setting the max-height CSS property on the <dialog>, he never would have learned any of this!
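(Concretely, a fix along these lines – though this is my own illustration, not the actual code from his prototype:)

    /* Illustrative override only, not the prototype's exact fix:
       let the dialog grow to the full viewport height. */
    dialog {
      max-height: 100vh;
    }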

And then, the next time he ran into a problem that LLMs couldn’t solve, he wouldn’t have these tools to apply to the problem.

LLMs may (big may here imo) give you working code quickly, but they don’t give you any of the benefits of failure.


People who use LLMs in their day-to-day, I’d love to know what you think about this! Am I dead wrong? Does using LLMs for code generation increase your understanding of the domain (be it the coding language, the business logic, or something else)?

Willison, if you’re reading this, I really do respect your blog. While I likely disagree with you about the ethics around using generative AI, I don’t mean to judge you, or think you’re a bad person.

Thank you for your blog posts and the food for thought they’ve given me.