The Machine Did Exactly What We Asked Of It
In which the failure happens long before the machine gets involved.
The word "agent" gets used loosely now. Sometimes it means a system with real leverage: it browses, executes code, and takes actions across other systems. Sometimes it just means handing a task to a chatbot and expecting it to come back done.
I've been fumbling with both ends of that spectrum for months. Claude Code sessions that drift the moment intent gets fuzzy. Research tasks that return fluent prose aimed at the wrong problem.
Software that acts. Software that only complies.
The failures feel different in texture but come from the same place. I approach each task with the confidence of someone who has managed people and projects for years, and the work often comes back wrong. Or more precisely: pointed slightly elsewhere, solving a problem adjacent to mine.
The output doesn't fail the way a junior employee's work fails, where you can see the effort and trace the misunderstanding. It fails in a cleaner, more disorienting way. Polished, assured, and aimed somewhere I never intended.
There's a specific quality to that moment. The document opens. The formatting is clean. The first sentence sounds right. And then, slowly, the feeling that something is off, like walking into a room and realizing the furniture has been rearranged. Everything is where it should be. None of it is yours.
When this happens, someone on my team has a line I've come to deeply appreciate:
"The machine did exactly what we asked of it."
It's funny because it's true, and uncomfortable because of what it implies. The failure wasn't in the execution. It happened much earlier, in the handoff, in the space between what I meant and what I managed to say.
If you've spent time in organizations, the pattern is familiar. A project drifts off course, and when you reconstruct what happened, the breakdown is almost never technical. Someone had a goal in their head that never made it into words. Context that felt obvious went unshared. The work ran for weeks without a checkpoint, and by the time anyone looked closely, the cost of correction had compounded into the cost of starting over.
No one maintained the shape of the work while it was happening. That's why it fell apart.
I've watched this pattern play out in conference rooms and Slack channels for years. What I didn't expect was how clearly AI agents would reproduce it, or how quickly.
When I hand off a task to an agent and the output drifts, moving cleanly in a direction I never intended, the drift becomes visible in minutes rather than weeks. There's no social buffer, no status update that sounds productive, no reassuring meeting where everyone nods along. Just the work itself, returned to me, making the gap between intention and instruction impossible to ignore.
The machine did exactly what we asked of it.
The question is why we asked for that.
Many years ago, I sat through a presentation at work that everyone praised afterward. The speaker was talented, the kind of person who makes complexity feel manageable just by the way they hold a room. They painted a vision, told a story, and built momentum. People left energized.
But in the row behind me, a software engineer was muttering. Not heckling, just a low, running commentary whose words I could only imagine, not hear. While everyone else was swept up in the performance, this person was looking underneath it, at real or perceived issues the presentation had smoothed over.
Meanwhile, I was nodding along with everyone else. I remember that now more clearly than anything the speaker actually said.
I think about that engineer often. Not because I enjoy cynicism, but because I've learned to trust people who can't be impressed by delivery alone. The ones who sit with the substance even when the presentation is dazzling.
Confident speech in organizations hides problems. It smooths over uncertainty by making ambiguity feel temporary, already handled. A vague strategy sounds visionary if delivered well. Wandering projects can feel productive if someone narrates them with enough certainty. Most of us learn to be careful about voicing doubt when the room is nodding along.
The social cost of skepticism is real. Questioning a charismatic leader, especially in public, carries risk. So problems hide behind eloquence, sheltering in the gap between how persuasive something sounds and how well it actually holds together.
With people, the smoothing usually takes the form of promise. I've got it. I'll figure it out. I'll circle back once I understand more. Ambiguity is deferred, not resolved. The work keeps moving on the assumption that someone will correct course later.
With agents, the smoothing happens differently. There is no promise, only compliance. Whatever you failed to specify hardens immediately into output, whether or not it matches your intent.
When I first started using AI agents for meaningful work, I assumed the confidence problem would disappear. There are no egos to protect or reputations on the line. No relationships smoothing over gaps. Just the work, returned without performance.
What I didn't account for was how confidently the work would be presented.
AI agents don't hedge. The prose arrives fluent and grammatically pristine. It sounds like it knows what it's doing even when the substance has drifted completely off target. So the confident speech problem doesn't vanish. It changes form. You're alone with the output, just prose that sounds equally certain whether it's right or wrong.
The question I keep sitting with is whether that makes problems easier to see. Or whether I just think it does because I want that to be true.
I don't have a clean answer. But I've noticed something in my own habits.
When a colleague presents work confidently, I often have to work up a small measure of courage to push back. There's a relationship to consider and tone to manage. Even when I see gaps, I'm calculating how to name them without sounding like I'm attacking the person.
When an agent presents work confidently, that calculation disappears. I can be the cynic in the back row without social cost. I can look at fluent prose and ask, flatly, is this correct? I can say "the machine did exactly what we asked of it" and mean it as an indictment of my own instructions. Whatever structure exists has to live in the instructions themselves.
The barrier to questioning those instructions, and their results, is now much lower. Performance can't dazzle me into silence.
But that's only useful if I've actually developed the disposition. The engineer in the back row wasn't doing something anyone could do on instinct. They'd trained themselves to look underneath. Working with agents seems to require that training every time, because the system will never pause and say it thinks something went wrong.
A few weeks after a particularly expensive detour, I tried again with a similar task. This time I wrote out exactly what I needed before opening the tool. I named the question. I explained why it mattered and how I'd use the output. I listed what I didn't want. I broke the work into parts and asked for the first before the second.
What struck me afterward wasn't the improvement in output. It was how rarely I bring that level of preparation to human collaboration. The same discipline that made the agent useful (clear outcomes, context supplied up front, work broken into checkable pieces, feedback arriving while the work is still in motion) is what makes delegation work anywhere. I know this.
A lot of what I used to credit as common sense in the people I worked with was them compensating for instructions I never finished giving. Knowing and practicing are different, and the gap between the two is where coordination failures live.
Agents compress the feedback loop on that gap. In human teams, drift is constant but rarely linear. People correct in small ways: a pause, a raised eyebrow, a clarifying question, a moment of hesitation that pulls the work back toward center.
With an agent, there is no self-correction. The first instruction sets the direction, and the work travels cleanly on that vector until it hits something solid. Without wobble.
Agents don't give you weeks to discover that your initial direction was vague. You get minutes. They don't let you blame the drift on interpretation.
The machine did exactly what you asked.
This compression can feel like a technology problem. The agent isn't good enough, isn't smart enough, doesn't understand what we really meant. But the longer I work with these tools, the more I think the compression is the feature. It shows me, faster than any human collaboration could, exactly where my coordination habits break down.
Maybe that's the real mirror. Not just that agents expose coordination failures, but that they reveal how much social performance has been obscuring those failures all along. The problems were always there. They were hiding behind charisma, behind the discomfort of doubt, behind smooth delivery.
The machine did exactly what we asked of it. Now we have to sit with what we asked for.
The interfaces we have for this work don't help yet. A command line built for issuing instructions is a poor place to manage anything that unfolds over time. You type a thing, you wait, you're told something, you react. This model encourages one-shot delegation and delayed judgment.
Long-running work needs different rhythms. Places to pause and check orientation. Moments where drift can be caught early, while correction is still cheap. Attention on the shape of the work, not just the output.
I don't know what those rhythms look like yet for agent collaboration. We're early. The tools are changing faster than the practices around them. But I suspect the answer has less to do with better prompts and more to do with steadier attention. The kind that treats delegation as something you stay with rather than something you hand off.
It's tempting to turn this into a question about whether the tools will improve. The more interesting question isn't when or how that happens, but whether we'll get better at the same speed. Whether the compression these tools create in our work, the speed at which they surface coordination failures, will teach us something about how we work. Or whether we'll find new ways to look away.
For now, the machine does exactly what I ask of it. The cursor blinks, the output waits, and the gap between what I meant and what I said is exactly as wide as I left it.
Footnotes
I tell myself I admire the engineer in the back row. The truth is that I'm rarely that person, or not as often as I'd prefer. The feeling of being carried by a persuasive narrative can work on me as much as anyone. There's relief in not having to be the person who interrupts momentum.
If agents make skepticism cheaper, that doesn't mean we'll suddenly all become braver. It just means the cost structure changed. Habits don't.
| Published | 4 January 2026 |
|---|---|
| Reading time | 11 min |
| Tags | engineering, systems thinking, ai |
