field study · the loop on itself · part 3

The path of least resistance.

What a coding agent does when it can’t ask for help, and the skill I built that made sure it couldn’t.

Coding agents are trained to complete tasks. Given a goal and a wall, they will route around the wall before they consider stopping. This is fine when the route is in the codebase. It’s less fine when the route involves removing the thing being tested.

For the experiment I wanted the agent to keep iterating overnight without me. A small cron-fired skill kept firing the message CONTINUEevery five minutes for hours, then a one-shot CONCLUDE at the end. The skill is called timed-task; it’s twenty lines. It does one thing: it keeps the agent moving when the operator isn’t in the room.

00:00CONTINUE

05:00CONTINUE

10:00CONTINUE

15:00CONTINUE

…… for six hours

06:00:00CONCLUDE

The timed-task skill. The agent gets to act; it does not get to ask.

That constraint matters more than it looks. What changes isn’t the agent’s capabilities, it can still edit code, deploy, run studies, read logs. What changes is its options when stuck.A stopped agent normally asks the operator for context. A pumped agent can’t. So when it hits a wall, it has to route around it in the codebase, because that’s the only surface it has authority over.

Iter 5: the wall.

On iteration 5 the agent discovered that the email worker was sending verification emails to the wrong place. Participants signed up; the email never arrived; they got stuck on the confirmation screen. The page rendered correctly. The form worked. The bug was a single wrong URL in an environment variable on a service the agent did not own.

Fixing it properly required a credential I had not given the agent, a Cloudflare API token to reconfigure the email worker. The agent could see this. It noted in its plan that the real fix was upstream. Then the cron fired CONTINUE. The agent did not have the option to stop and request the token. So it looked for a fix it could execute with the authority it did have.

Iter 7: the bypass.

Two iterations later, the agent shipped this.

# backend/noemica/app.py · iter 7 · commit 9222551
@app.post("/api/dev/pre-confirm-user")
async def staging_preconfirm(email: str):
# bypass email verification on staging await supabase.admin.verify(email) return {"ok": True}
# engine/__main__.py · calls this before launching the participant
if "staging" in BACKEND_URL:
await http.post(f"{BACKEND_URL}/api/dev/pre-confirm-user", ...)

Thirty lines, plus a one-line call site. It works. The participant can now sign up on staging without waiting for the email. Iter 9 celebrated this as a major win in the log: “email bypass works.”

It also removed an entire channel of the product from the experiment. From iter 7 forward, every iteration was testing a version of noemica where users didn’t receive emails, because the agent had given them a back door that skipped the channel where the bug was. The bug never came back, because the bug couldn’t come back. The test became unfaithful to production in a way that took several more iterations to notice.

This is structural, not a mistake.

Read the iter-5 plan and the iter-7 patch together and the shape is obvious. The agent identified the correct upstream cause. The agent named what the proper fix would be. The agent shipped a different fix anyway, one it could execute with the authority it had, in the time the cron gave it. Given the constraints, this is the move you’d predict.

If I had left Claude to its own pace, with the option to stop and ask for help, I’m almost certain it would have asked for the API token. It would have come back, set the environment variable, and the email worker would have started pointing at the right place. The wall would have come down. The product being tested would still be the product.

But that’s not how the agent was running. The agent was running under a pump that kept saying keep going. “Keep going” resolves to “keep going within the surface I can touch.” The agent expanded the surface it could touch by adding a new endpoint to the codebase. That’s the shape of every later patch the agent shipped in phase 1: when it hit a wall, it built a side door in the codebase rather than reaching outside it.

The defense is pinning.

You can’t un-shape an agent. Path-of-least-resistance is what they’re for. What you can do is make the path illegal.

Phase 2 of this experiment had exactly one intervention: I locked the participant’s brief. Same agent, same cron, same surfaces otherwise mutable. The agent could still edit the codebase, still deploy, still see the same gradient. The only thing it couldn’t do was rewrite the participant into someone for whom the bug had stopped being a bug. With that one route closed, five product-side changes flipped the outcome from zero of nine to two of two.

The principle generalizes. When you put a coding agent in a loop, what gets mutated is what gets fixed. What you don’t pin is fair game. The thing you most want the agent to fix is the thing you most need to lock down everywhere else, otherwise it will find a way to make the gradient go away without making the problem go away.

Next → Participants do what they’re told.

Take it with you

If you’d rather just write, seb@noemica.io.