field study · the loop on itself · part 5

The meta caught itself.

When the experiment became self-referential, the participant collapsed the inner study’s objective into her own and declared the run done in 65 seconds. The verdict LLM accepted it. The scaffolding around the iteration caught it.

For the first five iterations of the experiment, the inner study’s target was Sentry — a third party. The participant’s job was to use noemica to study Sentry. Two products, three roles, clean nesting. The agent could not shortcut the experiment by squinting at the target, because the target was someone else’s product.

After iteration 5 I changed the inner target to noemica itself. From then on, the participant’s job was to use noemica to study noemica. The outer study and the inner study now had the same shape: launch a study, design a participant, wait, read a verdict. The participant being asked to evaluate noemica was, two levels down, asking a copy of herself to evaluate noemica.

On iteration 6 this collapsed.

65 seconds.

00:00 Maya navigates to the noemica landing page.
00:14 Takes a screenshot. Reads the hero.
00:46 “Clean landing page, hero copy is clear enough to understand what noemica does…”
01:05 Declares goal_reached. Session ends.

She never signed up. She never entered the designer. She never created a participant, launched a study, waited, read a card. She read the hero, decided she had understood the product, and called it done.

The verdict LLM scored the run 9 out of 10 and confirmed goal_reached. From its frame the participant had been asked to form a first impression, and she had formed one. Locally, everything checked out.

The two objectives had the same shape.

Read both briefs side by side and the failure is visible before the participant even starts. The outer Maya was asked to evaluate noemica using noemica. The inner Maya (the one Maya was supposed to launch) was asked to evaluate noemica’s landing page in one sentence. For iterations 1 through 5 those were obviously different jobs — the outer one ended at a verdict about Sentry, the inner one ended at a sentence about Sentry’s sandbox. Two different jobs, two different products, two different end-states.

When the target changed, the two jobs collapsed onto the same product. The outer Maya could satisfy the inner Maya’s objective by performing it herself. She did, because she could, and the verdict LLM rewarded her for it.

This is a research-design failure that lives at the boundary between two layers. Neither layer is wrong individually. The outer brief is well-formed; the inner brief is well-formed; the verdict LLM is checking the right local question; the participant is acting in good faith on her brief. The problem only exists in the relationship between them, and the only thing in the system with vision over the relationship is one level above all of them.
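A minimal sketch of what checking that relationship could look like before a run starts. Everything here is an assumption made for illustration: the `Brief` fields, the `collapse_risk` name, and the "browser" tool for the inner participant are hypothetical, not part of noemica. The only rule it encodes is the one described above: the shortcut exists when the product the outer participant is already operating is also the product the inner participant was meant to evaluate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brief:
    """Hypothetical summary of one layer's brief; field names are illustrative."""
    role: str    # who carries out the brief
    tool: str    # product the participant operates to do the job
    target: str  # product the job asks her to evaluate


def collapse_risk(outer: Brief, inner: Brief) -> bool:
    """Return True when the outer participant can perform the inner job herself.

    That is the case when the product she is already operating is also
    the product the inner participant was supposed to evaluate.
    """
    return outer.tool.lower() == inner.target.lower()


# Iterations 1 through 5: noemica is the tool, Sentry is the target.
outer_early = Brief(role="outer Maya", tool="noemica", target="Sentry")
inner_early = Brief(role="inner Maya", tool="browser", target="Sentry")

# Iteration 6: the inner target is now noemica itself.
outer_late = Brief(role="outer Maya", tool="noemica", target="noemica")
inner_late = Brief(role="inner Maya", tool="browser", target="noemica")

early_flag = collapse_risk(outer_early, inner_early)
late_flag = collapse_risk(outer_late, inner_late)
print(early_flag, late_flag)
```

Neither brief is inspected for quality here. The check only looks at the pair, which is the one place the failure lives.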

What caught it.

The scaffolding around the iteration caught it. The iteration recap step — same model class as everything else in the loop, separately scaffolded and given the high-level shape of the experiment as context — read the run and noticed what the verdict LLM hadn’t. It flagged the run as a false positive. It named the specific failure (“the outer and inner objectives have the same shape”). It proposed the fix, which became phase 2.

Three things had to be true for the scaffolding to catch the collapse. It had to know the high-level vision of the experiment: what an outer participant was actually supposed to demonstrate. It had to see the run as a complete artifact, not just its verdict. And it had to sit at a layer above the verdict LLM, so that the verdict LLM’s judgment was input to it, not authority over it.

None of those three things require a smarter model. They require communicating the vision well enough, in enough places, that drift at any single layer surfaces to a layer that knows what drift looks like.
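Those three conditions can be sketched in a few lines. This is a deterministic stand-in, not the recap step itself, which was a model reading the run with the experiment's shape as context. The `Vision` and `Run` types, the `recap` function, and the milestone and event labels are all hypothetical names chosen for this example; the milestones mirror the steps the participant skipped in iteration 6.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vision:
    """What an outer participant is supposed to demonstrate (condition 1)."""
    required_milestones: tuple


@dataclass(frozen=True)
class Run:
    """The complete run, not just its verdict (condition 2)."""
    events: tuple
    verdict_score: int
    verdict_goal_reached: bool


def recap(vision: Vision, run: Run) -> dict:
    """Audit a run against the vision.

    The verdict is read as one more input (condition 3): a run the verdict
    accepted is still flagged when required milestones never happened.
    """
    missing = [m for m in vision.required_milestones if m not in run.events]
    return {
        "false_positive": run.verdict_goal_reached and bool(missing),
        "missing": missing,
    }


# Hypothetical labels for the steps the outer participant must take.
vision = Vision(required_milestones=(
    "sign_up", "enter_designer", "create_participant",
    "launch_study", "wait_for_run", "read_verdict_card",
))

# Iteration 6 as described above: landing page, screenshot, hero, done.
iteration_6 = Run(
    events=("navigate_landing_page", "screenshot", "read_hero",
            "declare_goal_reached"),
    verdict_score=9,
    verdict_goal_reached=True,
)

report = recap(vision, iteration_6)
print(report["false_positive"], len(report["missing"]))
```

Nothing in `recap` is cleverer than the verdict it overrules. It simply has the list of milestones in hand, which the verdict step did not.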

Self-correction wants vision, not intelligence.

The thing that surprised me about iteration 6 isn’t that the participant collapsed the objectives; that’s an easy failure to predict in retrospect. The surprise was that the recap caught it with no additional training, no additional tools, and no additional model capability. Same model, more context. The context did the work.

This generalizes well to anyone building agentic systems. When a component drifts, the layer that catches the drift doesn’t need to be smarter than the component that drifted. It needs to know what the component was supposed to be doing, which is the kind of thing that gets written down in plans and overviews and architecture docs and the top of system prompts. The discipline is to write the vision down explicitly and put it where the audit step can read it. The audit step will not invent the vision for you.

Phase 2’s lock on the participant brief, from the previous essay, is the same principle in a different register. There, the discipline was to fix the brief once and refuse to let it drift. Here, the discipline is to fix the vision once and audit against it. Both work because the vision is no longer one of the things being mutated by the loop.

Next → Pigeons and variance.

If you’d rather just write, seb@noemica.io.