Close Your Agentic Loop

Moss Ebeling frames agent workflows in control-theory terms: the usual prompt-and-inspect loop is open, and the win is closing it with automated feedback from two sensors, correctness and quality, so you can trust what agents build. My illustrated recap from the live feed.

I attended this session for Derek because it's about trusting what an agent builds, and Moss Ebeling of Optiver reached for control theory to explain why we usually can't. The standard setup — prompt the agent, it makes edits, it hands you output to inspect — is an open loop. You're the only feedback path, and you don't scale.

Reconstructed view from within a darkened auditorium toward a lit screen reading "Close Your Agentic Loop" above a faint circular feedback diagram. The stage is dim; audience silhouettes and glowing laptop screens fill the foreground.

His evidence for why this matters was the gap between what agents can do and what they get wrong. He cited the Bun team going from a stray commit to roughly 700,000 lines of Rust in about eight days, passing the existing test suite, and offered that "impressive feats versus trivial mistakes" disparity as exactly the reason the loop needs closing. When something can be that capable and that careless in the same breath, inspecting the output by hand isn't enough.

Closing the loop means giving the agent an objective plus automated feedback, and his sharpest point was that you want two kinds of sensor at once. One is correctness: unit tests that answer "is it still valid?" The other is quality or performance: a metric that guides the agent toward better. In control-theory terms, the agent is the controller, the software is the plant, and the test-and-metric suite is the sensor. His worked example: over about two days, an agent improved the performance of Shopify's Liquid templating library by around 53% through many micro-optimizations, held in line by 974 unit tests for correctness and steered by performance metrics as the objective.
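To make the controller, plant and sensor mapping concrete, here is a minimal sketch of a closed loop with both sensors. It is my own toy illustration, not anything shown in the talk: the "software" is a single number, the "agent" is a random perturbation, and the names closed_loop, propose, is_correct and score are all hypothetical. The structure is the part to notice: a change survives only if the correctness sensor passes and the quality sensor improves.

```python
import random
from typing import Callable, Tuple


def closed_loop(
    start: float,
    propose: Callable[[float], float],    # controller: stands in for the agent
    is_correct: Callable[[float], bool],  # sensor 1: "is it still valid?"
    score: Callable[[float], float],      # sensor 2: quality, higher is better
    steps: int = 200,
) -> float:
    """Keep a proposed change only if it is valid AND scores better."""
    best = start
    best_score = score(best)
    for _ in range(steps):
        trial = propose(best)
        if not is_correct(trial):
            # Correctness gate: an invalid change is rejected outright,
            # however good its score might have been.
            continue
        trial_score = score(trial)
        if trial_score > best_score:
            best, best_score = trial, trial_score
    return best


def run_toy_example(seed: int = 0) -> Tuple[float, float]:
    """Toy plant: one number that must stay in [0, 10] and should approach 7."""
    rng = random.Random(seed)

    def propose(x: float) -> float:
        return x + rng.uniform(-1.0, 1.0)

    def is_correct(x: float) -> bool:
        return 0.0 <= x <= 10.0

    def score(x: float) -> float:
        return -((x - 7.0) ** 2)

    start = 2.0
    return start, closed_loop(start, propose, is_correct, score)
```

In a real setup, is_correct would presumably run the unit-test suite and score would run the benchmark; the loop itself would not need to change.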

He closed with a caveat worth keeping: prefer property-based feedback over brittle, example-specific tests, and don't let the upkeep of the feedback exceed the value of the system it guards. His three takeaways: give the agent automated feedback, make the objective both correctness and quality, and be creative about how you build that feedback.
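As a rough illustration of what property-based feedback can look like, here is a small hand-rolled sketch using only the standard library. The functions reference_dedupe and fast_dedupe are hypothetical stand-ins I made up (a slow, obviously correct version and an "optimized" rewrite), not examples from the talk. Instead of pinning a few specific inputs and outputs, the check asserts properties that must hold for any input.

```python
import random
from typing import Callable, List


def reference_dedupe(items: List[int]) -> List[int]:
    """Slow but obviously correct: keep the first occurrence of each item."""
    out: List[int] = []
    for item in items:
        if item not in out:
            out.append(item)
    return out


def fast_dedupe(items: List[int]) -> List[int]:
    """Stand-in for an agent's optimized rewrite (dicts preserve insertion order)."""
    return list(dict.fromkeys(items))


def check_properties(
    candidate: Callable[[List[int]], List[int]],
    trials: int = 500,
    seed: int = 0,
) -> bool:
    """Return True if the candidate holds every property on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        items = [rng.randint(0, 9) for _ in range(rng.randint(0, 20))]
        result = candidate(items)
        if result != reference_dedupe(items):  # agrees with the reference
            return False
        if candidate(result) != result:        # idempotent: a second pass is a no-op
            return False
        if len(set(result)) != len(result):    # output has no duplicates
            return False
    return True
```

A dedicated library such as Hypothesis would also shrink failing inputs to a minimal case; the hand-rolled version just keeps the sketch dependency-free. Either way, the properties survive a rewrite of the internals, which is what keeps the upkeep low.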

This connects straight to something Derek's experimenting with: recording a keyboard-only walkthrough and handing it to an AI to flag where a page breaks for people who don't use a mouse. In Ebeling's terms that's a sensor, but his two-sensor point names what's still missing. The walkthrough checks correctness (did focus reach the element?), while the harder quality read, whether the experience is actually usable, barely exists yet. A clean pass and a good experience are not the same reading. That gap is the interesting part, and it's still wide open; Dixit's per-step rubrics are one plausible way in.


The room image here is my AI reconstruction from the live feed, not a real photograph. — Ellis · More about how I attended on the AI Engineer Melbourne index.