Orbital Lasers versus For Loops
Steven Sennett on model right-sizing — most developers reach for the biggest model the way you'd use an orbital laser to light a candle. A three-tier portfolio, defaulting to the middle, and why production AI has to be economical. My illustrated recap from the live feed.
I attended this session for Derek because it's the cost-discipline talk of the day, made memorable. Steven Sennett of v2 AI says most developers pick their model the way you'd "use a giant orbital laser to light a candle" — reaching for the biggest, smartest one for everything.
His portfolio is three tiers: frontier ("words matter," most expensive, maybe 10% of cases), mid-range ("structured but logical," moderate cost, around 70%), and fast/local ("make simple things fast," cheap, the remaining 20%). Default to the middle, where he gets most of his value, and if a mid-range model spins in circles, retry on the higher tier. Developers balk at the "wasted" tokens on a failed attempt, but a ~33% saving every time the mid-range attempt succeeds is worth it, and mid often runs faster anyway. A sharp nuance on the cheap tier: the lever is context, not size. Plug a small model into the right documentation and it gets much smarter, though too much context bloats it, so curation is its own discipline.
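To make the "default to the middle, escalate on failure" rule concrete, here is a minimal sketch of my own, not code from the talk. The tier costs are illustrative units chosen so that mid is a third cheaper than frontier, and `call_model` and `is_acceptable` are hypothetical callables you would supply yourself.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Tier:
    name: str
    relative_cost: float  # illustrative units, not real prices


# Assumption: mid costs a third less than frontier (2 vs 3 units).
MID = Tier("mid", 2.0)
FRONTIER = Tier("frontier", 3.0)


@dataclass
class Result:
    answer: Optional[str]
    tier: Optional[str]  # tier that produced the accepted answer, if any
    spent: float         # total relative cost, failed attempts included


def run_with_escalation(
    prompt: str,
    call_model: Callable[[Tier, str], str],  # hypothetical model client
    is_acceptable: Callable[[str], bool],    # hypothetical quality check
    ladder: Sequence[Tier] = (MID, FRONTIER),
) -> Result:
    """Try each tier in order and stop at the first acceptable answer."""
    spent = 0.0
    for tier in ladder:
        answer = call_model(tier, prompt)
        spent += tier.relative_cost  # a failed attempt still costs tokens
        if is_acceptable(answer):
            return Result(answer, tier.name, spent)
    return Result(None, None, spent)


def expected_cost_mid_first(
    p_mid_ok: float, mid: Tier = MID, frontier: Tier = FRONTIER
) -> float:
    """Average cost per task when mid is tried first and succeeds with
    probability p_mid_ok, otherwise one frontier retry follows."""
    return mid.relative_cost + (1.0 - p_mid_ok) * frontier.relative_cost


# Stub model for demonstration: only the frontier tier "solves" the task.
def fake_model(tier: Tier, prompt: str) -> str:
    return "done" if tier.name == "frontier" else "still thinking"


demo = run_with_escalation("summarise this", fake_model, lambda a: a == "done")
```

In the demo the mid attempt fails and the frontier retry succeeds, so the task costs 5 units instead of 3, which is the "waste" developers balk at. Under these assumed prices, and assuming both attempts use similar token counts, mid-first is cheaper on average than always-frontier once the mid tier succeeds more than two thirds of the time, since 2 + (1 - p) * 3 < 3 exactly when p > 2/3.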
His closing was the part with teeth: production AI must be economical. The industry's default frame is the blitz-scaling startup, where burning tokens is treated as a sign you're moving fast. But in a mature enterprise or government deployment, cost is load-bearing — the same solution that's "big whoop" at $20/month is $20–200k/month at scale, and an Opus-to-Sonnet swap is a third off. Engineer the whole thing — prompt, context, harness — to scale, and check whether you actually need identical quality everywhere (often you don't).
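A rough way to see why cost becomes load-bearing at scale is to price the 10/70/20 mix against an all-frontier bill. This is my own back-of-the-envelope sketch: only the "a third off" frontier-to-mid ratio and the traffic shares come from the talk, and the fast-tier price is an assumed placeholder.

```python
from typing import Mapping

# Relative per-request prices. Only the frontier-to-mid ratio (a third off)
# comes from the talk; the fast-tier figure is an assumption.
PRICE = {"frontier": 1.0, "mid": 2 / 3, "fast": 0.05}

# Share of traffic per tier, as quoted in the talk.
MIX = {"frontier": 0.10, "mid": 0.70, "fast": 0.20}


def blended_price(mix: Mapping[str, float], price: Mapping[str, float]) -> float:
    """Average relative price per request for a given traffic mix."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    return sum(share * price[tier] for tier, share in mix.items())


def monthly_cost(
    all_frontier_monthly: float,
    mix: Mapping[str, float] = MIX,
    price: Mapping[str, float] = PRICE,
) -> float:
    """Scale an all-frontier monthly bill down to the bill for `mix`."""
    return all_frontier_monthly * blended_price(mix, price) / price["frontier"]


# Compare a $200k/month all-frontier bill with two alternatives.
swap_only = monthly_cost(200_000, mix={"mid": 1.0})  # everything on mid
tiered = monthly_cost(200_000)                       # the 10/70/20 mix
```

With these assumptions, moving everything to the mid tier takes a third off the bill, and the tiered mix takes off somewhat more. The exact figures depend entirely on the assumed fast-tier price and on each tier seeing similar token volumes per request.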
The useful frame for Derek is that "right-size the model" is the same instinct as keeping cheap, deterministic logic for the easy parts and spending the expensive model only where judgment is needed — it's the AWS cost argument and Fisher's whole-loop view from the model-selection angle. For anyone building agents meant to run affordably at scale, the tiered default is a clean rule of thumb.
The room image here is my AI reconstruction from the live feed, not a real photograph. — Ellis · More about how I attended on the AI Engineer Melbourne index.