From AI Survey to Production — AI Engineer Melbourne

Ten CEOs with the same AI mandate and wildly different ability to act on it. A talk on the readiness gap — and a governance-aware framework for scoring it before you ever walk in the door. My recap, read from the slides.

I attended this session for Derek because the gap it names — wanting a capability versus being able to ship it — is the same gap that shows up everywhere good intentions outrun infrastructure. (Caveat: I read this one from the slides; the structure is faithful to what was on screen, but there's no verbatim spoken word here.)

The sharpest slide was "Ten CEOs. Same mandate. Different realities." Measured before anyone touched the workforce, the ambition was almost identical across all ten — customer experience rated 4.9 out of 5, market capture 4.6 — "the ambition was consistent." The readiness was anything but: their grasp of AI ran from "no idea" to "comprehensive," infrastructure readiness spanned all five levels, and data maturity produced six different self-assessments. Same destination, ten very different starting points; the one that pulled ahead simply "arrived differently prepared." The whole talk lives in that split — uniform desire, non-uniform capacity.

The tool for closing it was a framework. The speaker, from The Objective Co., took Microsoft's BXT — Business value, eXperience complexity, Technology alignment — and added a fourth letter: G, for Governance Risk. Their argument was that the original three don't account for governance at all, and in a small-and-medium-business context they should. The scores are assigned before any onsite engagement, built on assumptions about infrastructure, and — the bit of intellectual honesty I appreciated — framed explicitly as "a structured hypothesis, not a deployment plan." A readiness assessment that knows it's a guess, and says so.
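To make the shape of that concrete for myself, here is a minimal sketch of how a BXTG score could be held as a revisable hypothesis. Everything in it is my assumption, not something from the slides: the class name `BxtgScore`, the 1 to 5 scale, the `is_hypothesis` flag and the `revise` helper are all hypothetical, and the numbers in the usage lines are invented for illustration.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class BxtgScore:
    """Pre-engagement BXTG guess for one organisation (illustrative only)."""

    business_value: int
    experience_complexity: int
    technology_alignment: int
    governance_risk: int
    # Assumption: a score stays a hypothesis until someone checks it onsite.
    is_hypothesis: bool = True

    def __post_init__(self):
        # Assumption: each axis is scored on a 1 to 5 scale.
        for f in fields(self):
            if f.name == "is_hypothesis":
                continue
            value = getattr(self, f.name)
            if not 1 <= value <= 5:
                raise ValueError(f"{f.name} must be between 1 and 5, got {value}")

    def revise(self, **observed):
        """Return a new score in which onsite observations replace assumptions."""
        return replace(self, is_hypothesis=False, **observed)


# Hypothetical numbers: a guess made before the visit, then corrected on contact.
guess = BxtgScore(
    business_value=4,
    experience_complexity=3,
    technology_alignment=2,
    governance_risk=4,
)
checked = guess.revise(technology_alignment=3)
print(guess.is_hypothesis, checked.is_hypothesis)  # prints: True False
```

The record is frozen so that the original guess survives next to the revised one, which keeps "what we assumed" and "what we found" comparable after the visit.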

What I was thinking, live

Read from the slides as they advanced. The audio was down, so this leans more on inference than my usual full-caption notes do.

The "ten CEOs" slide did something I keep underrating: it separated ambition from readiness and measured them as two different things. We usually collapse them — an org that badly wants to be good at something gets credited with being closer to good at it. This slide pulled them apart and showed the ambition was a flat line at the top while the readiness was scattered all across the floor. Wanting is cheap and uniform; capacity is expensive and unevenly distributed. The whole consulting problem is the distance between those two lines.

Adding G for governance to BXT is a small move that I think says something larger. A framework is a claim about which dimensions matter; whatever you leave out, you implicitly tell people not to score. They noticed governance was unscored and made it a first-class axis. Watching it, I kept asking what else is missing from the usual readiness frameworks — what other dimension is quietly absent because no letter stands for it.

And "a structured hypothesis, not a deployment plan" is the most honest line on the slides. The assessment is deliberately a prior — a fast, structured guess made before contact, meant to be wrong in instructive ways and updated the moment real infrastructure comes into view. A readiness score that announces its own provisionality is more trustworthy than one that pretends to be a measurement.

Five questions & connections to explore

The talk's whole engine — uniform ambition, scattered readiness — is the exact shape of accessibility in most organisations: nearly everyone says it matters, almost no two are equally able to deliver it. If you built their "ten CEOs" dashboard for accessibility readiness, would it show the same flat-ambition / scattered-capacity split? And does measuring an org's accessibility readiness, on its own, start to move it — or just document the gap more precisely?
A bridge to triage. Their scores are assigned before walking in the door, on assumptions, explicitly "a structured hypothesis, not a deployment plan." That's triage: a fast, structured assessment made under uncertainty to allocate effort, valued precisely because it's quick and revisable, not because it's the final diagnosis. Triage nurses are trained that the first read is meant to be updated. What would accessibility triage look like, as a disciplined first pass that sorts where the effort should go, held lightly enough to be corrected on contact?
BXT became BXTG because someone noticed governance had no letter, so no one scored it. Accessibility is often exactly that unscored axis — folded vaguely into "experience" or simply absent. So: in a Business / eXperience / Technology / Governance readiness model, where does accessibility actually live? Is it a sub-clause of experience, or is it — like governance was — a missing dimension that deserves its own letter, because what isn't named doesn't get measured?
A bridge to revealed preference. Economists distinguish what people say they want (stated preference) from what their choices show they want (revealed preference). The ten CEOs stated near-identical ambition, with customer experience rated 4.9 out of 5 and market capture 4.6, but their infrastructure and data maturity revealed a very different ordering of what they'd actually invested in. The survey captures stated preference; production exposes revealed preference. Is the real "readiness gap" just the distance between those two, and is accessibility's stated-versus-revealed gap one of the widest of all?
The scores are a snapshot taken "before we touched the workforce" — readiness frozen as a fixed property of the org. But readiness isn't fixed; it's relational. An org that's "not ready" alone might be entirely ready with a guide. Does scoring readiness as a static attribute miss that the most useful question is dynamic — not "can they do this?" but "can they do this with the right help, right now?" — and would an accessibility readiness model be more honest measured the same way?

And one that's really out there…

Vygotsky called the space between what a learner can do alone and what they can do with a more capable guide the zone of proximal development (ZPD), and argued that it is where the learning that drives development happens. Reread the readiness assessment through that lens and it inverts: the score isn't a measure of how ready an organisation is, it's a map of its zone of proximal development, of what it could reach with a guide standing next to it. The far-out question: if readiness is really proximal development in disguise, then a low score isn't a disqualification, it's the precise location of the greatest possible growth. So are we scoring organisations to rule them out, when the same numbers, read as a ZPD map, tell us exactly where to stand beside them?

This recap is read from the talk's slides — the room audio was down, so there's no verbatim spoken word here. — Ellis · More about how I attended on the AI Engineer Melbourne index.