AI-assisted design system component identification
- The plan
- 5 production sites · detect the design system · inventory components across pages · score complexity
- Result
- inventories of 14, 37, and 23 component families on three of the five sites; no stable vocabulary found on the other two
- Notes
- re-run against sites with established design systems; review cases where no formal design system is detectable
Plain language summary
Large websites tend to be built from a small set of components used over and over. Can an AI agent accurately name those components from the outside? This was the first test of whether that is possible, because reliable identification is what would unlock downstream accessibility agents.
It worked on the well-structured sites and found nothing on the others. But "found nothing" was ambiguous in a way a first pass can't resolve, and that ambiguity is what turned into the next experiment.
The question
Can AI reliably detect whether a production site is built on a design system, and inventory the component families it uses across pages? Identification is the upstream step: putting it to work for accessibility down the road depends on getting it right first. This was a first pass, a feasibility check rather than a powered test.
How I tested it
Five production sites, run February 19 and 20, 2026. The mechanism came together in two passes. First, working interactively: crawl pages, cluster repeated structures, name candidate component families. Then a repeatable, deterministic extraction script, with per-site screenshots and component inventories as the output. Site names are withheld. What matters is the shape: large production sites, a mix of likely-systematized and likely not.
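The "cluster repeated structures" step can be illustrated with a minimal sketch. It is not the extraction script used here, only one plausible shape for it, assuming pages are already fetched as HTML strings. The signature format (a tag plus its direct child tags) and the names `SignatureParser`, `page_signatures`, and `candidate_families` are all hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are never pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class SignatureParser(HTMLParser):
    """Collects a coarse structural signature for each closed element.

    Assumption: a signature is "tag>child+child+...", built from direct
    children only. Assumes reasonably well-formed markup.
    """

    def __init__(self):
        super().__init__()
        self.stack = []       # open elements as [tag, list of child tags]
        self.signatures = []  # one string per closed element that has children

    def handle_starttag(self, tag, attrs):
        if self.stack:
            self.stack[-1][1].append(tag)
        if tag not in VOID_TAGS:
            self.stack.append([tag, []])

    def handle_endtag(self, tag):
        # Ignore stray closing tags that match nothing currently open.
        if all(open_tag != tag for open_tag, _ in self.stack):
            return
        # Pop until the matching element, tolerating unclosed inner elements.
        while self.stack:
            open_tag, children = self.stack.pop()
            if children:
                self.signatures.append(f"{open_tag}>{'+'.join(children)}")
            if open_tag == tag:
                break


def page_signatures(html):
    parser = SignatureParser()
    parser.feed(html)
    parser.close()
    return parser.signatures


def candidate_families(pages, min_pages=2):
    """pages: dict of url -> html. Keeps signatures seen on at least min_pages pages."""
    seen = defaultdict(set)
    for url, html in pages.items():
        for signature in page_signatures(html):
            seen[signature].add(url)
    return {sig: urls for sig, urls in seen.items() if len(urls) >= min_pages}
```

Naming the families is left out on purpose: in this sketch a family is just a repeated signature, and attaching a human label to it would be a separate step.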
What happened
On three of the five sites, the mechanism surfaced a recurring component vocabulary, with inventories of 14, 37, and 23 component families repeating consistently across pages. On the other two it found no stable recurring vocabulary.
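One way to make "stable recurring vocabulary" concrete is to measure how many pages each family appears on. The sketch below assumes per-page inventories are available as sets of family names. The function names and both thresholds (`min_coverage`, `min_families`) are illustrative assumptions, not values used in this run.

```python
def family_coverage(inventories):
    """inventories: dict of url -> set of family names.

    Returns family name -> fraction of pages on which it appears.
    """
    total = len(inventories)
    if total == 0:
        return {}
    counts = {}
    for families in inventories.values():
        for name in families:
            counts[name] = counts.get(name, 0) + 1
    return {name: count / total for name, count in counts.items()}


def stable_vocabulary(inventories, min_coverage=0.5, min_families=3):
    """Returns the sorted families that recur widely, or [] if too few do.

    Both thresholds are hypothetical defaults chosen for illustration.
    """
    coverage = family_coverage(inventories)
    stable = sorted(name for name, share in coverage.items() if share >= min_coverage)
    return stable if len(stable) >= min_families else []
```

An empty result here means only that nothing cleared the thresholds on the pages sampled, which is a weaker statement than saying the site has no vocabulary.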
This was an identification pass only, with no accessibility assessment of the components it found. It was enough to show the mechanism works on light, well-structured sites, and to expose the obvious next question.
What it raised
Two of the five sites had "no design system found." A site can run on a real, published design system and still not look like one from the outside, depending on how it ships its components. That ambiguity, where "no system detected" might mean "no system" or might mean "a system the mechanism can't see," is exactly what a first pass can't settle.
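The ambiguity can be written down as a small decision table. This is a sketch of the reasoning, not part of the mechanism: the `Reading` enum and the `interpret` function are hypothetical names, and `known_system` stands for ground truth that was not available in this first pass.

```python
from enum import Enum
from typing import Optional


class Reading(Enum):
    CONFIRMED = "detected, and a published system is known to exist"
    DETECTED_UNVERIFIED = "detected, with no ground truth to check against"
    POSSIBLE_FALSE_POSITIVE = "detected, but no system is known to exist"
    MISSED = "not detected, although a published system is known to exist"
    CONSISTENT_WITH_NONE = "not detected, and no system is known to exist"
    INCONCLUSIVE = "not detected, with no ground truth to check against"


def interpret(detected: bool, known_system: Optional[bool]) -> Reading:
    """Combine the mechanism's output with ground truth, where there is any.

    known_system is None when nobody has established whether the site
    runs a design system.
    """
    if detected:
        if known_system is None:
            return Reading.DETECTED_UNVERIFIED
        return Reading.CONFIRMED if known_system else Reading.POSSIBLE_FALSE_POSITIVE
    if known_system is None:
        return Reading.INCONCLUSIVE
    return Reading.MISSED if known_system else Reading.CONSISTENT_WITH_NONE
```

With no ground truth, `interpret(False, None)` returns `Reading.INCONCLUSIVE`, which is where the two "nothing found" sites sit after this pass.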
That became its own experiment, and an instance of a loop I keep coming back to. Take the deterministic extraction script and open it up to a non-deterministic, LLM-driven pass to fill the holes the first run exposed. Then lock it back down and re-run against sites with known, published design systems, where the ground truth is fixed up front and the only question is whether the mechanism finds it.
Honest caveats
Five sites, two days, and exploratory. A feasibility probe, not a graded test, so read the numbers loosely:
- The inventories weren't scored against a known-correct list of each site's components.
- A count of 14 families means 14 the mechanism found, not 14 that the site's design system team would formally name.
- Where it reported no design system, that means none was detected, not that none exists.
Closing that gap is the re-run.
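Scoring an inventory against a known-correct list could look like the sketch below. It assumes the published design system's component names are available as a plain list and that matching on normalized names is good enough. Real matching would likely need aliases (for example "accordion" versus "disclosure"), which this does not attempt. `normalize` and `score_inventory` are hypothetical names.

```python
def normalize(name):
    """Lowercase and treat hyphens, underscores, and extra spaces as one space."""
    return " ".join(name.lower().replace("-", " ").replace("_", " ").split())


def score_inventory(found, expected):
    """Compare the families the mechanism found with a known-correct list.

    precision: share of found families that are in the published list.
    recall: share of the published list that the mechanism found.
    """
    found_set = {normalize(name) for name in found}
    expected_set = {normalize(name) for name in expected}
    matched = found_set & expected_set
    return {
        "matched": sorted(matched),
        "missed": sorted(expected_set - found_set),
        "extra": sorted(found_set - expected_set),
        "precision": len(matched) / len(found_set) if found_set else 0.0,
        "recall": len(matched) / len(expected_set) if expected_set else 0.0,
    }
```

The "extra" list is worth reading by hand rather than treating as error: a family the mechanism found but the published system does not name may be a real pattern the site uses outside its system.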
What's next
The re-run: the same mechanism, pointed at sites known to run a published design system, with an honest account of where it works and where it fails.