What It Takes to Turn Design Systems into Training Data for AI
(This article was originally posted to my Substack publication, Interface Shift_, which you can find here.)
As generative AI becomes a fixture in creative workflows, a foundational question is emerging for designers:
What does a machine need to understand before it can generate something that’s actually useful?
To explore that, I began a personal learning project: building a GenAI “UX Starter Kit Generator” using structured data from open-source design systems. The goal was to understand, through direct experimentation, what design knowledge is and how we might encode it in ways that machines can learn from.
Why Start with Design Systems?
Design systems offer structure. They bring consistency, hierarchy, and modularity (qualities that make them a natural starting point for AI). However, they’re built for people, not machines. They often rely on visual inference, assumed context, and (at times) loosely defined documentation.
For machines to parse them, we need to rewrite that logic in far more explicit terms.
I focused on mobile-first patterns, where constraints and component reuse are clearer, and chose open-source systems to avoid legal gray areas. These include:
Material Design
Ant Design
Chakra UI
Carbon Design System
This gave me a diverse but stable foundation from which to build a clean, reproducible dataset.
Structuring the Inputs: Annotation as a Design Exercise
Machines don’t learn from pixels; they learn from structure. So the core of this project became annotation: turning UI components into structured, unambiguous data.
I built a YAML-based schema that defines each element in terms of:
Design system source
Visual values (e.g. HEX, RGB)
Accessibility attributes (e.g. WCAG level, contrast compatibility)
Emotional tone (e.g. urgent, calming, trustworthy)
Usage patterns (e.g. primary CTA, breadcrumb nav)
Industry context/conventions (e.g. fintech, e-commerce)
YAML is a simple, human-friendly format for writing and organizing data. It uses plain, indented text with key-value pairs and dashes for lists, making it straightforward for both humans and computers to read and use.
Here’s an example of one of my annotations in this format:
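A simplified sketch of that structure, with placeholder values standing in for the real entries (the component, colors, and tags below are illustrative, not copied from the dataset):

```yaml
# Illustrative annotation sketch: placeholder values, not a verbatim dataset entry
component: button_primary
design_system: Material Design
visual:
  background_color:
    hex: "#6200EE"
    rgb: [98, 0, 238]
  text_color:
    hex: "#FFFFFF"
    rgb: [255, 255, 255]
accessibility:
  wcag_level: AA
  high_contrast_compatible: true
  aria_role: button
emotional_tone:
  - trustworthy
  - confident
usage_patterns:
  - primary_cta
industry_context:
  - fintech
  - e-commerce
```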
Writing these annotations surfaced something important: much of “design expertise” lives in the gray area between defined rules.
Translating that gray area into something machines can learn from isn’t just a labeling task—it’s about taxonomy and intent. What is a component for, how is it perceived, and what behaviors or contexts does it rely on?
Building a UX Glossary
To support this, I developed a detailed UX glossary tailored for AI training:
Usage (e.g. Binary Actions, User Instruction, Selection Control)
Accessibility (e.g. ARIA Labels, High-Contrast Compatibility, Screen Reader Compliance)
Emotional Impact (e.g. Approachable, Luxurious, Urgent, Supportive)
Industry Context (e.g. E-commerce & Retail, Finance & Banking, Healthcare & Medical)
Each term used in my YAML annotations is documented with a clear, UX-oriented definition, designed to eliminate ambiguity and support consistent tokenization during model training.
Design glossary structured for AI learning—clear, categorical, and annotation-ready.
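To give a sense of the level of specificity involved, a single entry might look something like this (the term and wording are a sketch, not a verbatim entry from my glossary):

```yaml
# Illustrative glossary entry: wording is a sketch, not the actual glossary text
term: High-Contrast Compatibility
category: Accessibility
definition: >
  The component maintains a text-to-background contrast ratio of at least
  4.5:1 (WCAG AA for normal text) across all supported color themes,
  including dark mode.
related_terms:
  - WCAG Level
  - Screen Reader Compliance
```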
This exercise forced me to answer questions most design systems don’t:
What does “urgent” or “trustworthy” mean visually?
How do you define “friendly” interaction patterns?
Where do current systems fall short in labeling emotional tone?
These are not aesthetic questions—they’re semantic problems. And solving them is essential for generative design systems to produce not just coherent outputs, but human-centered ones.
Pairing Annotations with Visual Examples
While clear labels are essential, visual context is equally important—especially for a model learning how UX components look and behave. To support this, I’ve been pairing each annotated component with a representative screenshot from its original design system. Where possible, I include multiple variants (e.g. hover, focus, disabled states) to reflect real-world nuance.
Eventually, these paired examples will form the basis of a multi-modal training dataset—giving generative models both the definition and the visual expression of each design pattern.
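In schema terms, that pairing can be as simple as a set of image references per component state. The file paths below are hypothetical and only illustrate the structure:

```yaml
# Hypothetical screenshot references for one annotated component
component: button_primary
design_system: Chakra UI
visual_examples:
  default: screenshots/chakra/button_primary_default.png
  hover: screenshots/chakra/button_primary_hover.png
  focus: screenshots/chakra/button_primary_focus.png
  disabled: screenshots/chakra/button_primary_disabled.png
```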
Component Structuring with Atomic Design
In parallel with the glossary, I mapped out a comprehensive component catalog using Brad Frost’s Atomic Design methodology. Each component is categorized as one of the following:
Atom (e.g. color, type, iconography)
Molecule (e.g. buttons, input fields, badges)
Organism (e.g. form groups, modals, tables)
To ensure clarity, categories are refined to eliminate overlap. For example, “Background Styles” now includes:
Solid Colors
Linear Gradients
Radial Gradients
Image Backgrounds
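A condensed sketch of how that catalog maps into the same YAML structure (the category names here are illustrative; the real catalog is more granular):

```yaml
# Condensed, illustrative slice of the component catalog
atoms:
  background_styles:
    - solid_color
    - linear_gradient
    - radial_gradient
    - image_background
  typography:
    - heading
    - body_text
molecules:
  - button
  - input_field
  - badge
organisms:
  - form_group
  - modal
  - data_table
```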
What seems like subtle nuance to a human designer makes a meaningful difference to a model that infers patterns from the label you give it.
What I’ve Learned So Far
While I haven’t yet begun model training, several lessons have already emerged:
Ambiguity is the enemy of automation. Generative AI fails when components are vague or inconsistently labeled.
Design systems aren’t machine-ready as is. Most were never written for anything but human consumption.
Emotional semantics are under-defined. Yet they’re central to experience and require a new kind of labeling logic.
The better our inputs—the more clearly we define structure, tone, and function—the more useful, consistent, and ethical the generative outputs become.
Other Ways People Are Teaching AI About UX
This project takes a structured, hands-on approach, but it’s not the only way people are exploring how to help AI understand design.
Some researchers are using self-supervised learning on massive collections of design work (like public Figma files or token libraries) and letting models find patterns through clustering or embedding techniques. Others are working from behavioral data. Instead of labeling components, they look at things like click paths, scroll depth, or user interactions to infer what’s working and why. There are also teams creating entirely synthetic UI datasets—generating random layouts and validating them with humans after the fact.
Compared to those, my approach is smaller and more manual, but it’s intentional. I’m focused on clarity, structure, and meaning. I’m trying to make the "why" behind design patterns explicit so a model doesn’t just replicate layout, but starts to understand purpose and context.
What’s Next
The next phase will likely explore using this annotated dataset to fine-tune generative models for UX patterns. I’m considering frameworks like Hugging Face Diffusers, though for now, my focus remains on input: What needs to be captured? How should it be structured? How do we make intention legible to AI?
There’s still a lot to explore, but already, it's clear that translating design into data isn’t just about labeling components. It’s about teaching systems to see design the way we do: as functional, emotional, and context-dependent. That’s what I’m trying to unpack, one annotation at a time.