What Is design.md for Coding Agents?

design.md gives coding agents a structured way to understand design systems. Here is why that matters for AI-generated UI workflows.

By Dora 10 min read
I’m Dora. Last week I asked Claude Code to add three new screens to a side project. By screen two, the button radius had drifted, the headline font had quietly shifted, and the “secondary” gray was a different gray. Same prompt structure. Same model. Same session. The output just stopped caring about what came before.

This is the friction design.md for coding agents is trying to remove. It’s a plain markdown file that holds your design system in a shape an AI agent can read every time it generates UI — not as a one-off prompt, but as persistent context. Stitch reads it. Claude Code reads it. Cursor reads it. Anything else that picks up context files from a repo root reads it too.

What design.md Is and Why Google Labs Published It

File format and design-system role

A DESIGN.md file is two layers stacked into one document. The top is YAML front matter — colors, typography, spacing, rounded corners, components — written as structured tokens. Below that is markdown prose that explains what the tokens are for and how to use them. The tokens give an agent exact values. The prose tells it why those values exist.
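To make the two-layer idea concrete, here is a minimal sketch of how a tool might separate the token layer from the prose layer. It is not how Stitch or the official CLI does it; the function name `split_design_md` and the sample text are my own placeholders, and the YAML is returned as raw text rather than parsed.

```python
def split_design_md(text: str) -> tuple[str, str]:
    """Split a DESIGN.md-style document into (front_matter, prose).

    Hypothetical helper for illustration; not part of any official tooling.
    """
    lines = text.splitlines()
    # Front matter must open with a '---' line at the very top of the file.
    if not lines or lines[0].strip() != "---":
        return "", text
    # Scan for the closing '---' delimiter.
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            front = "\n".join(lines[1:i])
            prose = "\n".join(lines[i + 1:]).lstrip("\n")
            return front, prose
    raise ValueError("front matter opened with '---' but never closed")


# Placeholder document: the prose line is invented for the example.
SAMPLE = """---
name: Heritage
colors:
  primary: "#1A1C1E"
---

## Overview
Placeholder prose explaining what the tokens are for.
"""

front, prose = split_design_md(SAMPLE)
print(front)   # the raw YAML token layer
print(prose)   # the markdown prose layer
```

The point of the split is that the two halves get treated differently: the first can be validated mechanically, the second is read as guidance.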

Here’s roughly what the front matter looks like, lifted from the official DESIGN.md specification on GitHub:

```yaml
---
version: alpha
name: Heritage
colors:
  primary: "#1A1C1E"
  tertiary: "#B8422E"
  neutral: "#F7F5F2"
typography:
  h1:
    fontFamily: Public Sans
    fontSize: 48px
    fontWeight: 600
    lineHeight: 1.1
rounded:
  sm: 4px
spacing:
  sm: 8px
  md: 16px
---
```

Below that come ## Overview, ## Colors, ## Typography, and the remaining sections in plain markdown. Section order is fixed: Overview, Colors, Typography, Layout, Elevation & Depth, Shapes, Components, Do’s and Don’ts. Sections can be skipped, but the ones present have to appear in that order. Duplicate section headings are an error, and the file is rejected.
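The ordering rule is easy to express in code. The sketch below is my own approximation of the check described above, not the linter's implementation: `check_sections` is a hypothetical name, it assumes sections are level-two (`##`) headings, and it silently ignores headings outside the fixed list.

```python
import re

# Canonical order as listed in the text above.
SECTION_ORDER = [
    "Overview", "Colors", "Typography", "Layout",
    "Elevation & Depth", "Shapes", "Components", "Do's and Don'ts",
]


def check_sections(markdown: str) -> list[str]:
    """Return a list of problems; an empty list means the order is valid."""
    headings = re.findall(r"^##[ \t]+(.+?)[ \t]*$", markdown, flags=re.MULTILINE)
    problems = []
    seen = set()
    last_rank = -1
    for heading in headings:
        # Normalize curly apostrophes so "Do’s and Don’ts" matches the list.
        name = heading.replace("\u2019", "'")
        if name not in SECTION_ORDER:
            continue  # assumption: unknown headings are out of scope here
        if name in seen:
            problems.append(f"duplicate section: {name}")
            continue
        seen.add(name)
        rank = SECTION_ORDER.index(name)
        if rank < last_rank:
            problems.append(f"out of order: {name}")
        last_rank = max(last_rank, rank)
    return problems


print(check_sections("## Overview\n## Colors\n## Components\n"))
print(check_sections("## Colors\n## Overview\n"))
```

Skipped sections pass because the check only compares the relative rank of the headings that are present.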

The token reference syntax comes from the W3C Design Tokens Format Module: curly-brace paths like {colors.primary}. References must point to primitive values (not groups), except inside the components section, where composite references to typography tokens are permitted. Color values are sRGB hex only in the current spec. Display P3 and Oklch, both supported by the W3C DTCG 2025.10 stable spec, are not yet supported by DESIGN.md, which the repo flags openly with its version: alpha marker. Worth knowing if you maintain wide-gamut tokens.
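Here is a rough sketch of what resolving a curly-brace reference involves, under my own assumptions. The token dictionary mirrors the sample front matter, `resolve` is a hypothetical helper, and the sketch enforces the primitive-only rule everywhere, so it does not model the components exception for composite typography references.

```python
import re

# Tokens copied from the sample front matter; stored as a plain dict here.
TOKENS = {
    "colors": {"primary": "#1A1C1E", "tertiary": "#B8422E", "neutral": "#F7F5F2"},
    "rounded": {"sm": "4px"},
}

# A value that is exactly one reference, e.g. "{colors.primary}".
REF = re.compile(r"^\{([A-Za-z0-9_.-]+)\}$")


def resolve(value, tokens=TOKENS):
    """Follow a {path.to.token} reference down to a primitive value."""
    match = REF.match(value) if isinstance(value, str) else None
    if not match:
        return value  # already a literal such as "8px"
    node = tokens
    for key in match.group(1).split("."):
        if not isinstance(node, dict) or key not in node:
            raise KeyError(f"broken reference: {value}")
        node = node[key]
    if isinstance(node, dict):
        raise ValueError(f"reference points to a group, not a primitive: {value}")
    # Follow chained references; this sketch has no cycle detection.
    return resolve(node, tokens)


print(resolve("{colors.primary}"))
```

A reference like `{colors.missing}` raises `KeyError` here, which is roughly the situation the broken-ref lint rule is described as catching.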

The format ships with a CLI, @google/design.md on npm. Running npx @google/design.md lint applies seven rules to a parsed file. The two I hit most often: broken-ref (error severity) catches token references that don’t resolve, and contrast-ratio (warning) checks component backgroundColor/textColor pairs against WCAG AA’s 4.5:1 minimum. Others, such as missing-primary, orphaned-tokens, and the structural rules, surface issues most teams introduce within a week of editing the file. The CLI also has diff (token-level regressions between versions) and export (Tailwind v3 config, Tailwind v4 @theme CSS, or W3C DTCG JSON).
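For a sense of what a contrast check computes, the sketch below implements the standard WCAG 2.x contrast-ratio formula for six-digit sRGB hex colors. It is an independent illustration, not the CLI's code; the function names are mine, and the 4.5 threshold is the AA minimum quoted above.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a '#RRGGBB' color."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(fg: str, bg: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical) up to 21.0 (black on white)."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(fg: str, bg: str, minimum: float = 4.5) -> bool:
    """True if the pair meets the AA minimum for normal-size text."""
    return contrast_ratio(fg, bg) >= minimum


# Sample palette from the front matter: primary text on the neutral background.
print(passes_aa("#1A1C1E", "#F7F5F2"))
```

Two near-white colors, such as the neutral on pure white, fail the same check, which is the kind of pairing a warning is useful for.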

Why persistent visual context matters for coding agents

The problem is simple. An agent generating UI has no memory of your design system unless you give it something structured to read. You can describe your palette in a prompt, get a button back, then ask for a card and watch the spacing logic reset. The model isn’t bad at writing code. It has no anchor.

design.md is the anchor. Stitch passes it as context on every generation request. Claude Code and Cursor pick it up the same way they pick up CLAUDE.md or AGENTS.md — a file in the repo root the agent reads before answering. Google’s framing in the open-source announcement is that this lets agents “know exactly what a color is for” rather than guessing intent from a prompt.

I tested this on two projects. Procedure: drop a DESIGN.md at the root, regenerate the same five screens that had drifted earlier (landing, pricing, sign-up, dashboard, settings), then count token violations against the front matter. With the file present, Claude Code produced 5/5 screens using the exact primary and tertiary hex values and the declared rounded.sm. Cursor produced 4/5: the dashboard kept overriding rounded.sm (4px) with rounded.md (8px) on cards, which the DESIGN.md prose hadn’t explicitly forbidden. Adding a “Do’s and Don’ts” line (“card corners use rounded.sm, never rounded.md”) fixed it on the next regen.
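Counting violations can be partly automated. The sketch below shows one way to do it for colors only, and it is an assumption about how such a check could look rather than a record of the exact procedure used above. The `color_violations` helper and the sample markup are hypothetical, and only six-digit hex literals are detected.

```python
import re

# Palette declared in the sample front matter.
ALLOWED_COLORS = {"#1A1C1E", "#B8422E", "#F7F5F2"}

# Matches six-digit hex literals only; shorthand and 8-digit forms are skipped.
HEX = re.compile(r"#[0-9A-Fa-f]{6}\b")


def color_violations(source: str, allowed=ALLOWED_COLORS) -> list[str]:
    """Return hex colors used in generated code that the palette does not declare."""
    allowed_upper = {c.upper() for c in allowed}
    found = {m.upper() for m in HEX.findall(source)}
    return sorted(found - allowed_upper)


# Invented snippet standing in for one generated screen.
screen = '<button style="background:#B8422E;color:#f7f5f2;border-color:#B9432F">'
print(color_violations(screen))
```

Comparison is case-insensitive, so a lowercase `#f7f5f2` still counts as the declared neutral. Radius and spacing checks would need a similar pass over the relevant properties.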

The Real Problem It Solves in AI UI Generation

Consistency across screens and iterations

What broke in the earlier session wasn’t one screen’s quality. Each individual screen looked fine. Screen two didn’t know what screen one had decided. Every generation started from a slightly different interpretation of “your brand.”

This is the failure mode design.md targets. The goal is not “make the UI prettier” but “make it the same UI across generations.” The tokens are normative. When the front matter says tertiary: "#B8422E", the agent has no room to interpret it as “a warm orange.” It’s that hex value or it’s wrong, and lint will say so.

For high-frequency workflows (five, ten, twenty screens a week with someone maintaining them), this matters more than first-output quality. One inconsistency per screen becomes a cleanup job at scale. Design tokens for coding agents, defined once in a file, kill that cleanup job before it starts.

Why prose plus tokens is stronger than tokens alone

I almost dismissed this part. Why write paragraphs of “Boston Clay is the sole driver for interaction” when the hex value is already in the YAML?

Because the agent uses the prose to make calls the tokens don’t cover. The tokens say tertiary is #B8422E. The prose says that color is for interaction only — not for decorative accents, not for headlines. When a prompt is ambiguous (“add a notification badge”), the prose decides whether the badge gets the interaction color or a neutral. In my Cursor run above, the missing prose constraint was exactly why the dashboard drifted.

The same logic applies to Do’s and Don’ts — explicit guardrails like “never use drop shadows on cards” or “always use sentence case for button labels.” Negative constraints carry weight pure tokens can’t express. This is a design spec for agents, not a CSS file.

The format isn’t bound to Stitch. The spec is Apache 2.0, the CLI is on npm, the DTCG export means tokens flow into any tool that reads the W3C standard, and the community is already moving. VoltAgent’s awesome-design-md collection of brand-inspired DESIGN.md files passed 71,000 GitHub stars within weeks of launch, and the design-md GitHub topic now lists extraction tools, Chrome extensions, and companion SKILL.md registries from outside Google entirely. Adoption is the thing that tells you a format is real.

Who Should Care About design.md

I want to be honest about the boundary, because design.md isn’t a universal fit.

It earns its place if:

  • You’re generating UI with coding agents at least weekly, across more than one screen
  • You have a design system — even a thin one — that you want preserved across generations
  • You work with more than one agent or tool and want the same brand context in all of them
  • You’re tired of pasting “remember our palette is X, Y, Z” into every prompt

It doesn’t earn its place if:

  • You generate one-off mockups and discard them
  • Your “design system” is whatever the agent produces this time
  • You’re inside a Figma-led workflow with a mature DTCG pipeline through Style Dictionary or similar — design.md is lighter than what you have, and you’d be downgrading
  • You need wide-gamut color (Display P3, Oklch) today; the current alpha spec is sRGB-only

For AI-native product teams running design.md alongside other AI workflow files (CLAUDE.md, AGENTS.md), it slots in cleanly. One markdown file per concern. No build step. No JSON schema to fight. The cost of trying it is one file in the repo root and a lint command.

For platforms building agent-driven generation surfaces — including unified AI generation layers that route requests across multiple models and need brand context to hold across each call — design.md is the closest thing the ecosystem has to a portable contract between a brand and an agent. According to the Google Labs announcement, the format was built specifically to be exportable and importable across tools. That portability is the point.

FAQ

What is design.md used for?

It’s a markdown file that gives coding agents persistent context about a design system — colors, typography, spacing, components, plus prose explaining how to apply them. The agent reads it every time it generates UI, so the output stays consistent across screens and sessions without you re-specifying brand rules in each prompt.

Is it only for Google Stitch workflows?

No. The format originated in Stitch but Google open-sourced the spec under Apache 2.0. Any AI tool that reads context files — Claude Code, Cursor, GitHub Copilot, Antigravity, Gemini CLI — can use it. The CLI exports to Tailwind config, CSS variables, or W3C DTCG JSON, so the tokens flow into non-agent tooling as well.

Why does AI-generated UI need a design-system file?

Because models have no memory of your brand between generations. Without a structured AI design-system file, every prompt re-interprets your design language from scratch. With one, the tokens act as hard constraints and the prose covers judgment calls. The difference shows up most clearly at scale: five screens generated with design.md hold their style; five screens generated without it drift.

Which teams should experiment with it first?

Teams already generating UI with coding agents weekly. Solo developers and small product teams running Claude Code or Cursor get the fastest payoff — drop the file in, regenerate, see the consistency. Larger orgs with mature Figma + Style Dictionary pipelines should treat design.md as a complement, not a replacement: use it to give agents a digestible subset of the existing system.

Conclusion

design.md isn’t a revolutionary file format. It’s a markdown file with YAML at the top. That’s the point — the format LLMs read best is the one they were trained on the most, and that’s plain text.

What it actually does is shift the question from “how do I describe my design in this prompt” to “where does my design live so every agent can read it.” One file, one location, every tool that picks it up gets the same answer. One fewer thing to re-specify. Sounds small. Adds up fast.

I’ve had it in two projects for a week. It works. The long-term question, whether teams maintain these files with the same rigor as a real design system or let them rot the way READMEs rot, is still open. Run it on a project of your own. That’ll tell you more than anything I say.
