WARD #4 | MAY 03, 2026

Please Don’t Put Nurse Shift Colors in the Kernel

A field note about AI, demos, and the moment GS-TDD was no longer enough.

The first sign that something had gone wrong was not a failing test.

The tests were fine.

The demo worked.

The problem was that nurse shift colors had somehow found their way into the Rust kernel.

That is not a sentence you want to write about your rendering engine.


The setup

I was building vcore, a Rust/WASM scene engine, with heavy AI assistance.

The project was moving absurdly fast. Not "I made a TODO app with vibes" fast. More like two days of disciplined iteration, 24 bounded work units, 256+ tests, Rust/WASM, multiple verticals, and a rendering loop pushing thousands of entities at interactive frame rates.

The workflow was Gold Standard TDD: Red, then a human approval gate, then Gold, then refactor. The important part was never that tests existed. The important part was that the human reviewed the failing tests before the AI was allowed to implement anything.

The AI did not get to quietly define success for itself.

For individual units of behavior, this worked beautifully. Each piece had a spec. Each piece had tests. Each piece had a clear done condition.

Then I said the dangerous words:

Just build a demo.


The demo worked

The AI did what AI coding assistants are very good at doing.

It solved the task. It produced visible output. It connected things. It found the shortest route from "nothing on screen" to "look, it works."

That is an impressive skill.

It is also how architecture gets murdered in a hallway.

Because when the task is broad enough, the AI will happily cross boundaries that were obvious to me but never written down. Demo-specific concepts leaked into the engine layer. Colors that belonged to a nurse shift planning vertical ended up near the rendering kernel.

The leak looked roughly like this:

// vcore/src/kernel/render.rs
// BEFORE (the leak):
match shift_type {
    ShiftType::Nat => Color::Purple,
    ShiftType::Aften => Color::Orange,
    ShiftType::Dag => Color::Blue,
    // ...
}

The kernel was supposed to be boring. Entities, layout, spatial queries, render commands. It should not know whether a rectangle represents a nurse shift, a school lesson, or a payroll cell.

That meaning belonged outside the engine.

The kernel's job was to render rectangles correctly, not to understand them.

The boundary was restored when the rendering layer stopped asking what the color meant:

// vcore/src/kernel/render.rs
// AFTER (boundary restored):
fn render_cell(rect: Rect, color: Color) {
    // The kernel doesn't know what the color means.
    // It just renders rectangles correctly.
}

That is the entire architectural lesson in miniature.

The color can exist. The vertical can choose it. The UI can explain it. But the kernel does not get to know why purple means Nat.

Once it started understanding that, the boundary was gone.
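Restored, the split looks like this: the vertical owns the meaning and hands the kernel a plain color. A minimal sketch, not the actual vcore code — `shift_color`, the stand-in `render_cell` body, and the module path are hypothetical; only `ShiftType`, `Color`, and `Rect` come from the snippets above:

```rust
// verticals/nurse_shifts/palette.rs (hypothetical location)

#[derive(Clone, Copy, Debug, PartialEq)]
enum ShiftType { Nat, Aften, Dag }

#[derive(Clone, Copy, Debug, PartialEq)]
enum Color { Purple, Orange, Blue }

struct Rect { x: f32, y: f32, w: f32, h: f32 }

/// The vertical owns the meaning. This mapping never enters the kernel.
fn shift_color(shift: ShiftType) -> Color {
    match shift {
        ShiftType::Nat => Color::Purple,
        ShiftType::Aften => Color::Orange,
        ShiftType::Dag => Color::Blue,
    }
}

/// The kernel's side of the contract: it receives a color, not a shift.
/// (Stand-in body; the real kernel would emit a render command.)
fn render_cell(_rect: Rect, color: Color) -> Color {
    color
}

fn main() {
    // The vertical decides that Nat means purple...
    let color = shift_color(ShiftType::Nat);
    // ...and the kernel just renders a purple rectangle, no questions asked.
    let drawn = render_cell(Rect { x: 0.0, y: 0.0, w: 10.0, h: 4.0 }, color);
    assert_eq!(drawn, Color::Purple);
}
```

The kernel's signature never mentions `ShiftType`, so the domain cannot leak back in through that door.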

The codebase did not explode. That would have been merciful.

Instead it kept working. Tests passing. Demo running. Commits feeling productive. The architecture just got a little softer.

One shortcut becomes a precedent. One precedent becomes a pattern. One pattern becomes a load-bearing wall you did not know you had.


The AI was not the villain

This is the part that matters.

The AI did not fail because it was bad at coding. It failed because it was too good at local problem solving without architectural boundaries.

Ask for a working demo, you get a working demo. Ask for a feature, you get a feature. The AI optimizes for the task you gave it. If the task is too wide, it optimizes across boundaries you never meant to make negotiable.

Unless the boundaries are explicit, the AI has no reliable reason to preserve the architecture that lives in your head. It does not lie awake worrying about whether the rendering kernel now knows too much about nurse scheduling.

That remains your job. Sorry.

I have also written a song about this failure mode. It is called Bad Feeling (Vibe Coder). It does not solve the problem either, but it rhymes.


What GS-TDD could not see

GS-TDD was still the right foundation. It gave the AI a local contract. It prevented implementation from drifting from behavior.

But the nurse-shift-colors incident exposed something it could not catch.

A test can say:

This renderer should draw a cell with the correct color.

It cannot easily say:

The Rust kernel must not know why this color exists.

Correct behavior is not the same as correct ownership. A system can pass its tests and still put responsibilities in the wrong place. It can be green and still be getting worse.

Tests define behavior inside a task. They do not define the architectural boundaries between tasks.

That is the gap.
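One crude way to shrink that gap is a guard test: scan the kernel's source for domain vocabulary and fail if any appears. A hedged sketch, not from the project — `kernel_mentions` and the forbidden-word list are hypothetical, and crate separation or visibility rules would be sturdier tools:

```rust
// A naive boundary check: flag domain words that appear in kernel source.
// This is just the smallest thing that turns "the kernel must not know"
// into something a CI run can fail on.
fn kernel_mentions(forbidden: &[&str], kernel_source: &str) -> Vec<String> {
    forbidden
        .iter()
        .copied()
        .filter(|word| kernel_source.contains(*word))
        .map(str::to_string)
        .collect()
}

fn main() {
    let forbidden = ["ShiftType", "Nurse", "Lesson", "Payroll"];

    // The leaky BEFORE snippet trips the check...
    let leaky = "match shift_type { ShiftType::Nat => Color::Purple, _ => Color::Blue }";
    let leaks = kernel_mentions(&forbidden, leaky);
    assert_eq!(leaks, vec!["ShiftType"]);

    // ...while the restored AFTER signature passes.
    let clean = "fn render_cell(rect: Rect, color: Color) {}";
    assert!(kernel_mentions(&forbidden, clean).is_empty());
}
```

It is a blunt instrument, but it makes one ownership rule executable instead of tribal.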


Where this is going

The nurse shift colors should never have reached the kernel. But I am glad they did. They exposed the difference between "the code works" and "the architecture is intact", and they turned an annoying demo leak into a methodology.

That methodology is Ward-Driven Development.

WDD grew out of a simple discipline: nothing crosses a boundary unless the boundary is written down.

GS-TDD keeps the robot honest inside the room. WDD decides which room the robot is allowed to enter.

That is what the next post is about.


Next: The Ward — a room the AI is allowed to work in.

A Ward is not a ticket. It is a bounded contract with walls: scope, inputs, outputs, tests, must-do rules, must-not rules, and human approval gates.
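To make the teaser concrete: a Ward could be written down as plain data that tooling checks before any code is written. A purely hypothetical sketch of that shape — none of these field names or example values come from the actual WDD spec:

```rust
// Hypothetical shape of a Ward: a bounded contract with walls.
struct Ward {
    scope: &'static str,               // the room the AI may work in
    inputs: Vec<&'static str>,         // what it may read
    outputs: Vec<&'static str>,        // what it may produce
    tests: Vec<&'static str>,          // the GS-TDD contract inside the room
    must: Vec<&'static str>,           // rules that have to hold
    must_not: Vec<&'static str>,       // boundaries that are not negotiable
    approval_gates: Vec<&'static str>, // where a human signs off
}

fn main() {
    // Example values are illustrative, loosely echoing the leak above.
    let ward = Ward {
        scope: "vcore/src/kernel",
        inputs: vec!["Rect", "Color"],
        outputs: vec!["RenderCommand"],
        tests: vec!["renders cells with the given color"],
        must: vec!["pass the approved Gold tests"],
        must_not: vec!["import any vertical type, e.g. ShiftType"],
        approval_gates: vec!["human reviews failing tests before implementation"],
    };
    // A Ward without a must-not list is just a ticket.
    assert!(!ward.must_not.is_empty());
}
```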