Currently open · Lab notebook

Pushing LLMs at autonomous game creation.

A solo lab on a question I find interesting: how far can a language model take game design when we stop adding scaffolding and start adding iteration? So far: 77 games, three pipelines tried, one retired, one in flight, one being designed.

Total games: 77
create-game: 41
simple-game-v0: 25
magnum-opus: 5

Field notes

7 entries
essay · 2026-05-12

The Pipeline — A Record Before the Pivot

We built a multi-agent game studio with eight specialists. It shipped 47 games. The architecture made sense in 2024, earned real wins, and accreted into the shape the field is now publicly post-mortem-ing. This is the record before we tore most of it down.

diary · 2026-05-09

Terracotta Courier (cycle 2) · diary #52

Magnum-opus thesis VALIDATED on first 4-session game. Zero IMMUTABLE drift across cycles 1 + 2; 5/5 cycle-1 IFC bullets preserved; cycle-2 thesis test (zone-2-screen-2 UNREACHABLE without double-jump) verified by simulator + designer paper trace. v3.7.8 trigger-predicate rule VINDICATED on first under-the-rule test — all 3 cycle-2 IFC bullets fired on specified predicates without broader-precondition drift.

diary · 2026-05-08

Terracotta Courier · diary #51

I see two viable proposal targets, ranked by ROI:

diary · 2026-05-07

Pin & Reach (Cycle 2) · diary #50

**THESIS: VALIDATED.** Cycle 2 shipped the un-sliced magnum-opus build (drops `-slice` qualifier): zone 2 "The Cliff Above" with 4 mixed-width segments (320/320/640/480 px), both cycle-2 verbs (place-piton, rope-swing) implemented per cycle-1 IMMUTABLE LOCK, camera horizontal-follow extension, save persistence v0.5, and **218 lines of ENGINE ACCRETION** graduated from cycle-1 game.html into `engines/vertical-climb-engine.html` (1766 → 1984, +12.3%). All 14 IMMUTABLE clauses CONFIRMED at cycle-2 ship (camera extended per RATIFIED cycle-2 mechanism; IMMUTABLE itself byte-identical to cycle-1 ship `b43066b6`). All 5 cycle-1 IFC bullets CONFIRMED-WELL with zero regression; both cycle-2 IFC bullets CONFIRMED-WELL. Falsifiability hypothesis structurally encoded in level data (Z2-S2 41.76 px gap > 40 px reach radius, bridgeable only via piton at exactly 40+40 px): geometry IS the test, not a verbal claim. The cycle-1 → cycle-2 IMMUTABLE locked-verb pattern WORKED end-to-end as the prophylactic for the IMMUTABLE-precondition-violation class (compass-apprentice cycle-2 Bug B canonical case). HYBRID engine-accretion order (4 candidates BEFORE P4.2 + 1 candidate AFTER P4.10) bounded regression risk while producing measurable thesis ROI. v3.7.8 Game-Internal Formula Preconditions Audit PAID OFF in its first cycle-2 application: it caught Z2-S4's summitRevealY at design-time before any code was written (cycle-1's same class was caught at QA-time). Substitute-signal protocol (geometric audit + runtime --play + designer hand-trace + IMMUTABLE byte-identity diff) matured to a 4-source playbook on its 5th consecutive successful application. Player-agent image-API failed for the 5th consecutive cycle (compass-apprentice ×2, lantern-descent cycle-2, pin-and-reach ×2); the framework-level reliability gap is now unmistakable.
PA5 telemetry-provenance MAJOR caught at P4.6 (developer hardcoded `pitonSfxLastPeakGain = 0.45` for the new cycle-2 PA instead of runtime-sampling) shows v3.7.1 discipline doesn't auto-extend to cycle-2 PA additions.
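
The Z2-S2 falsifiability claim reduces to a distance check, which can be sketched in a few lines of Python. This is an illustration under assumptions, not the studio's audit code: the grip coordinates, the piton position, and the function names are hypothetical, and only the 40 px reach radius and the 41.76 px gap come from the entry. The (12, 40) px offset was chosen because it happens to yield a 41.76 px gap; the real level data may differ.

```python
import math

REACH_RADIUS_PX = 40.0  # reach radius quoted in the diary entry


def within_reach(a, b, reach=REACH_RADIUS_PX):
    """True if point b lies within the reach radius of point a."""
    return math.dist(a, b) <= reach


def bridgeable(start, goal, pitons=(), reach=REACH_RADIUS_PX):
    """True if goal is reachable from start, optionally hopping via pitons."""
    frontier, seen = [start], {start}
    while frontier:
        here = frontier.pop()
        if within_reach(here, goal, reach):
            return True
        for p in pitons:
            if p not in seen and within_reach(here, p, reach):
                seen.add(p)
                frontier.append(p)
    return False


# Hypothetical grips: a (12, 40) px offset gives a gap of about 41.76 px.
lower_grip = (0.0, 0.0)
upper_grip = (12.0, 40.0)
gap = math.dist(lower_grip, upper_grip)

# Without a piton the gap exceeds the reach radius; a piton placed
# between the grips (hypothetical position) makes the hop possible.
no_piton = bridgeable(lower_grip, upper_grip)
with_piton = bridgeable(lower_grip, upper_grip, pitons=[(6.0, 20.0)])
```

Encoding the test as geometry means the check needs no runtime playthrough: if the level data changes so that the gap drops to 40 px or less, `no_piton` flips to `True` and the hypothesis is no longer being tested.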

diary · 2026-05-07

Pin & Reach · diary #49

Fourth magnum-opus and the first to ship as a true cycle-1 *slice* — a half-game whose cycle-2 verbs (place-piton, rope-swing) are pre-locked in IMMUTABLE rather than discovered later, and the studio's first vertical-climb engine lineage (1766 lines forking the base-game chassis but inverting the platformer verb model: discrete grip-to-grip transitions, no continuous velocity, no jump arc). Two unproven hypotheses tested simultaneously — that locking cycle-2 verbs early prevents the IMMUTABLE-precondition-violation class (compass-apprentice cycle-2 bug A/B), and that the slice tier can absorb a brand-new engine genre at the cost of two genre-mismatched ship gates (`--completable`, `--journey` both auto-skipped). The single MAJOR (Signature Moment choir trigger geometrically unreachable: `summitRevealY = 24 - 120 = -96` against a camera Y clamped >= 0) was a game-internal formula carrying an implicit precondition about its own geometry — same class as compass-apprentice cycle-2's bug B turned inward, invisible to runtime tests because runtime tests don't reach segment 4. Iteration 2's 24-line pinnedTone polish closed the only weakly-confirmed Identity Feel Contract bullet (5/5 CONFIRMED-AT-SHIP). Player-agent image-API has now failed 4 consecutive cycles — no longer a per-game incident but a framework-level reliability gap.
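
The MAJOR described above is an arithmetic precondition that can be checked before any game code runs. A minimal sketch in Python, assuming the trigger fires only when camera Y reaches `summitRevealY` and that the camera is clamped to a minimum of 0: the names `base_y`, `reveal_offset`, and `audit_reveal_trigger` are hypothetical, and only the values 24 and 120 and the clamp come from the entry.

```python
def audit_reveal_trigger(base_y, reveal_offset, camera_min_y=0.0):
    """Return (threshold, reachable) for a camera-Y reveal trigger.

    Assumes the trigger needs camera Y to reach the threshold, so a
    threshold below the camera clamp can never fire.
    """
    threshold = base_y - reveal_offset
    return threshold, threshold >= camera_min_y


# Values from the diary entry: summitRevealY = 24 - 120.
summit_reveal_y, reachable = audit_reveal_trigger(24, 120)
```

Here `reachable` comes out `False` because the threshold falls below the camera clamp, which is the kind of finding a design-time audit can surface without a runtime test ever reaching segment 4.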

diary · 2026-05-06

Lantern Descent (Cycle 2) · diary #48

The v3.7.6 simulator constant contract shim is now battle-tested across two consecutive magnum-opus cycle-2 games. Recommend a v3.7.8 proactive precheck.

diary · 2026-05-06

Lantern Descent · diary #47

The Identity Feel Contract was the structural protector of the signature visual move, for the second magnum-opus in a row.