We take a baked-text website mock image (from Steve's mock-engine) plus its YAML “twin”, the content contract listing every section → element → text, and rebuild it as real HTML. There are two outputs: a pixel-faithful overlay and a semantic / SEO version. The same pipeline drives eight wildly different aesthetics with zero per-site code.
An image model can dream up beautiful, arbitrary design that CSS templates can't. But an image is just pixels — no text, no structure, no SEO. v7 keeps the beauty and puts the meaning back: it locates every known string in the picture, erases the baked text to a clean “plate,” and re-renders the words as real HTML on top. Because we already know every string from the twin, we never guess what's text — we only find where it is.
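The "we only find where it is" idea can be sketched as a fuzzy match between the strings the twin already lists and whatever an OCR engine detects. This is a minimal illustration, not the v7 code: the role names, the sample strings, the `(text, box)` detection format and the `0.8` threshold are all assumptions, and the greedy one-box-per-string assignment is a deliberate simplification.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so OCR spacing quirks matter less."""
    return " ".join(text.lower().split())


def locate_known_strings(known, detections, min_ratio=0.8):
    """Assign each known twin string the best-matching detected box.

    known:      {role: text} taken from the twin (roles here are made up).
    detections: [(ocr_text, (x, y, w, h)), ...] from any OCR engine.
    Returns {role: box}; strings with no match >= min_ratio are omitted.
    """
    located, used = {}, set()
    for role, target in known.items():
        best_idx, best_score = None, 0.0
        for idx, (ocr_text, _box) in enumerate(detections):
            if idx in used:
                continue  # each detected box is claimed at most once
            score = SequenceMatcher(
                None, normalize(target), normalize(ocr_text)
            ).ratio()
            if score > best_score:
                best_idx, best_score = idx, score
        if best_idx is not None and best_score >= min_ratio:
            used.add(best_idx)
            located[role] = detections[best_idx][1]
    return located


# Hypothetical data: the OCR misreads "oo" as "00" but the match still holds.
known = {
    "hero.title": "Wood-fired pizza",
    "hero.cta": "Book a table",
    "footer.note": "Open daily until late",
}
detections = [
    ("W00d-fired pizza", (40, 60, 500, 80)),
    ("Book a table", (40, 200, 180, 40)),
    ("smudge", (0, 0, 10, 10)),
]
located = locate_known_strings(known, detections)
```

Because the target strings are known in advance, a noisy OCR read only has to be close enough to identify its box. Strings that stay unlocated (here `footer.note`) are the natural candidates for a stronger locator.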
Input. The YAML twin (every string, by role) plus the rendered mock image.
Localize. Find each known string's pixel box with EasyOCR, auto-escalating to Gemini vision for script fonts.
Erase. Photo-aware inpainting produces a “clean plate” with the text gone but the imagery intact.
Overlay. Real HTML text is positioned over the plate, pixel-faithful to the mock.
Semantic. The same visuals as crawlable HTML: h1/section/nav, lazy slices, img alt, JSON-LD.
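The auto-escalation in the localize step could look roughly like the sketch below: run the cheap locator first and send only the strings it missed to the more capable one. The function name, the callable interface and both stub locators are hypothetical. The source says escalation happens for script fonts but not what triggers it, so "escalate whatever the first pass left unmatched" is an assumption.

```python
def localize_with_escalation(known, primary, fallback):
    """Try `primary` on every string, then `fallback` on the leftovers.

    known:    {role: text} from the twin.
    primary:  callable({role: text}) -> {role: box}, e.g. an OCR wrapper.
    fallback: same signature, e.g. a vision-model wrapper.
    Returns (found, unresolved_roles, escalated_flag).
    """
    found = dict(primary(known))
    missing = {role: text for role, text in known.items() if role not in found}
    if missing:
        # Only the unmatched strings are sent on, which keeps paid calls small.
        found.update(fallback(missing))
    unresolved = sorted(role for role in known if role not in found)
    return found, unresolved, bool(missing)


def stub_ocr(strings):
    """Stand-in OCR that cannot read the (made-up) script-font role."""
    return {role: (0, 0, 10, 10) for role in strings if role != "hero.script"}


def stub_vision(strings):
    """Stand-in vision model that locates everything it is given."""
    return {role: (5, 5, 20, 20) for role in strings}


known = {"hero.title": "Redwood", "hero.script": "since forever"}
found, unresolved, escalated = localize_with_escalation(
    known, stub_ocr, stub_vision
)
```

Keeping both locators behind the same signature means the pipeline does not need to know which engine produced a box.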
Each of these was a real problem we hit and solved; together they're the logic.
Real HTML text absolutely positioned over the clean-plate picture. Reproduces any aesthetic the mock shows, exactly. Best when visual fidelity is the goal. Trade-off: it's a picture with text on top — weak for SEO and it scales rather than reflows.
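As a rough sketch of the overlay idea, each located box can be converted to percentages of the plate size so the text scales with the image. The item format, the inline styles and the use of the box height as the font size are assumptions for illustration. The `cqw` unit relies on `container-type: inline-size` being set on the wrapper.

```python
from html import escape


def overlay_html(plate_src, plate_w, plate_h, items):
    """Build a plate <img> with absolutely positioned text spans on top.

    items: [{"text": str, "box": (x, y, w, h), "color": str}, ...] where
    boxes are in plate pixels. Positions become percentages, and the font
    size is tied to the container width so the layout scales, not reflows.
    """
    spans = []
    for item in items:
        x, y, w, h = item["box"]
        style = (
            "position:absolute;"
            f"left:{x / plate_w * 100:.3f}%;"
            f"top:{y / plate_h * 100:.3f}%;"
            f"width:{w / plate_w * 100:.3f}%;"
            # Approximation: treat the box height as the font size.
            f"font-size:{h / plate_w * 100:.3f}cqw;"
            f"color:{item['color']};"
        )
        spans.append(f'<span style="{style}">{escape(item["text"])}</span>')
    return (
        '<div style="position:relative;container-type:inline-size;">'
        f'<img src="{escape(plate_src)}" alt="" style="display:block;width:100%;">'
        + "".join(spans)
        + "</div>"
    )


html_out = overlay_html(
    "plate.png", 1200, 900,
    [{"text": "Pizza & Wine", "box": (120, 90, 600, 60), "color": "#fff"}],
)
```

This also makes the trade-off concrete: every coordinate is a fraction of one image, so the page can only grow or shrink as a whole.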
Same visuals, emitted as <header>/<main>/<section>, one <h1> then <h2>/<h3>, <nav>; per-section background slices that lazy-load (better LCP); real photos as <img alt>; CTAs as accessible links; plus JSON-LD, title & meta. The “beautiful and crawlable” hybrid. Reflow (mobile) is the one deferred piece; a separate phone mock will drive it.
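A minimal sketch of what the semantic emitter might produce is shown below. The section dictionary shape, the file names and the business fields are hypothetical. Rendering each slice as an `<img loading="lazy">` is an assumption made here because that attribute is a simple way to defer loading; the source only says the slices lazy-load.

```python
import json
from html import escape


def semantic_page(title, description, sections, business):
    """Emit a crawlable page: one <h1>, <h2> after it, nav, JSON-LD."""
    json_ld = json.dumps({"@context": "https://schema.org", **business})
    json_ld = json_ld.replace("</", "<\\/")  # data must not close the <script>
    nav = "".join(
        f'<a href="#{escape(s["id"])}">{escape(s["heading"])}</a>'
        for s in sections
    )
    body = []
    for index, s in enumerate(sections):
        tag = "h1" if index == 0 else "h2"          # exactly one <h1>
        loading = "eager" if index == 0 else "lazy"  # first slice drives LCP
        body.append(
            f'<section id="{escape(s["id"])}">'
            f'<img src="{escape(s["slice"])}" alt="" loading="{loading}">'
            f'<{tag}>{escape(s["heading"])}</{tag}>'
            f'<p>{escape(s["body"])}</p>'
            "</section>"
        )
    return (
        '<!doctype html><html lang="en"><head>'
        f"<title>{escape(title)}</title>"
        f'<meta name="description" content="{escape(description)}">'
        f'<script type="application/ld+json">{json_ld}</script>'
        "</head><body>"
        f"<header><nav>{nav}</nav></header>"
        f"<main>{''.join(body)}</main>"
        "</body></html>"
    )


page = semantic_page(
    "Redwood Pizzeria",
    "Wood-fired pizza.",
    [
        {"id": "hero", "heading": "Redwood", "body": "Hello.", "slice": "s0.png"},
        {"id": "menu", "heading": "Menu", "body": "Pies.", "slice": "s1.png"},
    ],
    {"@type": "Restaurant", "name": "Redwood Pizzeria"},
)
```

The heading level is derived from section order rather than from visual size, so the outline stays valid whatever the mock looks like.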
Fidelity gate score (overlay vs mock). The judge is an LLM with about ±1–2 points of noise, so treat each score as a band rather than an exact value.
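One way to read a noisy score as a band is sketched below. The 0 to 10 scale, the ship threshold of 8 and the three verdict labels are assumptions for illustration only; the source gives scores and outcomes but not the gate rule itself.

```python
def score_band(score, noise=1.0, lowest=0.0, highest=10.0):
    """Return the (low, high) range a noisy score could plausibly mean."""
    return max(lowest, score - noise), min(highest, score + noise)


def verdict(score, ship_at=8.0, noise=1.0):
    """Classify a score by where its whole band sits relative to ship_at.

    ship_at is a hypothetical threshold, not a documented one.
    """
    low, high = score_band(score, noise)
    if low >= ship_at:
        return "ship"        # even the pessimistic reading passes
    if high < ship_at:
        return "limit"       # even the optimistic reading fails
    return "borderline"      # the band straddles the threshold: re-judge


examples = {s: verdict(s) for s in (9.8, 9, 8.5, 5)}
```

Under these assumed values a score only counts as a pass or a fail when the entire band agrees, and anything straddling the threshold is flagged for a second look instead of being decided by one noisy reading.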
| Site | Business | Aesthetic | Score | Status |
|---|---|---|---|---|
| saas_botanical | B2B SaaS | botanical | 9.8 | ship |
| redwood_diner | pizzeria | retro diner | 9 | ship |
| redwood_storybook | pizzeria | storybook | 9 | ship |
| redwood_tarot | pizzeria | engraved tarot | 9 | ship |
| redwood_vaporwave | pizzeria | neon vaporwave | 9 | ship |
| saas_deco | B2B SaaS | Art-Deco | 9 | ship |
| saas_memphis | B2B SaaS | Memphis | 9 | ship |
| redwood_pop | pizzeria | pop-art | 5 | limit |
Each site links to its overview hub (all pipeline stages), the pixel-faithful overlay, and the semantic SEO build.
One source of truth. lib.py holds paths, the Gemini key + call, the twin schema, the role taxonomy, geometry/colour and the photo mask, with no duplication and no cross-run imports.
One-command pipeline. run.py <site> runs localize → erase → overlay → semantic → hub. run.py <fixture>:<world> goes from nothing → finished site (build + render via mock-engine, then the pipeline).
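The two invocation forms can be illustrated with a small planner that turns the command-line target into an ordered list of steps. This is a sketch of the dispatch logic only: the `plan` function, its return shape and the `build`/`render` step names are hypothetical, and the real run.py may be organized differently.

```python
# Stage order as described for run.py <site>.
STAGES = ["localize", "erase", "overlay", "semantic", "hub"]


def plan(target):
    """Map a run.py-style target to the steps that would run, in order.

    "<site>"            -> the five pipeline stages on an existing mock.
    "<fixture>:<world>" -> build and render a mock first, then the stages.
    """
    if ":" in target:
        fixture, world = target.split(":", 1)
        if not fixture or not world:
            raise ValueError(f"expected <fixture>:<world>, got {target!r}")
        return {
            "fixture": fixture,
            "world": world,
            "steps": ["build", "render", *STAGES],
        }
    return {"site": target, "steps": list(STAGES)}


existing = plan("saas_deco")
from_scratch = plan("redwood:diner")
```

Treating the colon as the only switch keeps the command surface to a single positional argument.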
Tests pin the fixes. test_lib.py (unit) + test_integration.py (erase→overlay→semantic on a synthetic fixture) run offline via run_tests.sh.
Cost is logged. Every Gemini call's token usage is written to spend.log.
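A spend log of this kind can be as simple as one JSON record appended per call. The sketch below assumes a JSON-lines layout and the field names `stage`, `prompt_tokens` and `output_tokens`; the actual format of spend.log is not described in the source, and the token counts are made-up sample values.

```python
import json
import tempfile
import time
from pathlib import Path


def log_spend(log_path, stage, prompt_tokens, output_tokens):
    """Append one usage record (assumed JSON-lines format) and return it."""
    record = {
        "ts": round(time.time(), 3),
        "stage": stage,
        "prompt_tokens": prompt_tokens,
        "output_tokens": output_tokens,
    }
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record


def total_tokens(log_path):
    """Sum prompt and output tokens across every record in the log."""
    total = 0
    for line in Path(log_path).read_text(encoding="utf-8").splitlines():
        if line.strip():
            entry = json.loads(line)
            total += entry["prompt_tokens"] + entry["output_tokens"]
    return total


# Demo in a throwaway directory so nothing is written to the project.
with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "spend.log"
    log_spend(log_file, "localize", 1200, 90)
    log_spend(log_file, "localize", 200, 40)
    demo_total = total_tokens(log_file)
```

Append-only records make the log safe to write from any stage and easy to total afterwards.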
Reviewed. A high-recall code review found 10 issues; all fixed and pinned by tests.
Known limit. Pop-art (heavy outlined display type on busy halftone backgrounds) leaves residual smudges — the one non-shipping aesthetic, documented rather than hidden.