Add page-agent training, build-time verification, and regression gating

Resolves the in-progress merge (extraction.ts, orchestrator.ts, sessions.ts)
by keeping the single-pass extraction architecture and re-scoping the
feedback / verification / regression feature onto it.
- agents/page.md (new): the single-pass page prompt is now a first-class,
loadable agent (inline default fallback), so it can be verified, trained,
and proposed as a contribution PR like the specialist agents.
- Build-time source-fidelity verification: extraction verifies each page's
HTML against its source image via the Feedback Agent and self-corrects once
on failure. Non-blocking — a run never fails if the Feedback Agent is
unavailable or unsure.
- Iterative feedback: a feedback re-run refines the previously reviewed body
through the Reader/Editor loop (with the feedback injected) instead of
regenerating from the source images, so the output converges across rounds.
- Feedback-driven training: corrections from a feedback run become a proposed
page.md improvement — an update PR for the library agent (gated by its
regression fixtures) or in-place training for a session-built agent.
- Regression gating: accepted output is captured as per-agent fixtures
(triggering image + accepted HTML) on close; any later agent update is
re-run against those fixtures and blocked if it regresses.
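
A minimal sketch of the regression gate described in the last bullet, for
reviewers who want the intended contract at a glance. The names here
(`Fixture`, `RunAgent`, `gateAgentUpdate`, `normalize`) are hypothetical and
are not taken from src/pipeline/regression.ts. The whitespace-insensitive
comparison and the synchronous runner are simplifying assumptions; a real
agent call would be async and may compare output differently.

```typescript
// Hypothetical fixture shape: the triggering image plus the accepted HTML.
interface Fixture {
  agent: string;        // agent the fixture belongs to, e.g. "page"
  imagePath: string;    // triggering image
  acceptedHtml: string; // output that was accepted when the fixture was captured
}

// Synchronous stand-in for running a candidate agent on one image (assumption).
type RunAgent = (imagePath: string) => string;

interface GateResult {
  blocked: boolean;      // true if any fixture regressed
  regressions: string[]; // imagePath of each fixture whose output changed
}

// Assumed comparison rule: collapse whitespace runs, then require equality.
const normalize = (html: string): string => html.replace(/\s+/g, " ").trim();

// Re-run the candidate against every fixture and block on any mismatch.
function gateAgentUpdate(fixtures: Fixture[], runCandidate: RunAgent): GateResult {
  const regressions: string[] = [];
  for (const fixture of fixtures) {
    const candidateHtml = runCandidate(fixture.imagePath);
    if (normalize(candidateHtml) !== normalize(fixture.acceptedHtml)) {
      regressions.push(fixture.imagePath);
    }
  }
  return { blocked: regressions.length > 0, regressions };
}
```

Under these assumptions an update is blocked as soon as one fixture differs,
and `regressions` names the triggering images to inspect.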
Files: agents/page.md, agents/feedback.md, src/pipeline/{extraction,feedback,
orchestrator,regression}.ts, src/store/paths.ts, src/routes/sessions.ts.
All four conflicted files pass `node --experimental-strip-types --check`;
no conflict markers remain.