~/projects/lynx — claude --plugin lynx

Lynx · v1.1.1 · MIT · Real-browser audit

Sharp-eyed visual audits.
"It compiled." isn’t a verdict.

Lynx audits live web UIs through real browser sessions. Two skills capture evidence; one agent writes the verdict. Every PASS/FAIL cites a real screenshot, a real accessibility tree, a real console log — captured through the agent-browser MCP. No fixtures. No DOM stubs. No mocks.


14/14 detection · 2 skills · 4 commands · 5 hooks · 1 agent

01 What it does

Three properties. Zero compromises.

Lynx is not a unit-test runner, not a screenshot diff tool, not a CI script. It is a plugin that audits the rendered system. Two skills do the seeing. One agent writes the verdict. The producer of evidence is never the writer of the verdict.

Real browsers only

Every verdict cites real screenshots, accessibility trees, and console logs captured through the agent-browser MCP. No fixtures. No DOM stubs. No mocks.

Entanglement detection

Catches multi-cycle defects, where fixing screen A unmasks a regression on screen B. Both WAM-class (Whac-A-Mole, cycle-2 reach) and synth-2-class (cycle-1 reach) defects are detected.
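One way to picture entanglement detection is as a diff of findings between two audit cycles. The sketch below is illustrative only and is not Lynx's implementation: the `Findings` shape, the `entangled_regressions` helper, and the finding identifiers are all hypothetical names chosen for this example.

```python
from typing import Dict, Set

# Hypothetical shape: screen name -> identifiers of findings on that screen.
Findings = Dict[str, Set[str]]


def entangled_regressions(before: Findings, after: Findings, fixed_screen: str) -> Findings:
    """Return findings that appear only after a fix, on screens other than the fixed one."""
    regressions: Findings = {}
    for screen, current in after.items():
        if screen == fixed_screen:
            continue  # changes on the fixed screen are the fix itself, not entanglement
        new_findings = current - before.get(screen, set())
        if new_findings:
            regressions[screen] = new_findings
    return regressions


# Cycle 1: "home" fails. Cycle 2: the fix to "home" lands, but "checkout" now fails.
cycle_1 = {"home": {"cta-contrast"}, "checkout": set()}
cycle_2 = {"home": set(), "checkout": {"modal-hidden"}}
print(entangled_regressions(cycle_1, cycle_2, fixed_screen="home"))
# {'checkout': {'modal-hidden'}}
```

The point of the sketch is that a single-screen re-check would miss this defect; only re-auditing the other screens after the fix surfaces it.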

Frozen skills, honest cap

Two skills locked at 14/14 detection — modifying them requires re-running both shakedowns. Cycle cap fires honestly: UNFIXABLE beats a manufactured green run.

02 Quick start

Install. Audit. Read the verdict.

Lynx ships as a Claude Code plugin. Add the marketplace, install the plugin, then run /lynx:audit against any rendered URL or local dev server. Two examples below — one full-sweep, one single-screen — drawn from real session logs.

bash · example 1 · install & full audit sweep

# 1. Add the marketplace and install the plugin.
$ claude plugin marketplace add krzemienski/lynx
$ claude plugin install lynx@lynx

# 2. Audit every screen reachable from a base URL.
$ /lynx:audit "http://localhost:3000"
# Phase 0 — preflight ............ scope = full sweep, agent-browser ready
# Phase 1 — audit (per screen) ... e2e-evidence/run-001/screen-{home,about,...}/
# Phase 2 — synthesis ............ verdict-writer reads all per-screen evidence
# Phase 3 — ship gate ............ run-verdict.md emitted
→ PASS — verdict cites e2e-evidence/run-001/screen-home/step-04-a11y.json:12-34
→ PASS — 14/14 in-scope screens · 0 entanglement defects · 0 cycle-cap hits

bash · example 2 · single-screen audit + fix loop

# Audit one screen with a known regression. Lynx fires the fix loop.
$ /lynx:audit-screen "http://localhost:3000/checkout"
# Phase 1 — ui-experience-audit (5-phase: triage → visual → interactive → content → UX)
⨯ FAIL — modal dismiss button hidden behind sticky footer at 375w (cite: screen-checkout/step-08-modal.png)
⨯ FAIL — APCA Lc 42 on primary CTA below threshold 60 (cite: screen-checkout/step-12-contrast.json)

# Fix loop fires automatically — max 3 attempts before UNFIXABLE.
$ /lynx:status
## Run 002 status
attempt 1 / 3 — modal-dismiss fix landed; APCA still failing
attempt 2 / 3 — primary CTA color promoted to #f0c14b; APCA Lc 64 ≥ 60
→ PASS — both findings cleared on attempt 2 (cite: screen-checkout/step-21-rerun.png)

# Render the run-verdict.md for review.
$ /lynx:report
→ verdict cites e2e-evidence/run-002/run-verdict.md:1-48
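The capped fix loop in example 2 can be sketched as a small control loop. This is a minimal sketch under stated assumptions, not the plugin's code: `run_fix_loop`, `audit`, and `apply_fix` are hypothetical names, and the only detail taken from the transcript is the cap of 3 attempts before UNFIXABLE.

```python
from typing import Callable, List

MAX_ATTEMPTS = 3  # cap quoted in the transcript: "max 3 attempts before UNFIXABLE"


def run_fix_loop(audit: Callable[[], List[str]], apply_fix: Callable[[List[str]], None]) -> str:
    """Re-audit after every fix; return PASS, or UNFIXABLE once the cap is reached."""
    findings = audit()
    attempts = 0
    while findings and attempts < MAX_ATTEMPTS:
        apply_fix(findings)
        attempts += 1
        findings = audit()  # fresh evidence each time, never a cached result
    return "PASS" if not findings else "UNFIXABLE"


# Simulated run matching example 2: two findings, then one, then none.
results = iter([["modal-dismiss", "apca"], ["apca"], []])
print(run_fix_loop(lambda: next(results), lambda findings: None))  # PASS
```

Returning UNFIXABLE instead of looping further is what keeps the cap honest: the loop has no path to a green result other than a clean re-audit.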

03 Visual explainer

The audit pipeline. Four phases, two gates.

Every /lynx:audit run flows through the same four phases. Synthesis is the verdict gate (refuses without cited evidence). Ship Gate is the deploy gate (refuses on FAIL or UNFIXABLE). The fix loop returns to Audit, never to Synthesis — regenerating evidence is the only way past the gate.

04 How it works

Four phases. Two gates.

The /lynx:audit command is the canonical entrypoint. Each phase has a single role, a single output artifact, and a single gate criterion. The producer of an artifact (the auditor skills) is structurally barred from writing its verdict — only verdict-writer can synthesize, and verdict-writer cannot capture evidence.

PH 00 · Preflight · Scope decision: full sweep or single screen

PH 01 · Audit · agent-browser MCP captures screenshots + a11y trees

PH 02 · Synthesis · verdict-writer agent reads all per-screen evidence

PH 03 · Ship Gate · PASS clears deploy; FAIL triggers fix loop (max 3/screen)
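The two gates can be read as two checks applied in order. The sketch below assumes a hypothetical `ScreenVerdict` record and hypothetical `synthesis_gate` and `ship_gate` functions; it illustrates the gate criteria described in this section, not the actual verdict-writer logic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScreenVerdict:
    screen: str
    status: str           # "PASS", "FAIL", or "UNFIXABLE"
    citations: List[str]  # evidence files the verdict points at


def synthesis_gate(verdicts: List[ScreenVerdict]) -> List[ScreenVerdict]:
    """Verdict gate: refuse any verdict that carries no cited evidence."""
    uncited = [v.screen for v in verdicts if not v.citations]
    if uncited:
        raise ValueError(f"verdicts without citations: {uncited}")
    return verdicts


def ship_gate(verdicts: List[ScreenVerdict]) -> bool:
    """Deploy gate: clear only if every cited verdict is PASS."""
    return all(v.status == "PASS" for v in synthesis_gate(verdicts))


run = [
    ScreenVerdict("home", "PASS", ["screen-home/step-04-a11y.json:12-34"]),
    ScreenVerdict("checkout", "UNFIXABLE", ["screen-checkout/step-12-contrast.json"]),
]
print(ship_gate(run))  # False
```

Because `ship_gate` calls `synthesis_gate` first, an uncited PASS raises an error rather than counting toward a green run.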

05 Feature matrix

What ships in v1.1.1.

Counts grounded in the on-disk plugin tree at /Users/nick/Desktop/lynx/ per ARCHITECTURE.md capability snapshot (lines 18–29). No counts are estimated; each row enumerates the actual files.

One marketplace. Two skills, one agent.

Lynx installs as a Claude Code plugin. The plugin ships with both skills, all 4 slash commands, all 5 hooks, and the verdict-writer sub-agent. The hooks fire automatically on Write/Edit/TaskUpdate to block test-file creation, mock patterns, and unaudited completion claims.
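To make the hook behavior concrete, here is a rough sketch of the kind of check a Write/Edit hook could run. Everything in it is an assumption for illustration: the `block_reason` helper, the file-name patterns, and the mock patterns are hypothetical and are not taken from Lynx's hook sources. The TaskUpdate check for unaudited completion claims is not modeled.

```python
import re
from typing import Optional

# Hypothetical patterns; the real hooks may match different names.
TEST_FILE = re.compile(r"(^|/)(test_[^/]+\.py|[^/]+\.(test|spec)\.[jt]sx?)$")
MOCK_PATTERN = re.compile(r"\b(jest\.mock|unittest\.mock|sinon\.stub)\b")


def block_reason(tool_name: str, file_path: str, content: str) -> Optional[str]:
    """Return why a tool call should be blocked, or None to allow it."""
    if tool_name not in {"Write", "Edit"}:
        return None  # only file-writing tools are inspected in this sketch
    if TEST_FILE.search(file_path):
        return f"test-file creation blocked: {file_path}"
    if MOCK_PATTERN.search(content):
        return "mock pattern blocked"
    return None


print(block_reason("Write", "src/app.test.tsx", ""))
# test-file creation blocked: src/app.test.tsx
print(block_reason("Edit", "src/checkout.tsx", "export default Checkout"))
# None
```

Keeping the decision in a pure function makes it easy to exercise without a running Claude Code session; how the result is reported back to the host is left out here.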

bash · install lynx into Claude Code

# Add marketplace + install plugin.
$ claude plugin marketplace add krzemienski/lynx
$ claude plugin install lynx@lynx

# Verify the install — hooks loaded, skills discoverable, agent registered.
$ claude plugin list
→ lynx@lynx · 2 skills · 4 commands · 5 hooks · 1 agent

# Audit anything reachable from your browser.
$ /lynx:audit "<your URL>"

07 Documentation

Read the source. Then audit.

Lynx has no SaaS, no telemetry, no closed source. Every skill, command, hook, and rule lives in the public repository. Below: the eight pages most worth your time on day one.

A verdict without a citation is invalid. A citation to a directory is invalid. The auditor is never the verdict-writer. This is not advisory.
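The citation rule lends itself to a mechanical check. The sketch below is one possible reading, with assumptions stated plainly: `is_valid_citation` and `verdict_is_valid` are hypothetical helpers, the optional `:start-end` line range mirrors the citations shown in the quick-start transcripts, and "is a directory" is approximated by "has no file extension" rather than by touching the filesystem.

```python
import re
from pathlib import PurePosixPath
from typing import List

LINE_RANGE = re.compile(r":(\d+)-(\d+)$")  # optional trailing ":12-34"


def is_valid_citation(citation: str) -> bool:
    """Accept a file path with an optional line range; reject empty or directory-like paths."""
    text = citation.strip()
    if not text:
        return False
    path = LINE_RANGE.sub("", text)
    if path.endswith("/"):
        return False
    # Assumption: evidence files always carry an extension (.png, .json, .md).
    return PurePosixPath(path).suffix != ""


def verdict_is_valid(citations: List[str]) -> bool:
    """A verdict needs at least one citation, and every citation must name a file."""
    return bool(citations) and all(is_valid_citation(c) for c in citations)


print(is_valid_citation("e2e-evidence/run-001/screen-home/step-04-a11y.json:12-34"))  # True
print(is_valid_citation("e2e-evidence/run-001/screen-home/"))  # False
print(verdict_is_valid([]))  # False
```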

08 Changelog

From two skills to a full enforcement bundle.

Lynx shipped v1.0.0 in late April 2026 with two production skills validated on Whac-A-Mole. v1.1.0 added the synth-2 shakedown (taking detection to 14/14), 5 enforcement hooks, 4 slash commands, the verdict-writer agent, and install/uninstall scripts. The skills are frozen at the 14/14 detection rate.

v1.1.1

2026-04-30

Manifest schema fix. Patch release: repairs a plugin-manifest validation failure that blocked installation under recent Claude Code releases. No skill or audit-pipeline behavior changes — detection stays 14/14.

v1.1.0

2026-04-29

Plugin enrichment. 5 hooks + 4 slash commands + verdict-writer sub-agent + install/uninstall scripts + 8 root docs + 3 mermaid diagrams. Detection accuracy carried forward at 14/14: WAM 9/9 + synth-2 5/5.

v1.0.0

2026-04-23

Initial release. 2 production skills (full-ui-experience-audit + ui-experience-audit) + brand assets + marketplace stub + 14/14 detection validated on WAM + synth-2 shakedowns.
