Core Alpha Signoff Inventory

Date: 2026-03-12
Scope: PX-QA-003, PX-QA-023, /leaderboard truthfulness pass, shared alpha browser signoff

This inventory is the canonical coverage list for the current alpha product surface. It defines what must be verified before the final manual $playwright-interactive signoff is considered complete.

Runtime Assumptions

  • The local stack is running and reachable from the browser.
  • The backend and worker can be exercised against both the Docker local OpenAI-compatible test gateway and a provider-backed gateway.
  • For provider-backed signoff, the default gateway is http://host.docker.internal:1234.
  • The baseline model is qwen3.5-0.8b.
  • If a different verified endpoint is used, record that override in the final report.
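
One way to make the override rule checkable is to record the provider-backed target as a small value object. The following is a minimal sketch, not the project's actual configuration code: the ProviderTarget class, its method names, and the report wording are hypothetical, while the default gateway URL and baseline model come from the assumptions above.

```python
from dataclasses import dataclass

# Defaults taken from the runtime assumptions in this inventory.
DEFAULT_GATEWAY = "http://host.docker.internal:1234"
BASELINE_MODEL = "qwen3.5-0.8b"


@dataclass(frozen=True)
class ProviderTarget:
    """Hypothetical record of the endpoint/model used for provider-backed signoff."""

    gateway: str = DEFAULT_GATEWAY
    model: str = BASELINE_MODEL

    def is_override(self) -> bool:
        # Any deviation from the default gateway or baseline model counts as an override.
        return self.gateway != DEFAULT_GATEWAY or self.model != BASELINE_MODEL

    def report_line(self) -> str:
        # Text suitable for the final report; the wording is illustrative only.
        kind = "override" if self.is_override() else "default"
        return f"provider target ({kind}): {self.gateway} / {self.model}"
```

Calling ProviderTarget().report_line() describes the default target, and constructing the object with a different gateway or model marks the line as an override so that it cannot be silently omitted from the final report.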

User-Visible Claims To Sign Off

  • Keyboard users can reach both main content and navigation through skip links without focus loss.
  • The mobile navigation dialog can be opened and closed with keyboard input, including Escape.
  • Visible images in the core alpha surfaces expose alt text rather than rendering as unlabeled media.
  • The main shell and active runs board fit desktop and mobile viewports without requiring horizontal scrolling.
  • Provider smoke test feedback is readable enough to trust before launching provider-backed runs.
  • Settings accurately describes the current alpha scope instead of implying hidden configuration or future controls already exist.
  • /leaderboard is a truthful roadmap / scope page and does not masquerade as a shipped Arena leaderboard.
  • Arena remains a design-track capability; the real operator paths today are Runs and Comparison.

Coverage Matrix

| Claim / Control | Functional check | Visual check | Expected evidence |
|---|---|---|---|
| Shell skip links | Tab to Skip to main content, then Skip to navigation; confirm focus lands on #main-content / #navigation | Header and shell remain readable while focus rings are visible | Desktop shell screenshot + manual note |
| Mobile navigation dialog | Open mobile nav, dismiss with Escape, reopen, navigate to Runs | Dialog chrome and exit path remain legible on narrow viewports | Mobile navigation screenshot |
| Providers smoke result | Create or reuse a provider, run the smoke test, confirm the result text is visible and actionable | Result card, status text, and usage summary do not clip | Desktop providers screenshot |
| Runs active board fit | Launch a real run, inspect the active board, current item, inspector, and terminal convergence | No horizontal overflow, clipping, or collapsed controls on desktop/mobile | Desktop + mobile runs screenshots |
| Visible image labeling | Inspect a run or dataset view with image assets | Thumbnails render with alt text and no broken-image placeholders | Automated Playwright evidence + optional screenshot |
| Settings truthfulness | Verify locale, tutorial replay, runtime posture, telemetry status, and scope cards | Copy reads as present-tense alpha truth, not future placeholder text | Desktop settings screenshot |
| Leaderboard scope page | Visit /leaderboard and confirm it points back to Runs and Comparison instead of showing fake ranking content | Roadmap copy is readable and clearly marked as non-shipped scope | Desktop and mobile leaderboard screenshots |
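
The matrix can also be kept as data so that evidence gaps are detected mechanically instead of by eye. This is a minimal sketch under stated assumptions: the CoverageRow shape, the field names, and the rows_missing_evidence helper are hypothetical and are not part of the existing tooling. Only the claim names are taken from the Claim / Control column above.

```python
from dataclasses import dataclass, field


@dataclass
class CoverageRow:
    """One matrix row plus the evidence collected for it (hypothetical shape)."""

    claim: str
    screenshots: list[str] = field(default_factory=list)
    automated_checks: list[str] = field(default_factory=list)

    def has_evidence(self) -> bool:
        # A row is covered by screenshot evidence or by an automated check reference.
        return bool(self.screenshots or self.automated_checks)


# Claim names mirror the "Claim / Control" column of the coverage matrix.
MATRIX_CLAIMS = [
    "Shell skip links",
    "Mobile navigation dialog",
    "Providers smoke result",
    "Runs active board fit",
    "Visible image labeling",
    "Settings truthfulness",
    "Leaderboard scope page",
]


def rows_missing_evidence(rows: list[CoverageRow]) -> list[str]:
    """Return matrix claims that have no row at all or a row without evidence."""
    covered = {row.claim for row in rows if row.has_evidence()}
    return [claim for claim in MATRIX_CLAIMS if claim not in covered]
```

An empty result from rows_missing_evidence corresponds to the first exit criterion: every row has either screenshot evidence or an explicit automated check reference.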

Report Contract

The automated/manual evidence bundle should write to .artifacts/manual-qa/ and include:

  • active-runs-ui-audit-report.json
  • screenshots for each major surface listed above
  • the runtime endpoint/model used for provider-backed verification
  • negative confirmations for:
    • no clipping
    • no horizontal overflow
    • no focus jumps
    • no broken Escape dismissal
    • no missing alt text on visible images
    • no copy that claims Arena / leaderboard is already shipped
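
A bundle check along these lines could run before manual review. It is a sketch only. The inventory fixes the report file name and the list of negative confirmations, but not the JSON layout, so the provider_target and negative_confirmations keys, the confirmation key names, and the assumption that screenshots are .png files in the bundle directory are all hypothetical.

```python
import json
from pathlib import Path

REPORT_NAME = "active-runs-ui-audit-report.json"

# Hypothetical keys that paraphrase the negative confirmations listed above.
REQUIRED_CONFIRMATIONS = [
    "no_clipping",
    "no_horizontal_overflow",
    "no_focus_jumps",
    "no_broken_escape_dismissal",
    "no_missing_alt_text",
    "no_shipped_arena_claims",
]


def bundle_problems(bundle_dir: Path) -> list[str]:
    """Return human-readable problems found in an evidence bundle directory."""
    report_path = bundle_dir / REPORT_NAME
    if not report_path.is_file():
        return [f"missing {REPORT_NAME}"]

    report = json.loads(report_path.read_text(encoding="utf-8"))
    problems = []

    # Assumed layout: {"provider_target": {...}, "negative_confirmations": {key: bool}}
    if not report.get("provider_target"):
        problems.append("runtime endpoint/model not recorded")

    confirmations = report.get("negative_confirmations", {})
    for key in REQUIRED_CONFIRMATIONS:
        # Only an explicit True counts; a missing or false entry is a problem.
        if confirmations.get(key) is not True:
            problems.append(f"negative confirmation not recorded: {key}")

    # Assumption: screenshots are stored as .png files directly in the bundle directory.
    if not any(bundle_dir.glob("*.png")):
        problems.append("no screenshots found")

    return problems
```

Running bundle_problems(Path(".artifacts/manual-qa")) returns an empty list only when the report exists, records the runtime target, confirms every negative check explicitly, and sits next to at least one screenshot.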

Exit Criteria

  • Every row in the coverage matrix has either screenshot evidence or an explicit automated check reference.
  • Manual signoff confirms desktop and mobile variants for shell, providers, runs, settings, and leaderboard scope.
  • Any exclusions are recorded with a concrete reason. “Not tested” is not acceptable for the core shell and truthful-scope checks.
  • The final signoff note explicitly states whether provider-backed verification used the default gateway (http://host.docker.internal:1234) and baseline model (qwen3.5-0.8b) or an overridden verified target.
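
The exclusion rule in the third criterion can be sketched as a small validator. Two assumptions are baked in and should be adjusted to the team's reading of the rule. First, the core shell and truthful-scope checks are treated as checks that cannot be excluded at all. Second, the set of reasons considered too vague is an illustrative guess. The function name and the claim-to-reason mapping are hypothetical.

```python
# Reasons treated as non-concrete; this set is an illustrative assumption.
VAGUE_REASONS = {"", "not tested", "n/a", "skipped"}


def exclusion_problems(exclusions: dict[str, str], core_checks: set[str]) -> list[str]:
    """Return problems with recorded exclusions, given a claim -> reason mapping."""
    problems = []
    for claim, reason in exclusions.items():
        if claim in core_checks:
            # Assumed reading: core shell and truthful-scope checks must always be tested.
            problems.append(f"{claim}: core check cannot be excluded")
        elif reason.strip().lower() in VAGUE_REASONS:
            problems.append(f"{claim}: exclusion needs a concrete reason")
    return problems
```

For example, passing {"Providers smoke result": "Not tested"} yields one problem asking for a concrete reason, while the same mapping with a reason such as "Provider gateway offline during the session" passes. Which matrix rows belong in core_checks is left to the caller, because the inventory does not enumerate them.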