Validation Contract
<!-- dual-compat-start -->
Use When
- Authoring a new specialist skill and deciding which validation evidence it should produce.
- Normalising an older specialist skill against the current house style.
- Preparing a feature or release for ship and assembling the Release Evidence Bundle.
- Reviewing a PR that claims a feature is production-ready.
Do Not Use When
- The work is purely local, experimental, or explicitly throwaway with no path to production.
- The skill being authored is a baseline, process, or pure index skill. Those do not declare evidence.
- The task is unrelated to validation planning or shipping readiness.
Required Inputs
- The specialist skill or feature being validated, including its intended scope and risk tier.
- Access to the repository's existing validation skills (`advanced-testing-strategy`, etc.) as the source of category-specific "how to validate" content.
- Awareness of the 14 canonical artifact templates in `skill-composition-standards/references/` so evidence rows can cite existing formats.
Workflow
- Identify whether the skill or feature in scope is specialist (declares evidence) or baseline/process (does not).
- For specialist skills: map each artifact the skill produces to one of the seven evidence categories.
- For releases: produce a Release Evidence Bundle that links concrete artifacts under each of the seven categories, using N/A only where an entire category legitimately does not apply.
- Cross-check risk tier guidance before permitting any N/A in high-risk releases.
Quality Standards
- Every specialist skill that produces validation evidence declares at least one evidence category with a concrete artifact reference.
- Evidence declarations cite existing artifact templates where possible instead of inventing new formats.
- Release Evidence Bundles never carry empty cells. Every cell links evidence or carries an N/A line with a reason.
Anti-Patterns
- Declaring every category on every skill "just in case". This kills the signal.
- Writing prose validation notes instead of linking concrete artifacts in a Release Evidence Bundle.
- Permitting unjustified N/A on Correctness, Security, Data safety, Operability, or Release evidence in high-risk releases.
- Treating the Release Evidence Bundle as a retrospective summary written after ship. It is produced before ship.
Outputs
- A specialist skill with a validated Evidence Produced section declaring one or more of the seven categories.
- A Release Evidence Bundle in the project's tree linking evidence for every applicable category at ship time.
- Clear N/A annotations for non-applicable categories, with risk-tier-aware justification.
References
- references/evidence-categories.md: per-category definition, indicative contributing skills, and common artifact shapes.
- references/declaration-form.md: the table form, rules, and worked examples.
- references/release-evidence-bundle-template.md: the canonical fillable Release Evidence Bundle.
- references/integration-rollout.md: audit trail of edits made to other skills during this skill's rollout.
<!-- dual-compat-end -->
The three repository-wide contracts
The repository is held together by three contracts, each codified in a baseline skill:
- House-style contract: every skill follows the same shape. Source: `skill-composition-standards`, Standard 1.
- Inputs/Outputs contract: every skill declares the artifacts it consumes and produces. Source: `skill-composition-standards`, Standard 2.
- Evidence contract: every specialist skill declares which of seven fixed validation categories its artifacts contribute to, and every release produces a Release Evidence Bundle. Source: this skill.
The three contracts stack. A skill that meets Standard 1 but skips Standard 2 or the evidence contract is not repository-grade.
The seven evidence categories
| # | Category | What the evidence proves |
|---|---|---|
| 1 | Correctness | Behaviour matches spec; tests cover risk surface; contracts hold. |
| 2 | Security | Threat model exists; scans clean; secrets handled; auth/authorisation verified. |
| 3 | Data safety | Schema integrity; migration safety; backup, retention, and PII handling. |
| 4 | Performance | Budgets met; load profile understood; query plans acceptable. |
| 5 | Operability | SLOs defined; runbook exists; observability wired; rollback plan ready. |
| 6 | UX quality | Accessibility pass; design audit; content/UX-writing review; AI slop check. |
| 7 | Release evidence | Change record; migration plan; rollout/rollback log; post-deploy verification. |
The taxonomy is closed. Adding an eighth category requires editing this skill, not silently extending it elsewhere. Full definitions and indicative contributing skills live in references/evidence-categories.md.
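One way to make the closed taxonomy concrete in tooling is an enumeration with exact-match lookup. The sketch below is illustrative only: the names `EvidenceCategory` and `parse_category` are hypothetical and not part of this skill.

```python
from enum import Enum


class EvidenceCategory(Enum):
    """The seven canonical evidence categories, in table order."""

    CORRECTNESS = "Correctness"
    SECURITY = "Security"
    DATA_SAFETY = "Data safety"
    PERFORMANCE = "Performance"
    OPERABILITY = "Operability"
    UX_QUALITY = "UX quality"
    RELEASE_EVIDENCE = "Release evidence"


def parse_category(name: str) -> EvidenceCategory:
    """Return the matching category or raise ValueError.

    Lookup is by exact value, so "security" or "Data Safety" is rejected.
    """
    try:
        return EvidenceCategory(name)
    except ValueError:
        allowed = ", ".join(c.value for c in EvidenceCategory)
        raise ValueError(
            f"unknown evidence category {name!r}; expected one of: {allowed}"
        ) from None
```

Because the enumeration lives in one place, adding an eighth member is a visible edit rather than a silent extension.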
Declaration mechanic
Specialist skills add an Evidence Produced section to their skill file, in this form:
```markdown
## Evidence Produced

| Category | Artifact | Format | Example |
|----------|----------|--------|---------|
| Security | Threat model | Markdown doc per `skill-composition-standards/references/threat-model-template.md` | `docs/security/threat-model-checkout.md` |
| Operability | Runbook | Markdown doc per `skill-composition-standards/references/runbook-template.md` | `docs/runbooks/payment-failures.md` |
```
Rules
- A specialist skill that produces validation evidence MUST declare at least one row.
- Each row's Category value MUST be one of the seven canonical names (case-sensitive).
- Each row's Format field MUST reference an existing template or define its own format inline in the same skill file.
- A specialist skill MAY contribute to multiple categories.
- Baseline skills, process skills, and pure index/orchestration skills MUST NOT declare.
Worked examples live in references/declaration-form.md.
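The rules above can be checked mechanically. The following is a minimal sketch, not the planned CI hook: it assumes the category is the first column and the format the third, treats every row after the separator line as data, and does not handle pipes inside cell content. The function names are hypothetical.

```python
CANONICAL_CATEGORIES = {
    "Correctness", "Security", "Data safety", "Performance",
    "Operability", "UX quality", "Release evidence",
}


def data_rows(markdown: str) -> list[list[str]]:
    """Return the cells of each table row that follows the separator row."""
    rows = []
    seen_separator = False
    for raw in markdown.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue
        inner = line[1:-1] if line.endswith("|") else line[1:]
        cells = [cell.strip() for cell in inner.split("|")]
        # A separator row consists only of dashes and alignment colons.
        if all(cell and set(cell) <= set("-:") for cell in cells):
            seen_separator = True
            continue
        if seen_separator:
            rows.append(cells)
    return rows


def check_declaration(markdown: str) -> list[str]:
    """Return a list of problems; an empty list means the table passes."""
    problems = []
    rows = data_rows(markdown)
    if not rows:
        problems.append("no evidence rows declared")
    for number, cells in enumerate(rows, start=1):
        category = cells[0] if cells else ""
        # Exact membership test, so the match is case-sensitive.
        if category not in CANONICAL_CATEGORIES:
            problems.append(f"row {number}: invalid category {category!r}")
        if len(cells) < 3 or not cells[2]:
            problems.append(f"row {number}: missing format")
    return problems
```

Returning a list of problems rather than raising keeps the check usable in warn-only mode.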
Specialist vs exempt skills
A skill is "specialist" for the purposes of this contract when it:
- Produces concrete project artifacts (code patterns, schemas, configs, documents).
- Is loaded for a specific domain or platform problem rather than as a baseline frame.
Skills exempt from declaring (non-exhaustive):
- `skill-composition-standards`, and this skill itself.
- `system-architecture-design`, `engineering-management-system`, `git-collaboration-workflow`.
- , .
- All skills.
When a skill straddles the line, the default is declare. False positives are cheaper than silent omissions.
The Release Evidence Bundle
When a feature or release is ready to ship, the reviewer produces a single fillable document — the Release Evidence Bundle — that links to the concrete artifacts satisfying each of the seven categories.
- Template: references/release-evidence-bundle-template.md.
- N/A semantics: N/A is permitted only with a reason on the same line. An empty cell is not acceptable.
- Risk tier guidance:
  - Low risk: internal tools, docs, non-user-facing scripts. A typical bundle has 3-4 categories live.
  - Medium risk: user-facing feature, single-tenant. All 7 categories addressed; some may be N/A with a reason.
  - High risk: multi-tenant data, payments, auth, external APIs, AI features. All 7 categories live; no N/A permitted on Correctness, Security, Data safety, Operability, or Release evidence.
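The bundle rules can be expressed as a small check. This is a sketch under stated assumptions: the bundle is modelled as a dictionary from category name to cell text, a not-applicable cell is assumed to start with the marker `N/A` followed by its reason, and the tier names `"low"`, `"medium"`, and `"high"` are hypothetical labels for the three tiers above.

```python
CATEGORIES = [
    "Correctness", "Security", "Data safety", "Performance",
    "Operability", "UX quality", "Release evidence",
]

# Categories that may not be marked not-applicable in a high-risk release.
HIGH_RISK_REQUIRED = {
    "Correctness", "Security", "Data safety", "Operability", "Release evidence",
}

NA_MARKER = "N/A"  # assumed spelling of the not-applicable marker


def check_bundle(cells: dict[str, str], risk_tier: str) -> list[str]:
    """Return a list of problems; an empty list means the bundle passes."""
    problems = []
    for category in CATEGORIES:
        cell = cells.get(category, "").strip()
        if not cell:
            problems.append(f"{category}: empty cell")
            continue
        if cell.startswith(NA_MARKER):
            # Whatever follows the marker (minus punctuation) is the reason.
            reason = cell[len(NA_MARKER):].strip(" :-\u2013\u2014")
            if not reason:
                problems.append(f"{category}: N/A without a reason")
            if risk_tier == "high" and category in HIGH_RISK_REQUIRED:
                problems.append(
                    f"{category}: N/A not permitted in a high-risk release"
                )
    return problems
```

For example, `check_bundle({"Security": "N/A"}, "medium")` reports the missing reason for Security and an empty cell for each of the other six categories.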
Strictness
- The contract uses MUST, MAY, MUST NOT in the RFC 2119 sense.
- Mechanical enforcement is out of scope for this skill. It lands separately as a CI contract-gate hook that will:
  - parse Evidence Produced tables and warn on missing or invalid categories.
  - parse Release Evidence Bundles and warn on empty cells or unjustified N/A.
- Authoring with binding language now means the later CI hook needs only a parser and CI integration; the policy itself is already settled.
Integration with existing skills
The rollout in references/integration-rollout.md lists every edit made to other skills when this skill was introduced. Future edits that touch this contract should update that file.
Companion Skills
- `skill-composition-standards`: Standards 1 and 2. Load this before this skill.
- — repository production-readiness bar. This contract makes the evidence of meeting that bar a first-class artifact.
- Category-specific skills — the source of truth for how to validate within each category (see references/evidence-categories.md).