The BAZDMEG Method
Eight principles for AI-assisted development. Born from pain. Tested in production.
Quick Reference
| # | Principle | One-Liner | Deep Dive |
|---|---|---|---|
| 1 | Requirements Are The Product | The code is just the output | references/01-requirements.md |
| 2 | Discipline Before Automation | You cannot automate chaos | references/02-discipline.md |
| 3 | Context Is Architecture | What the model knows when you ask | references/03-context.md |
| 4 | Test The Lies | Unit tests, E2E tests, agent-based tests | references/04-testing.md |
| 5 | Orchestrate, Do Not Operate | Coordinate agents, not keystrokes | references/05-orchestration.md |
| 6 | Trust Is Earned In PRs | Not in promises, not in demos | references/06-trust.md |
| 7 | Own What You Ship | If you cannot explain it at 3am, do not ship it | references/07-ownership.md |
| 8 | Sources Have Rank | Canonical spec > audit > chat | references/08-sources-have-rank.md |
Effort Split
| Activity | Time | Why |
|---|---|---|
| Planning | 30% | Understanding the problem, planning interview, verifying understanding |
| Testing | 50% | Writing tests, running agent-based tests, verifying everything works |
| Quality | 20% | Edge cases, maintainability, polish |
| Coding | ~0% | AI writes the code; you make sure the code is right |
Workflow: Planning Interview
Run this interview BEFORE any code is written. The agent asks the developer these questions and does not proceed until all are answered.
- What problem are we solving? -- State the problem in your own words, not the ticket's words.
- What data already exists? -- What is the server-side source of truth? What APIs exist? What state is already managed?
- What is the user flow? -- Walk through every step the user takes, including edge cases and error states.
- What should NOT change? -- Identify existing behavior, contracts, or interfaces that must be preserved.
- What happens on failure? -- Network errors, invalid input, race conditions, missing data.
- How will we verify it works? -- Name the specific tests: unit, E2E, agent-based. What constitutes "done"?
- Can I explain this to a teammate? -- If you cannot explain the approach to someone else, stop and learn more.
Stopping rules:
- If any answer is "I don't know" -- stop and research before proceeding.
- If the developer defers to "the AI will figure it out" -- stop. The requirement IS the product.
- If no test plan exists -- stop. Untested code is unshippable code.
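The stopping rules can be sketched as an executable gate. This is illustrative only: the function name, the answer shape, and the trigger phrases are assumptions, not part of any real tool.

```typescript
// Sketch: encode the planning-interview stopping rules as a gate.
type Interview = {
  answers: Record<string, string>; // question -> developer's answer
  hasTestPlan: boolean;
};

function planningGate(i: Interview): { proceed: boolean; reason?: string } {
  for (const [question, answer] of Object.entries(i.answers)) {
    // Rule 1: any "I don't know" means research first.
    if (/i don't know/i.test(answer)) {
      return { proceed: false, reason: `Research needed: ${question}` };
    }
    // Rule 2: deferring to the AI means the requirement is missing.
    if (/the ai will figure it out/i.test(answer)) {
      return { proceed: false, reason: "The requirement IS the product" };
    }
  }
  // Rule 3: no test plan, no code.
  if (!i.hasTestPlan) {
    return { proceed: false, reason: "Untested code is unshippable code" };
  }
  return { proceed: true };
}
```

The useful part is the shape: every stopping rule returns a reason, so a refusal to proceed is always explainable.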
Checkpoint 0: Session Bootstrap
Run BEFORE anything else in a new session.
- Read project status doc (STATUS_WALKTHROUGH, README, or equivalent)
- Check task list / mailbox for pending work
- Confirm current branch, latest commit, CI status
- Identify what changed since last session (git log, diff)
- Read agent-specific notes file (if multi-agent)
- Execute the "NOW" section — do not ask questions first
If you cannot confirm the current state, stop. You are operating on stale context.
Checkpoint 1: Pre-Code Checklist
Run this BEFORE the AI writes any code.
- Can I explain the problem in my own words?
- Has the AI interviewed me about the requirements?
- Do I understand why the current code exists?
- Have I checked my documentation for relevant context?
- Is my CLAUDE.md current?
- Are my tests green and non-flaky?
- Is CI running in under 10 minutes?
If any box is unchecked, do not proceed to implementation.
Checkpoint 2: Post-Code Checklist
Run this AFTER the AI writes code, BEFORE creating a PR.
- Can I explain every line to a teammate?
- Have I verified the AI's assumptions against the architecture?
- Do I know why the AI chose this approach over alternatives?
- Have the agents tested it like a human would?
- Do MCP tool tests cover the business logic at 100%?
If any box is unchecked, go back and understand before proceeding.
Checkpoint 3: Pre-PR Checklist
Run this BEFORE submitting the pull request.
- Do my unit tests prove the code works?
- Do my E2E tests prove the feature works?
- Does TypeScript pass with no errors in strict mode?
- Can I answer "why" for every decision in the diff?
- Would I be comfortable debugging this at 3am?
- Does the PR description explain the thinking, not just the change?
If any answer is "no," stop. Go back. Learn more.
Automation-Ready Audit
Before adding AI agents to a workflow, verify these six gates pass.
| Gate | Requirement | Current (Feb 2026) | Why |
|---|---|---|---|
| CI Speed | Under 10 min (under 10s = branchless) | ~3 min tests, intermittent build OOM | Fast CI = fast agent iterations. If CI completes in under 10 seconds, skip branches entirely — commit to main (trunk-based dev). |
| Flaky Tests | Zero | Zero known | Flaky tests gaslight the AI into chasing phantom bugs |
| Coverage | 100% on business logic | 80% lines (CI-enforced), 96% MCP file coverage (94/98) | Untested code is invisible to agents; they will refactor through it |
| TypeScript | Strict mode enabled | Strict mode, zero errors | Claude Code integrates with the TS Language Server; strict mode is the baseline |
| CLAUDE.md | Current and complete | Updated Feb 16, 2026 (7 pkgs, 98 tools, ~170 routes) | Stops the AI from guessing; it follows the playbook instead |
| Domain Gates | Project-specific executable quality gates exist | (project-specific) | Generic checklists miss domain invariants; executable gates catch what generic gates cannot |
See references/02-discipline.md for the full breakdown.
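As a sketch, the six gates can be expressed as executable checks. The thresholds come from the table above; the metric names and the function itself are hypothetical:

```typescript
// Hypothetical encoding of the six automation-readiness gates.
type Metrics = {
  ciMinutes: number;
  flakyTests: number;
  businessLogicCoverage: number; // percent, business logic only
  tsStrict: boolean;
  claudeMdCurrent: boolean;
  domainGatesExist: boolean;
};

function automationReady(m: Metrics): string[] {
  const failures: string[] = [];
  if (m.ciMinutes >= 10) failures.push("CI Speed: must complete in under 10 min");
  if (m.flakyTests !== 0) failures.push("Flaky Tests: must be zero");
  if (m.businessLogicCoverage < 100) failures.push("Coverage: business logic must be 100%");
  if (!m.tsStrict) failures.push("TypeScript: strict mode must be enabled");
  if (!m.claudeMdCurrent) failures.push("CLAUDE.md: must be current and complete");
  if (!m.domainGatesExist) failures.push("Domain Gates: executable gates required");
  return failures; // empty array = all six gates pass
}
```

Returning the list of failures, rather than a single boolean, keeps the audit actionable: the output is the work queue.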
The 10-Second Rule: Trunk-Based Development
If your CI pipeline (lint + typecheck + tests on changed files) consistently completes in under 10 seconds:
Skip feature branches. Commit directly to main. The feedback loop is fast enough that broken commits are caught and fixed immediately.
This is trunk-based development — the same pattern used by Google, Meta, and other high-velocity engineering orgs. Prerequisites:
- Fast, reliable CI (under 10 seconds for affected tests)
- Zero flaky tests
- `vitest --changed COMMIT_HASH` to only run affected tests
- `file_guard` MCP tool to pre-validate changes
When to still use branches:
- CI takes more than 10 seconds
- Multiple agents work on the same codebase simultaneously
- Regulatory/compliance requirements mandate review
The math: 50 commits/day at 5s CI each = 4 minutes waiting. Branching overhead at 5 min/change = 250 minutes of ceremony. The choice is clear.
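Spelled out, the arithmetic is (a sketch using the figures quoted above):

```typescript
// Daily feedback-loop cost: trunk-based commits vs. branch ceremony.
const commitsPerDay = 50;
const ciSeconds = 5;         // per-commit CI wait on trunk
const branchOverheadMin = 5; // per-change branching ceremony

const trunkWaitMin = (commitsPerDay * ciSeconds) / 60;   // ~4 minutes
const branchWaitMin = commitsPerDay * branchOverheadMin; // 250 minutes

console.log(`trunk: ${trunkWaitMin.toFixed(1)} min/day, branches: ${branchWaitMin} min/day`);
```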
Hourglass Testing Model
```
      +---------------------+
      |  E2E Specs (heavy)  |   <-- Humans write these
      |  User flows as      |
      |  Given/When/Then    |
      +----------+----------+
                 |
         +-------v-------+
         |    UI Code    |   <-- AI generates this
         |    (thin,     |       Disposable.
         |   disposable) |       Regenerate, don't fix.
         +-------+-------+
                 |
  +--------------v----------------+
  | Business Logic Tests (heavy)  |   <-- MCP tools + unit tests
  | Validation, contracts, state  |       Bulletproof.
  | transitions, edge cases       |       Never skip.
  +-------------------------------+
```

| Layer | Share | What to test |
|---|---|---|
| MCP tool tests | 70% | Business logic, validation, contracts, state transitions |
| E2E specs | 20% | Full user flows (Given/When/Then), wiring verification only |
| UI component tests | 10% | Accessibility, responsive layout, keyboard navigation |
See references/04-testing.md for the Three Lies Framework and test type decision guide.
Bayesian Bugbook
Bug appears twice → mandatory Bugbook entry. Bugs earn their record through recurrence and lose it through irrelevance.
| Event | Confidence | Status |
|---|---|---|
| First observed | 0.5 | CANDIDATE — log conditions |
| Second occurrence | 0.6+ | ACTIVE — full entry + regression test |
| Fix prevents recurrence | +0.1 per prevention | ACTIVE — confidence grows |
| Irrelevant 5+ sessions | -0.1 decay | Decaying → DEPRECATED below 0.3 |
Every ACTIVE entry requires a regression test matched to scope: unit test for single-function bugs, E2E test for cross-component bugs, agent-based test for usability bugs.
See references/04-testing.md for the full Bugbook entry format and Three Lies integration.
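The confidence ledger can be sketched as a small update function. The thresholds (0.5 on first sighting, 0.6 for ACTIVE, below 0.3 for DEPRECATED) and the ±0.1 steps come from the table; the code itself is illustrative:

```typescript
// Sketch of Bayesian Bugbook confidence updates.
type Entry = { confidence: number };

const firstObserved = (): Entry => ({ confidence: 0.5 }); // CANDIDATE: log conditions

// Recurrence, or a fix preventing recurrence, raises confidence.
const reinforce = (e: Entry): Entry => ({ confidence: Math.min(1, e.confidence + 0.1) });

// Each irrelevant session past the fifth decays it.
const decay = (e: Entry): Entry => ({ confidence: Math.max(0, e.confidence - 0.1) });

function status(e: Entry): "CANDIDATE" | "ACTIVE" | "DEPRECATED" {
  if (e.confidence < 0.3) return "DEPRECATED";
  return e.confidence >= 0.6 ? "ACTIVE" : "CANDIDATE";
}
```

The point of the mechanism: status is always derived from confidence, so an entry cannot be promoted by opinion or demoted by forgetting; only observed recurrence and decay move it.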