Harness — Long-Running Agent Framework

Executable protocol enabling any agent task to run continuously across multiple sessions with automatic progress recovery, task dependency resolution, failure rollback, and standardized error handling.

Design Principles

  1. Design for the agent, not the human — Test output, docs, and task structure are the agent's primary interface
  2. Progress files ARE the context — When context window resets, progress files + git history = full recovery
  3. Premature completion is the #1 failure mode — Structured task lists with explicit completion criteria prevent declaring victory early
  4. Standardize everything grep-able — ERROR on same line, structured timestamps, consistent prefixes
  5. Fast feedback loops — Pre-compute stats, run smoke tests before full validation
  6. Idempotent everything — Init scripts, task execution, environment setup must all be safe to re-run
  7. Fail safe, not fail silent — Every failure must have an explicit recovery strategy

Commands

```
/harness init <project-path>     # Initialize harness files in project
/harness run                     # Start/resume the infinite loop
/harness status                  # Show current progress and stats
/harness add "task description"  # Add a task to the list
```

Activation Marker

Hooks only take effect when the `.harness-active` marker file exists in the harness root (the same directory as `harness-tasks.json`).
  • `/harness init` and `/harness run` MUST create this marker: `touch <project-path>/.harness-active`
  • When all tasks complete (no pending/in_progress/retryable left), remove it: `rm <project-path>/.harness-active`
  • Without this marker, all hooks are no-ops: they exit 0 immediately
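
A hook can apply the no-op rule with a guard at the top of the script. The following is a minimal sketch, not part of the protocol itself: the helper name `harness_is_active` and the idea of passing the harness root as an argument are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# harness_is_active ROOT: succeed only if the activation marker exists in ROOT.
# The helper name and the ROOT argument are illustrative assumptions.
harness_is_active() {
  [ -f "$1/.harness-active" ]
}

# Typical use at the top of a hook script (HARNESS_ROOT is a hypothetical variable):
#   harness_is_active "$HARNESS_ROOT" || exit 0
```

Because the guard exits 0 when the marker is absent, a project that never ran `/harness init` sees no behavior change from installed hooks.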

Progress Persistence (Dual-File System)

Maintain two files in the project working directory:

harness-progress.txt (Append-Only Log)

Free-text log of all agent actions across sessions. Never truncate.
```
[2025-07-01T10:00:00Z] [SESSION-1] INIT Harness initialized for project /path/to/project
[2025-07-01T10:00:05Z] [SESSION-1] INIT Environment health check: PASS
[2025-07-01T10:00:10Z] [SESSION-1] LOCK acquired (pid=12345)
[2025-07-01T10:00:11Z] [SESSION-1] Starting [task-001] Implement user authentication (base=def5678)
[2025-07-01T10:05:00Z] [SESSION-1] CHECKPOINT [task-001] step=2/4 "auth routes created, tests pending"
[2025-07-01T10:15:30Z] [SESSION-1] Completed [task-001] (commit abc1234)
[2025-07-01T10:15:31Z] [SESSION-1] Starting [task-002] Add rate limiting (base=abc1234)
[2025-07-01T10:20:00Z] [SESSION-1] ERROR [task-002] [TASK_EXEC] Redis connection refused
[2025-07-01T10:20:01Z] [SESSION-1] ROLLBACK [task-002] git reset --hard abc1234
[2025-07-01T10:20:02Z] [SESSION-1] STATS tasks_total=5 completed=1 failed=1 pending=3 blocked=0 attempts_total=2 checkpoints=1
```

harness-tasks.json (Structured State)

```json
{
  "version": 2,
  "created": "2025-07-01T10:00:00Z",
  "session_config": {
    "concurrency_mode": "exclusive",
    "max_tasks_per_session": 20,
    "max_sessions": 50
  },
  "tasks": [
    {
      "id": "task-001",
      "title": "Implement user authentication",
      "status": "completed",
      "priority": "P0",
      "depends_on": [],
      "attempts": 1,
      "max_attempts": 3,
      "started_at_commit": "def5678",
      "validation": {
        "command": "npm test -- --testPathPattern=auth",
        "timeout_seconds": 300
      },
      "on_failure": {
        "cleanup": null
      },
      "error_log": [],
      "checkpoints": [],
      "completed_at": "2025-07-01T10:15:30Z"
    },
    {
      "id": "task-002",
      "title": "Add rate limiting",
      "status": "failed",
      "priority": "P1",
      "depends_on": [],
      "attempts": 1,
      "max_attempts": 3,
      "started_at_commit": "abc1234",
      "validation": {
        "command": "npm test -- --testPathPattern=rate-limit",
        "timeout_seconds": 120
      },
      "on_failure": {
        "cleanup": "docker compose down redis"
      },
      "error_log": ["[TASK_EXEC] Redis connection refused"],
      "checkpoints": [],
      "completed_at": null
    },
    {
      "id": "task-003",
      "title": "Add OAuth providers",
      "status": "pending",
      "priority": "P1",
      "depends_on": ["task-001"],
      "attempts": 0,
      "max_attempts": 3,
      "started_at_commit": null,
      "validation": {
        "command": "npm test -- --testPathPattern=oauth",
        "timeout_seconds": 180
      },
      "on_failure": {
        "cleanup": null
      },
      "error_log": [],
      "checkpoints": [],
      "completed_at": null
    }
  ],
  "session_count": 1,
  "last_session": "2025-07-01T10:20:02Z"
}
```
Task statuses: `pending` → `in_progress` (transient, set only during active execution) → `completed` or `failed`. A task found as `in_progress` at session start means the previous session was interrupted; handle it via the Context Window Recovery Protocol.
In concurrent mode (see Concurrency Control), tasks may also carry claim metadata: `claimed_by` and `lease_expires_at` (ISO timestamp).
Session boundary: A session starts when the agent begins executing the Session Start protocol and ends when a Stopping Condition is met or the context window resets. Each session gets a unique `SESSION-N` identifier (N = `session_count` after increment).

Concurrency Control

Before modifying `harness-tasks.json`, acquire an exclusive lock using portable `mkdir` (atomic on all POSIX systems, works on both macOS and Linux):
```bash
# Acquire lock (fail fast if another agent is running)
# Lock key must be stable even if invoked from a subdirectory.
ROOT="$PWD"
SEARCH="$PWD"
while [ "$SEARCH" != "/" ] && [ ! -f "$SEARCH/harness-tasks.json" ]; do
  SEARCH="$(dirname "$SEARCH")"
done
if [ -f "$SEARCH/harness-tasks.json" ]; then
  ROOT="$SEARCH"
fi

PWD_HASH="$(
  printf '%s' "$ROOT" |
    (shasum -a 256 2>/dev/null || sha256sum 2>/dev/null) |
    awk '{print $1}' |
    cut -c1-16
)"
LOCKDIR="/tmp/harness-${PWD_HASH:-unknown}.lock"
if ! mkdir "$LOCKDIR" 2>/dev/null; then
  # Check if lock holder is still alive
  LOCK_PID=$(cat "$LOCKDIR/pid" 2>/dev/null)
  if [ -n "$LOCK_PID" ] && kill -0 "$LOCK_PID" 2>/dev/null; then
    echo "ERROR: Another harness session is active (pid=$LOCK_PID)"; exit 1
  fi
  # Stale lock: atomically reclaim via mv to avoid TOCTOU race
  STALE="$LOCKDIR.stale.$$"
  if mv "$LOCKDIR" "$STALE" 2>/dev/null; then
    rm -rf "$STALE"
    mkdir "$LOCKDIR" || { echo "ERROR: Lock contention"; exit 1; }
    echo "WARN: Removed stale lock${LOCK_PID:+ from pid=$LOCK_PID}"
  else
    echo "ERROR: Another agent reclaimed the lock"; exit 1
  fi
fi
echo "$$" > "$LOCKDIR/pid"
trap 'rm -rf "$LOCKDIR"' EXIT
```

Log lock acquisition: `[timestamp] [SESSION-N] LOCK acquired (pid=<PID>)`
Log lock release: `[timestamp] [SESSION-N] LOCK released`

Modes:

- **Exclusive (default)**: hold the lock for the entire session (the `trap EXIT` handler releases it automatically). Any second session in the same state root fails fast.
- **Concurrent (opt-in via `session_config.concurrency_mode: "concurrent"`)**: treat this as a **state transaction lock**. Hold it only while reading/modifying/writing `harness-tasks.json` (including `.bak`/`.tmp`) and appending to `harness-progress.txt`. Release it immediately before doing real work.

Concurrent mode invariants:

- All workers MUST point at the same state root (the directory that contains `harness-tasks.json`). If you are using separate worktrees/clones, pin it explicitly (e.g., `HARNESS_STATE_ROOT=/abs/path/to/state-root`).
- Task selection is advisory; the real gate is **atomic claim** under the lock: set `status="in_progress"`, set `claimed_by` (stable worker id, e.g., `HARNESS_WORKER_ID`), set `lease_expires_at`. If claim fails (already `in_progress` with a valid lease), pick another eligible task and retry.
- Never run two workers in the same git working directory. Use separate worktrees/clones. Otherwise rollback (`git reset --hard` / `git clean -fd`) will destroy other workers.
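
Deciding whether a claim is still valid comes down to comparing `lease_expires_at` with the current time. The sketch below is one possible way to do that, assuming every timestamp is written in the same UTC form used in the log examples (`YYYY-MM-DDTHH:MM:SSZ`), in which case plain string comparison orders them correctly. The function name `lease_is_stale` is hypothetical.

```bash
#!/usr/bin/env bash
# lease_is_stale LEASE_EXPIRES_AT [NOW]
# Succeeds when the lease timestamp is earlier than NOW (default: current UTC time).
# Assumes both values use the identical fixed-width UTC format, so that
# lexicographic order equals chronological order.
lease_is_stale() {
  local lease="$1"
  local now="${2:-$(date -u +%Y-%m-%dT%H:%M:%SZ)}"
  [[ "$lease" < "$now" ]]
}
```

A worker would run this check while holding the lock, and only take over a task whose lease is stale.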

Infinite Loop Protocol

Session Start (Execute Every Time)

  1. Read state: Read the last 200 lines of `harness-progress.txt` plus the full `harness-tasks.json`. If the JSON is unparseable, see JSON corruption recovery in Error Handling.
  2. Read git: Run `git log --oneline -20` and `git diff --stat` to detect uncommitted work
  3. Acquire lock (mode-dependent): Exclusive mode fails if another session is active. Concurrent mode uses the lock only for state transactions.
  4. Recover interrupted tasks (see Context Window Recovery below)
  5. Health check: Run `harness-init.sh` if it exists
  6. Track session: Increment `session_count` in JSON. Check `session_count` against `max_sessions`; if reached, log STATS and STOP. Initialize per-session task counter to 0.
  7. Pick next task using Task Selection Algorithm below

Task Selection Algorithm

Before selecting, run dependency validation:
  1. Cycle detection: For each non-completed task, walk `depends_on` transitively. If any task appears in its own chain, mark it `failed` with `[DEPENDENCY] Circular dependency detected: task-A -> task-B -> task-A`. Self-references (`depends_on` includes own id) are also cycles.
  2. Blocked propagation: If a task's `depends_on` includes a task that is `failed` and will never be retried (either `attempts >= max_attempts` OR its `error_log` contains a `[DEPENDENCY]` entry), mark the blocked task as `failed` with `[DEPENDENCY] Blocked by failed task-XXX`. Repeat until no more tasks can be propagated.
Then pick the next task in this priority order:
  1. Tasks with `status: "pending"` where ALL `depends_on` tasks are `completed`, sorted by `priority` (P0 > P1 > P2), then by `id` (lowest first)
  2. Tasks with `status: "failed"` where `attempts < max_attempts` and ALL `depends_on` are `completed`, sorted by priority, then oldest failure first
  3. If no eligible tasks remain → log final STATS → STOP
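
The two selection passes can be sketched in shell once the task list has been flattened to one row per task. This is an illustration under stated assumptions, not the harness's own implementation: the row format (`id priority status attempts max_attempts deps_done failed_at`), the `deps_done` yes/no column, and the function name `pick_next_task` are all hypothetical, and dependency validation is assumed to have run already.

```bash
#!/usr/bin/env bash
# pick_next_task: read whitespace-separated rows on stdin and print the id of the
# next task to run. Returns 1 when no task is eligible.
# Row format (hypothetical): id priority status attempts max_attempts deps_done failed_at
pick_next_task() {
  local rows pick
  rows="$(cat)"
  # Pass 1: pending tasks with all dependencies completed, by priority then id.
  pick="$(printf '%s\n' "$rows" \
    | awk '$3 == "pending" && $6 == "yes" { print $2, $1 }' \
    | LC_ALL=C sort -k1,1 -k2,2 | head -n 1 | awk '{ print $2 }')"
  if [ -z "$pick" ]; then
    # Pass 2: failed tasks with attempts left, by priority then oldest failure.
    pick="$(printf '%s\n' "$rows" \
      | awk '$3 == "failed" && $4 < $5 && $6 == "yes" { print $2, $7, $1 }' \
      | LC_ALL=C sort -k1,1 -k2,2 | head -n 1 | awk '{ print $3 }')"
  fi
  if [ -n "$pick" ]; then
    printf '%s\n' "$pick"
  else
    return 1
  fi
}
```

With the three tasks from the example `harness-tasks.json`, the pending `task-003` is chosen before the retryable `task-002`, because pass 1 always runs first.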

Task Execution Cycle

For each task, execute this exact sequence:
  1. Claim (atomic, under lock): Record `started_at_commit` = current HEAD hash. Set status to `in_progress`, set `claimed_by`, set `lease_expires_at`, log `Starting [<task-id>] <title> (base=<hash>)`. If the task is already claimed (`in_progress` with a valid lease), pick another eligible task and retry.
  2. Execute with checkpoints: Perform the work. After each significant step, log `[timestamp] [SESSION-N] CHECKPOINT [task-id] step=M/N "description of what was done"`. Also append to the task's `checkpoints` array: `{ "step": M, "total": N, "description": "...", "timestamp": "ISO" }`. In concurrent mode, renew the lease at each checkpoint (push `lease_expires_at` forward).
  3. Validate: Run the task's `validation.command` with a timeout wrapper (prefer `timeout`; on macOS use `gtimeout` from coreutils). If `validation.command` is empty/null, log `ERROR [<task-id>] [CONFIG] Missing validation.command` and STOP; do not declare completion without an objective check. Before running, verify the command exists (e.g., `command -v <binary>`); if missing, treat as an `ENV_SETUP` error.
    • Command exits 0 → PASS
    • Command exits non-zero → FAIL
    • Command exceeds timeout → TIMEOUT
  4. Record outcome:
    • Success: status=`completed`, set `completed_at`, log `Completed [<task-id>] (commit <hash>)`, git commit
    • Failure: increment `attempts`, append error to `error_log`. Verify `started_at_commit` exists via `git cat-file -t <hash>`; if missing, mark failed at max_attempts. Otherwise execute `git reset --hard <started_at_commit>` and `git clean -fd` to rollback ALL commits and remove untracked files. Execute `on_failure.cleanup` if defined. Log `ERROR [<task-id>] [<category>] <message>`. Set status=`failed` (Task Selection Algorithm pass 2 handles retries when attempts < max_attempts)
  5. Track: Increment per-session task counter. If `max_tasks_per_session` reached, log STATS and STOP.
  6. Continue: Immediately pick next task (zero idle time)
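
The validation step maps an exit status to one of three verdicts. A minimal sketch of that mapping follows, assuming GNU `timeout` semantics (exit status 124 when the limit is hit). The function name `run_validation` and its argument order are hypothetical, and the command's own output is sent to stderr so that only the verdict appears on stdout.

```bash
#!/usr/bin/env bash
# run_validation SECONDS COMMAND: print PASS, FAIL, or TIMEOUT on stdout.
# Returns 2 if no timeout wrapper is installed (an ENV_SETUP condition).
run_validation() {
  local seconds="$1" cmd="$2" wrapper status=0
  if command -v timeout >/dev/null 2>&1; then
    wrapper=timeout
  elif command -v gtimeout >/dev/null 2>&1; then
    wrapper=gtimeout          # coreutils name on macOS
  else
    echo "ERROR [ENV_SETUP] no timeout wrapper found" >&2
    return 2
  fi
  # Keep the command's output off stdout so callers can capture the verdict.
  "$wrapper" "$seconds" bash -c "$cmd" >&2 || status=$?
  case "$status" in
    0)   echo PASS ;;
    124) echo TIMEOUT ;;      # GNU timeout exits 124 when the limit is exceeded
    *)   echo FAIL ;;
  esac
}
```

For example, `verdict="$(run_validation 300 "npm test -- --testPathPattern=auth")"` would capture the verdict for task-001 from the sample state file.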

Stopping Conditions

  • All tasks `completed`
  • All remaining tasks `failed` at max_attempts or blocked by failed dependencies
  • `session_config.max_tasks_per_session` reached for this session
  • `session_config.max_sessions` reached across all sessions
  • User interrupts

Context Window Recovery Protocol

When a new session starts and finds a task with `status: "in_progress"`:
  • Exclusive mode: treat this as an interrupted previous session and run the Recovery Protocol below.
  • Concurrent mode: only recover a task if either (a) `claimed_by` matches this worker, or (b) `lease_expires_at` is in the past (stale lease). Otherwise, treat it as owned by another worker and do not modify it.

  1. Check git state:
```bash
git diff --stat          # Uncommitted changes?
git log --oneline -5     # Recent commits since task started?
git stash list           # Any stashed work?
```
  2. Check checkpoints: Read the task's `checkpoints` array to determine the last completed step
  3. Decision matrix (verify recent commits belong to this task by checking commit messages for the task-id):

| Uncommitted? | Recent task commits? | Checkpoints? | Action |
|---|---|---|---|
| No | No | None | Mark `failed` with `[SESSION_TIMEOUT] No progress detected`, increment attempts |
| No | No | Some | Verify file state matches checkpoint claims. If files reflect checkpoint progress, resume from last step. If not, mark `failed` (work was lost) |
| No | Yes | Any | Run `validation.command`. If passes → mark `completed`. If fails → `git reset --hard <started_at_commit>`, mark `failed` |
| Yes | No | Any | Run validation WITH uncommitted changes present. If passes → commit, mark `completed`. If fails → `git reset --hard <started_at_commit>` + `git clean -fd`, mark `failed` |
| Yes | Yes | Any | Commit uncommitted changes, run `validation.command`. If passes → mark `completed`. If fails → `git reset --hard <started_at_commit>` + `git clean -fd`, mark `failed` |

  4. Log recovery: `[timestamp] [SESSION-N] RECOVERY [task-id] action="<action taken>" reason="<reason>"`
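
The decision matrix is a pure function of three observations, which makes it easy to encode as a lookup. The sketch below shows one way to do so; the function name `recovery_action`, the `yes`/`no`/`none`/`some` input encoding, and the action labels it prints are all hypothetical shorthand for the rows of the matrix.

```bash
#!/usr/bin/env bash
# recovery_action UNCOMMITTED TASK_COMMITS CHECKPOINTS
#   UNCOMMITTED, TASK_COMMITS: yes | no
#   CHECKPOINTS:               none | some
# Prints a short label for the matching row of the decision matrix.
recovery_action() {
  case "$1:$2:$3" in
    no:no:none)                echo "mark-failed-no-progress" ;;
    no:no:some)                echo "verify-files-then-resume-or-fail" ;;
    no:yes:none|no:yes:some)   echo "validate" ;;
    yes:no:none|yes:no:some)   echo "validate-with-uncommitted" ;;
    yes:yes:none|yes:yes:some) echo "commit-then-validate" ;;
    *)                         echo "invalid-input" >&2; return 1 ;;
  esac
}
```

Keeping the mapping separate from the git commands that gather the three inputs makes each row straightforward to check against the table.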

Error Handling & Recovery Strategies

Each error category has a default recovery strategy:

| Category | Default Recovery | Agent Action |
|---|---|---|
| `ENV_SETUP` | Re-run init, then STOP if still failing | Run `harness-init.sh` again immediately. If it fails twice, log and stop: the environment is broken |
| `CONFIG` | STOP (requires human fix) | Log the config error precisely (file + field), then STOP. Do not guess or auto-mutate task metadata |
| `TASK_EXEC` | Rollback via `git reset --hard <started_at_commit>`, retry | Verify `started_at_commit` exists (`git cat-file -t <hash>`). If missing, mark failed at max_attempts. Otherwise reset, run `on_failure.cleanup` if defined, retry if attempts < max_attempts |
| `TEST_FAIL` | Rollback via `git reset --hard <started_at_commit>`, retry | Reset to `started_at_commit`, analyze test output to identify fix, retry with targeted changes |
| `TIMEOUT` | Kill process, execute cleanup, retry | Wrap validation with `timeout <seconds> <command>`. On timeout, run `on_failure.cleanup`, retry (consider splitting task if repeated) |
| `DEPENDENCY` | Skip task, mark blocked | Log which dependency failed, mark task as `failed` with dependency reason |
| `SESSION_TIMEOUT` | Use Context Window Recovery Protocol | New session assesses partial progress via Recovery Protocol; may result in completion or failure depending on validation |

JSON corruption: If `harness-tasks.json` cannot be parsed, check for `harness-tasks.json.bak` (written before each modification). If the backup exists and is valid, restore from it. If there is no valid backup, log `ERROR [ENV_SETUP] harness-tasks.json corrupted and unrecoverable` and STOP, because task metadata (validation commands, dependencies, cleanup) cannot be reconstructed from logs alone.
Backup protocol: Before every write to `harness-tasks.json`, copy the current file to `harness-tasks.json.bak`. Write updates atomically: write JSON to `harness-tasks.json.tmp`, then `mv` it into place (readers should never see a partial file).
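
The backup protocol is a copy followed by a write-then-rename. A minimal sketch is shown below, assuming the new JSON arrives on stdin and that the temp file lives in the same directory as the target, so that `mv` is a rename on one filesystem. The function name `write_tasks_json` is hypothetical, and the caller is assumed to already hold the lock.

```bash
#!/usr/bin/env bash
# write_tasks_json ROOT: replace ROOT/harness-tasks.json with the JSON on stdin.
write_tasks_json() {
  local target="$1/harness-tasks.json"
  # Keep the previous version so JSON corruption recovery has something to restore.
  if [ -f "$target" ]; then
    cp "$target" "$target.bak" || return 1
  fi
  # Write the full new content to a sibling temp file first.
  cat > "$target.tmp" || return 1
  # Rename into place; readers see either the old file or the new one.
  mv "$target.tmp" "$target"
}
```

A stricter variant could parse `harness-tasks.json.tmp` with a JSON tool before the rename, so that a malformed update never replaces a good state file.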

Environment Initialization

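If `harness-init.sh` exists in the project root, run it at every session start. The script must be idempotent.
Example `harness-init.sh`:
```bash
#!/bin/bash
set -e
# Install dependencies; failures here are tolerated so the checks below still run.
npm install 2>/dev/null || pip install -r requirements.txt 2>/dev/null || true
# Port 5432 speaks the PostgreSQL wire protocol, not HTTP, so probe the TCP port
# directly (bash /dev/tcp) instead of using curl.
(exec 3<>/dev/tcp/localhost/5432) 2>/dev/null || echo "WARN: DB not reachable"
# Fast smoke test before any full validation.
npm test -- --bail --silent 2>/dev/null || echo "WARN: Smoke test failed"
echo "Environment health check complete"
```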

Standardized Log Format

All log entries use a grep-friendly format on a single line:
```
[ISO-timestamp] [SESSION-N] <TYPE> [task-id]? [category]? message
```
`[task-id]` and `[category]` are included when applicable (task-scoped entries). Session-level entries (`INIT`, `LOCK`, `STATS`) omit them.
Types: `INIT`, `Starting`, `Completed`, `ERROR`, `CHECKPOINT`, `ROLLBACK`, `RECOVERY`, `STATS`, `LOCK`, `WARN`
Error categories: `ENV_SETUP`, `CONFIG`, `TASK_EXEC`, `TEST_FAIL`, `TIMEOUT`, `DEPENDENCY`, `SESSION_TIMEOUT`
Filtering:
```bash
grep "ERROR" harness-progress.txt                    # All errors
grep "ERROR" harness-progress.txt | grep "TASK_EXEC" # Execution errors only
grep "SESSION-3" harness-progress.txt                # All session 3 activity
grep "STATS" harness-progress.txt                    # All session summaries
grep "CHECKPOINT" harness-progress.txt               # All checkpoints
grep "RECOVERY" harness-progress.txt                 # All recovery actions
```
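
Writing entries through a single helper is one way to keep the format consistent. The sketch below is an assumption about how that could look, not the harness's own code: `log_entry` and its argument order are hypothetical. It stamps each entry with a UTC ISO timestamp and flattens embedded newlines so that every entry stays on one line.

```bash
#!/usr/bin/env bash
# log_entry FILE SESSION_NUMBER TYPE MESSAGE...
# Appends one line in the form: [ISO-timestamp] [SESSION-N] TYPE message
log_entry() {
  local file="$1" session="$2" type="$3" msg
  shift 3
  msg="$*"
  msg="${msg//$'\n'/ }"   # keep the entry on a single line for grep
  printf '[%s] [SESSION-%s] %s %s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$session" "$type" "$msg" >> "$file"
}
```

For a task-scoped entry, the task id and category are simply part of the message, for example `log_entry harness-progress.txt 1 ERROR "[task-002] [TASK_EXEC] Redis connection refused"`.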

Session Statistics

At session end, update `harness-tasks.json`: set `last_session` to the current timestamp. (Do NOT increment `session_count` here; it is incremented at Session Start.) Then append:
```
[timestamp] [SESSION-N] STATS tasks_total=10 completed=7 failed=1 pending=2 blocked=0 attempts_total=12 checkpoints=23
```
`blocked` is computed at stats time: the count of pending tasks whose `depends_on` includes a permanently failed task. It is not a stored status value.

Init Command (`/harness init`)

  1. Create `harness-progress.txt` with an initialization entry
  2. Create `harness-tasks.json` with an empty task list and default `session_config`
  3. Optionally create a `harness-init.sh` template (chmod +x)
  4. Ask user: add harness files to `.gitignore`?

Status Command (`/harness status`)

Read `harness-tasks.json` and `harness-progress.txt`, then display:
  1. Task summary: count by status (completed, failed, pending, blocked). `blocked` = pending tasks whose `depends_on` includes a permanently failed task (computed, not a stored status).
  2. Per-task one-liner: `[status] task-id: title (attempts/max_attempts)`
  3. Last 5 lines from `harness-progress.txt`
  4. Session count and last session timestamp
Does NOT acquire the lock (read-only operation).

Add Command (`/harness add`)

Append a new task to `harness-tasks.json` with an auto-incremented id (`task-NNN`), status `pending`, default `max_attempts: 3`, empty `depends_on`, and no validation command (one is required before the task can be completed). Prompt the user for optional fields: `priority`, `depends_on`, `validation.command`, `timeout_seconds`. Requires lock acquisition (modifies JSON).

Tool Dependencies

Requires: Bash, file read/write, git. All harness operations must be executed from the project root directory. Does NOT require: specific MCP servers, programming languages, or test frameworks.
Concurrent mode requires isolated working directories (`git worktree` or separate clones). Do not run concurrent workers in the same working tree.